Reshaping my dataset

Melat Kassa

Join Date: Aug 2018

Posts: 55
#1

23 May 2020, 21:00

Hello,

I am having trouble reshaping my dataset. I only want two columns: descriptive statistics (id, gender, race, education, union), varx and vary. How would I do that? Also, my id variable isn't important, so I can drop it if necessary. I attached a picture of my dataset.
Sameer Hattiangdi

Join Date: Apr 2020

Posts: 11
#2

23 May 2020, 23:56

You've posted the same question several times within a span of hours. Please review the Stata FAQ on bumping: https://www.statalist.org/forums/help#adviceextras.

Justin Niakamal

Join Date: Aug 2017
Posts: 760

24 May 2020, 10:52

Your data don't look reshapeable to me because of all the missing values. Why is GENDER missing for ids 3-10? Why do only 4 of the 10 ids have a race while the rest don't? And so on. Also, I would think id is important (reshape needs an i() variable that uniquely identifies each unit), and an id is usually a good candidate for that.

Please use dataex to post an example of your data (see the FAQ).

Code:

clear
input byte id long(GENDER race educ) float(varx vary)
 1 2 . .   .3488717   .4476672
 2 1 . .   .2668857   .2973335
 3 . 4 .   .1366463    .457879
 4 . 1 . .028556867 -1.1816216
 5 . 2 .   .8689333   .4960581
 6 . 3 .   .3508549 -.29106066
 7 . . 2  .07110509   .8541681
 8 . . 1  .32336795  1.6088457
 9 . . .   .5551032  -1.899073
10 . . .    .875991 -.59116364
end
label values GENDER egender
label def egender 1 "Female", modify
label def egender 2 "Male", modify
label values race erace
label def erace 1 "black", modify
label def erace 2 "hispanic or latino", modify
label def erace 3 "other", modify
label def erace 4 "white", modify
label values educ eeduc
label def eeduc 1 "College", modify
label def eeduc 2 "No college", modify
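
The two concerns raised above (whether id is unique, and how much is missing) can be checked directly before any reshape. This is only a minimal sketch, assuming the example data above is in memory; it does not change the data.

```stata
* Assumes the dataex example above has just been run.
* isid exits with an error if id does not uniquely identify the observations.
isid id

* Report how many observations are missing on each categorical variable.
misstable summarize GENDER race educ

* Count the rows that would have to be dropped before using GENDER in j().
count if missing(GENDER)
```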

For expositional purposes, here's an attempt to reshape your data keeping only four variables. Even here you'll need to drop the observations with missing GENDER to get reshape to 'work', and the result is a dataset that, in my opinion, isn't very usable.

Code:

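keep GENDER id varx vary

* First attempt fails: the j() variable cannot contain missing values
reshape wide var*, i(id) j(GENDER)
variable GENDER contains missing values
r(498);

* Drop the observations with missing GENDER, then reshape again
drop if missing(GENDER)
reshape wide var*, i(id) j(GENDER)

list, noobs clean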

    id      varx1      vary1      varx2      vary2  
     1          .          .   .3488717   .4476672  
     2   .2668857   .2973335          .          .
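
If the goal is instead a single column naming the category next to varx and vary (one possible reading of the question, not something the original post confirms), reshape may not be needed at all. The following is a rough sketch under that assumption. It has to be run on the full example data, before the keep and drop above, and the names gender_s, race_s, educ_s and group are hypothetical.

```stata
* Sketch only: start from the full dataex example, not the reduced data.
* decode turns each labelled numeric variable into a string variable;
* missing values become empty strings.
decode GENDER, generate(gender_s)
decode race, generate(race_s)
decode educ, generate(educ_s)

* In the example each row has at most one non-missing category,
* so concatenating the three strings leaves just that one label.
generate group = gender_s + race_s + educ_s

list group varx vary, noobs clean
```

Rows where all three variables are missing (ids 9 and 10 in the example) end up with an empty group, so they would still need a decision about how to handle them.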

Hope this helps.
