Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape with two types of ID variables

    Hi all,
    Is there a way to reshape by individual ID with another ID (household ID) as a variable, as below? I suppose the other option is to reshape the data by the household ID variable but thought I'd check first if there is way around this. Thanks in advance.

    Click image for larger version

Name:	reshape.JPG
Views:	1
Size:	49.2 KB
ID:	1671917


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long idauniq int(hhidw1 hhidw2 hhidw3 hhidw4 hhidw5 hhidw6 hhidw7 hhidw8) long hhidw9
    100001 .     .     .     . 11052     .     .     .      .
    100001 .     . 15553     .     .     .     .     .      .
    100001 .     .     .     .     .     . 15231     .      .
    100001 .     .     . 17241     .     .     .     .      .
    100001 . 11870     .     .     .     .     .     .      .
    100005 .     .     .     .     .     .     .     . 114190
    100005 .     .     .     . 13327     .     .     .      .
    100006 .     .     . 14156     .     .     .     .      .
    100006 .     .     .     .     . 10656     .     .      .
    100006 .     .     .     . 14253     .     .     .      .
    100007 .     .     .     .     . 15127     .     .      .
    100007 .     .     .     . 16347     .     .     .      .
    100007 .     . 14930     .     .     .     .     .      .
    100007 .     .     .     .     .     .     .     . 113179
    100007 . 12585     .     .     .     .     .     .      .
    100007 .     .     .     .     .     . 15750     .      .
    100007 .     .     .     .     .     .     . 11312      .
    100007 .     .     . 16165     .     .     .     .      .
    100009 .     .     . 11245     .     .     .     .      .
    100009 .     .     .     .     .     .     .     . 107657
    100009 .     .     .     .     . 11311     .     .      .
    100009 .     .     .     . 10024     .     .     .      .
    100009 .     . 13922     .     .     .     .     .      .
    100009 .     .     .     .     .     . 11477     .      .
    100010 .     .     . 14772     .     .     .     .      .
    100012 .     .     .     .     .     .     .     . 106459
    100012 .     .     . 12375     .     .     .     .      .
    100012 .     .     .     . 15867     .     .     .      .
    100012 .     .     .     .     .     .     . 12622      .
    100012 .     . 13218     .     .     .     .     .      .
    100016 .     .     . 14710     .     .     .     .      .
    100016 .     . 14068     .     .     .     .     .      .
    100018 .     . 13675     .     .     .     .     .      .
    100018 . 13744     .     .     .     .     .     .      .
    100021 . 13882     .     .     .     .     .     .      .
    100021 .     .     .     .     . 14374     .     .      .
    100021 .     .     .     . 10692     .     .     .      .
    100021 .     . 13043     .     .     .     .     .      .
    100023 .     .     .     .     . 14598     .     .      .
    100023 .     .     .     . 10485     .     .     .      .
    100023 .     .     .     .     .     .     . 10592      .
    100023 .     .     .     .     .     . 11595     .      .
    100023 .     .     .     .     .     .     .     . 105225
    100024 .     .     . 17260     .     .     .     .      .
    100024 .     .     .     . 13368     .     .     .      .
    100024 .     . 15331     .     .     .     .     .      .
    100025 .     .     .     . 15828     .     .     .      .
    100025 .     .     . 14783     .     .     .     .      .
    100025 .     .     .     .     .     .     .     . 113214
    100025 .     .     .     .     . 10153     .     .      .
    end
    ------------------ copy up to and including the previous l

    Many thanks
    Karen

  • #2
    Karen:
    the first thing that I'd do before thinking about -reshape-, is creating an unique categorical varaiable for -hhidw*- and using -label- to differentiate its levels.
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      I agree with what I understand Carlo to mean; try:
      Code:
      egen long hhid=rowmax(hhidw*)

      Comment


      • #4
        If it is important to keep track of which of the 9 variables provided the single value in each observation, here is an approach using reshape long.
        Code:
        list if idauniq==100001, clean noobs
        generate seq = _n
        reshape long hhidw, i(seq) j(j)
        drop if hhidw==.
        drop seq
        order idauniq j
        sort idauniq j
        list if idauniq==100001, clean noobs
        Code:
        . list if idauniq==100001, clean noobs
        
            idauniq   hhidw1   hhidw2   hhidw3   hhidw4   hhidw5   hhidw6   hhidw7   hhidw8   hhidw9  
             100001        .        .        .        .    11052        .        .        .        .  
             100001        .        .    15553        .        .        .        .        .        .  
             100001        .        .        .        .        .        .    15231        .        .  
             100001        .        .        .    17241        .        .        .        .        .  
             100001        .    11870        .        .        .        .        .        .        .  
        
        . generate seq = _n
        
        . reshape long hhidw, i(seq) j(j)
        (j = 1 2 3 4 5 6 7 8 9)
        
        Data                               Wide   ->   Long
        -----------------------------------------------------------------------------
        Number of observations               50   ->   450         
        Number of variables                  11   ->   4           
        j variable (9 values)                     ->   j
        xij variables:
                       hhidw1 hhidw2 ... hhidw9   ->   hhidw
        -----------------------------------------------------------------------------
        
        . drop if hhidw==.
        (400 observations deleted)
        
        . drop seq
        
        . order idauniq j
        
        . sort idauniq j
        
        . list if idauniq==100001, clean noobs
        
            idauniq   j   hhidw  
             100001   2   11870  
             100001   3   15553  
             100001   4   17241  
             100001   5   11052  
             100001   7   15231  
        
        .

        Comment


        • #5
          Yet another approach to the problem.
          Code:
          . collapse (firstnm) hhidw*, by(idauniq)
          
          . list, clean noobs
          
              idauniq   hhidw1   hhidw2   hhidw3   hhidw4   hhidw5   hhidw6   hhidw7   hhidw8   hhidw9  
               100001        .    11870    15553    17241    11052        .    15231        .        .  
               100005        .        .        .        .    13327        .        .        .   114190  
               100006        .        .        .    14156    14253    10656        .        .        .  
               100007        .    12585    14930    16165    16347    15127    15750    11312   113179  
               100009        .        .    13922    11245    10024    11311    11477        .   107657  
               100010        .        .        .    14772        .        .        .        .        .  
               100012        .        .    13218    12375    15867        .        .    12622   106459  
               100016        .        .    14068    14710        .        .        .        .        .  
               100018        .    13744    13675        .        .        .        .        .        .  
               100021        .    13882    13043        .    10692    14374        .        .        .  
               100023        .        .        .        .    10485    14598    11595    10592   105225  
               100024        .        .    15331    17260    13368        .        .        .        .  
               100025        .        .        .    14783    15828    10153        .        .   113214
          From this you can also reshape long as in the previous post.

          Thinking about your data, I see that you appear to be preparing a reference that for each individual (idauniq) in each of 9 waves of the survey (w1-w9) will tell what household (hhid) they were in for the given wave.

          Your data suggests that the dataset was created by merging 9 waves of data on idauniq, where each wave had the variables idauniq and one of the hhidw1 ... hhidw9 variables .

          It would have been more direct to have appended the 9 waves of data, each wave containing idauniq, the wave number, and hhid for that wave, giving directly the data structure that results from reshaping.

          In general, this is the most straightforward approach for a collection of single-wave datasets into a Stata panel dataset: make the variable names consistent across the waves as needed, be sure each wave has a wave number, and then append.
          Last edited by William Lisowski; 03 Jul 2022, 08:45.

          Comment

          Working...
          X