  • Recode a variable to the same previous value by category

    Dear community,

    SOS!

    I have a panel dataset of household information, most of which was collected in the baseline year. During data collection, if a variable had not changed since the previous period, it was coded 0. I now want to recode the household head's occupation (HHocc) from 0 to the value from the previous period, within each household id.
    I was doing it manually, but there are about 4000 observations to change, so I need a quicker way. Do you have any idea how to do it?

    This is what the data look like:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(qdate HHocc id)
    230 0 47
    231 0 47
    232 1 47
    233 1 47
    234 1 47
    235 1 47
    219 1 48
    220 7 48
    221 0 48
    222 0 48
    223 0 48
    224 0 48
    225 0 48
    226 0 48
    227 0 48
    228 0 48
    229 0 48
    230 0 48
    231 0 48
    232 1 48
    233 1 48
    end
    format %tq qdate
    label values id id
    label def id 47 "a049", modify
    label def id 48 "a050", modify
    Here, for example, household 48 should have HHocc = 7 from qdate 221 to 231.

    I tried this code, but it did not work:
    Code:
    recode HHocc (0=1) if (qd==219 & HHocc==1)
    *219 refers to the baseline year and there are about 17 periods.

  • #2
    I think this one line should suffice:
    Code:
    bys id (qdate): replace HHocc = HHocc[_n-1] if HHocc == 0 & !missing(HHocc[_n-1])
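    This works because -replace- processes observations in sort order, so within each household a run of zeros is filled one observation at a time from the last non-zero value. A minimal sketch on a subset of the example data from #1, offered as an illustration rather than a check of the full dataset:

```stata
* Subset of the example data from #1
clear
input float(qdate HHocc id)
230 0 47
231 0 47
232 1 47
219 1 48
220 7 48
221 0 48
222 0 48
232 1 48
end

* Carry the previous period's occupation forward over zeros, within household
bysort id (qdate): replace HHocc = HHocc[_n-1] if HHocc == 0 & !missing(HHocc[_n-1])

* Inspect the result household by household
list, sepby(id)

* Zeros with no earlier non-zero value in the household are left as they were
count if HHocc == 0
```

    Zeros at the start of a household's series (household 47 here) have no earlier value to copy, so they stay 0 and need a separate decision.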



    • #3
      That's genius! It worked. Thank you so much.

      Now I want to adjust the household head's age, but the values are not consistent. Within the same household, sometimes the husband's age and sometimes the wife's age is recorded as that of the family head, so the age does not increase steadily over time. How can I adjust it?
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float(qdate id HHage)
      224 1 49
      225 1 49
      226 1 49
      227 1 49
      219 2 63
      220 2 64
      221 2 64
      222 2 64
      223 2 38
      224 2 38
      225 2 38
      226 2 38
      227 2 38
      228 2 38
      229 2 38
      230 2 38
      231 2 38
      232 2 40
      233 2 40
      234 2 40
      end
      format %tq qdate
      label values id id
      label def id 1 "A106", modify
      label def id 2 "a001", modify



      • #4
        You'll need to decide, at a conceptual level, how you would like to make the adjustment. In your household 2, for instance, at least part of what seems to be going on is that the head perhaps changed (because of a death or migration?), since the age goes from 64 down to 38. Also, some change to a higher age seems to happen in later years (63 becomes 64, 38 becomes 40). Is this because the same person is growing older, or something else?

        Once you decide what adjustment to make, I am happy to help you with the code to achieve that.
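        In the meantime, it may help to see where the recorded age falls from one period to the next, since those are the places where the respondent probably changed. A minimal sketch on part of household 2; the variable name age_drop is hypothetical, and treating any decrease as a change of head is an assumption:

```stata
* Part of household 2 from the example data
clear
input float(qdate id HHage)
219 2 63
220 2 64
221 2 64
222 2 64
223 2 38
224 2 38
232 2 40
end

* Flag periods where the recorded age is lower than in the previous period.
* The -if _n > 1- guard leaves the first record of each household missing,
* because HHage[_n-1] does not exist there.
bysort id (qdate): gen byte age_drop = HHage < HHage[_n-1] if _n > 1

* Show only the flagged periods
list id qdate HHage if age_drop == 1
```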



        • #5
          Oh yes, let me explain. When the data were collected, the researchers did not always find the same household head at home, so they conducted the survey with whoever was available in that period and treated that person as the family head.
          The qdate 219 actually represents a whole year (2014). Later on, the data became quarterly, so each age should appear 4 times for every household.

          Below is the same data but with a larger range.
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(qdate id HHage)
          224 1 49
          225 1 49
          226 1 49
          227 1 49
          219 2 63
          220 2 64
          221 2 64
          222 2 64
          223 2 38
          224 2 38
          225 2 38
          226 2 38
          227 2 38
          228 2 38
          229 2 38
          230 2 38
          231 2 38
          232 2 40
          233 2 40
          234 2 40
          235 2 40
          219 3 75
          220 3 76
          221 3 76
          222 3 76
          223 3 76
          224 3 76
          225 3 76
          226 3 76
          227 3 76
          228 3 70
          229 3 70
          230 3 70
          231 3 70
          232 3 72
          233 3 72
          234 3 72
          235 3 72
          219 4 50
          220 4 51
          end
          format %tq qdate
          label values id id
          label def id 1 "A106", modify
          label def id 2 "a001", modify
          label def id 3 "a002", modify
          label def id 4 "a003", modify
          Nevertheless, the age data still look messy. I want to run a regression later, but I am afraid the results will be biased. Some people advised me to hold the household characteristics fixed over the years, because the model coefficient will then be an average effect, which may help with controlling in the model. But I don't know how to do it.



          • #6
            If you just want to fix a characteristic at its level in the first round (2014), then this code will do it:
            Code:
            bysort id (qdate): gen HHage_2014 = HHage[1]
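            One caveat: HHage[1] is the first observed age for each household, whatever its date. In the example data household 1 starts at qdate 224, so its "2014" value would really come from a later period. A minimal sketch of the difference, assuming qdate 219 marks the 2014 baseline as described in #5; the names HHage_first and HHage_base are hypothetical:

```stata
* A few rows from the example data in #5
clear
input float(qdate id HHage)
224 1 49
225 1 49
219 2 63
220 2 64
223 2 38
end

* First observed age per household, regardless of when that household starts
bysort id (qdate): gen HHage_first = HHage[1]

* Variant: filled only when the household's first record is the baseline (qdate 219)
bysort id (qdate): gen HHage_base = HHage[1] if qdate[1] == 219

list, sepby(id)
```

            Households with no baseline record get a missing HHage_base, which makes them easy to spot before running the regression.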



            • #7
              I now have a fixed age no matter what the year is. But with the age variable fixed, would the regression results be consistent?
