  • Counting number of observations per household

    I want to count the number of individuals in each household. I have an individual ID number (pno) and a household ID number (hhid). I have tried the following code:
    egen tag = tag(pno hhid)
    egen distinct = total(pno), by (hhid)

    However, the distinct variable sums the actual pno values. So if 3 individuals live in the same household, distinct won't be 3; it will be some other number. How can I solve this?

  • #2
    If your data set is organized such that there is only one observation for each combination of pno and hhid:
    Code:
    isid hhid pno, sort
    by hhid (pno): gen wanted = _N
    If your data set can contain more than one observation for a given pno and hhid:
    Code:
    by hhid (pno), sort: gen wanted = sum(pno != pno[_n-1])
    by hhid (pno): replace wanted = wanted[_N]
    In the future, when asking for help with code, please always use the -dataex- command and show example data. Although sometimes, as here, it is possible to give an answer that has a reasonable probability of being correct, this is usually not the case. Moreover, such answers are necessarily based on experience-based guesses or intuitions about the nature of your data. When those guesses are wrong, both you and the person trying to help you have wasted their time as you end up with useless code. To avoid this, a -dataex- based example provides all of the information needed to develop and test a solution.
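A variant closer to the original attempt, as a minimal sketch on made-up data (the toy values below are assumptions, not the poster's data). egen's tag() marks one observation per distinct hhid-pno pair, and summing that marker within hhid counts distinct persons, whereas total(pno) adds up the ID values themselves.

```stata
* Toy data (hypothetical): person 2 in household 1 appears twice
clear
input long hhid byte pno
1 1
1 2
1 2
1 3
2 1
end

* tag is 1 for exactly one observation per distinct (hhid, pno) pair
egen byte tag = tag(hhid pno)

* Summing the tags within a household counts distinct persons
egen n_persons = total(tag), by(hhid)

list, sepby(hhid)

* Self-checks: household 1 has 3 distinct persons, household 2 has 1
assert n_persons == 3 if hhid == 1
assert n_persons == 1 if hhid == 2
```

The result matches the second code block above when pno is repeated within a household, and equals _N when it is not.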



    • #3
      Thanks Clyde, I used the following code and it worked
      sort hhid pno
      by hhid: generate n1 = _n
      by hhid: generate n2 = _N

      However, I have another question. Using hhid, I can see the age of each individual within each household. I am looking for a way to remove observations that meet certain conditions: I want to remove teenagers and other household members who aren't children, so that each household contains only the mother and her children. I am happy to include a dataex but not sure how to go about it



      • #4
        I am happy to include a dataex but not sure how to go about it
        Start by -use-ing your dataset.

        Now think about the variables that are relevant to your question. Let's say they are called hhid, pno, and age. And there is probably some variable in your data set that indicates whether the person is a father, mother, or child (and maybe some other possibilities). Let me guess that that one is called relation. Then type
        Code:
        dataex hhid pno age relation
        in the Command window.

        Stata will respond in the Results window. The output will look something like this:
        [quote]
        ----------------------- copy starting from the next line -----------------------
        [CODE ]
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str18 make int(price mpg) byte foreign
        "AMC Concord" 4099 22 0
        "AMC Pacer" 4749 17 0
        "AMC Spirit" 3799 22 0
        "Buick Century" 4816 20 0
        "Buick Electra" 7827 15 0
        "Buick LeSabre" 5788 18 0
        "Buick Opel" 4453 26 0
        "Buick Regal" 5189 20 0
        "Buick Riviera" 10372 16 0
        "Buick Skylark" 4082 19 0
        "Cad. Deville" 11385 14 0
        "Cad. Eldorado" 14500 14 0
        "Cad. Seville" 15906 21 0
        "Chev. Chevette" 3299 29 0
        "Chev. Impala" 5705 16 0
        "Chev. Malibu" 4504 22 0
        "Chev. Monte Carlo" 5104 22 0
        "Chev. Monza" 3667 24 0
        "Chev. Nova" 3955 19 0
        "Dodge Colt" 3984 30 0
        "Dodge Diplomat" 4010 18 0
        "Dodge Magnum" 5886 16 0
        "Dodge St. Regis" 6342 17 0
        "Ford Fiesta" 4389 28 0
        "Ford Mustang" 4187 21 0
        "Linc. Continental" 11497 12 0
        "Linc. Mark V" 13594 12 0
        "Linc. Versailles" 13466 14 0
        "Merc. Bobcat" 3829 22 0
        "Merc. Cougar" 5379 14 0
        "Merc. Marquis" 6165 15 0
        "Merc. Monarch" 4516 18 0
        "Merc. XR-7" 6303 14 0
        "Merc. Zephyr" 3291 20 0
        "Olds 98" 8814 21 0
        "Olds Cutl Supr" 5172 19 0
        "Olds Cutlass" 4733 19 0
        "Olds Delta 88" 4890 18 0
        "Olds Omega" 4181 19 0
        "Olds Starfire" 4195 24 0
        "Olds Toronado" 10371 16 0
        "Plym. Arrow" 4647 28 0
        "Plym. Champ" 4425 34 0
        "Plym. Horizon" 4482 25 0
        "Plym. Sapporo" 6486 26 0
        "Plym. Volare" 4060 18 0
        "Pont. Catalina" 5798 18 0
        "Pont. Firebird" 4934 18 0
        "Pont. Grand Prix" 5222 19 0
        "Pont. Le Mans" 4723 19 0
        "Pont. Phoenix" 4424 19 0
        "Pont. Sunbird" 4172 24 0
        "Audi 5000" 9690 17 1
        "Audi Fox" 6295 23 1
        "BMW 320i" 9735 25 1
        "Datsun 200" 6229 23 1
        "Datsun 210" 4589 35 1
        "Datsun 510" 5079 24 1
        "Datsun 810" 8129 21 1
        "Fiat Strada" 4296 21 1
        "Honda Accord" 5799 25 1
        "Honda Civic" 4499 28 1
        "Mazda GLC" 3995 30 1
        "Peugeot 604" 12990 14 1
        "Renault Le Car" 3895 26 1
        "Subaru" 3798 35 1
        "Toyota Celica" 5899 18 1
        "Toyota Corolla" 3748 31 1
        "Toyota Corona" 5719 18 1
        "VW Dasher" 7140 23 1
        "VW Diesel" 5397 41 1
        "VW Rabbit" 4697 25 1
        "VW Scirocco" 6850 25 1
        "Volvo 260" 11995 17 1
        end
        label values foreign origin
        label def origin 0 "Domestic", modify
        label def origin 1 "Foreign", modify
        [/CODE ]
        ------------------ copy up to and including the previous line ------------------
        [/quote]

        Now use your mouse to highlight everything between (but not including) "----------------------- copy starting from the next line -----------------------" and "------------------ copy up to and including the previous line ------------------", and copy it to your clipboard. Then paste it into your Forum post.


        That's all there is to it.
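To keep a posted example short, the listing can be restricted. The lines below are a small sketch using the auto dataset that ships with Stata (the same data as in the sample output above); the choice of five or ten observations is arbitrary.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Show only the first five observations of the selected variables
dataex make price mpg foreign in 1/5

* Alternatively, cap the number of observations with the count() option
dataex make price mpg foreign, count(10)
```

An if qualifier can be used in the same way to show only the households that illustrate the problem.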








        • #5
          [CODE]
          clear
          input long g_hidp byte(g_pno g_sex) int g_dvage byte(g_nch5to15 g_nch10to15 g_nch10 g_n1619abs) long g_pns1pid byte(g_pns1pno g_pns1sex) long g_pns2pid byte(g_pns2pno g_pns2sex g_hhsize g_nkids_dv) float g_fihhmnlabgrs_dv
          68020412 1 2 45 0 0 0 0 -8 0 -8 -8 0 -8 4 0 990.38
          68020412 2 2 20 0 0 0 0 68006127 1 2 -8 0 -8 4 0 990.38
          68020412 3 2 23 -7 -7 -7 -7 68006127 1 2 68020564 4 1 4 0 990.38
          68027212 1 2 78 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68040812 1 2 57 0 0 0 0 -8 0 -8 -8 0 -8 1 0 1300
          68047612 2 2 29 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3750
          68054412 1 2 51 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3328.36
          68068012 1 2 45 2 2 1 0 -8 0 -8 -8 0 -8 3 2 0
          68074812 1 2 22 0 0 0 0 -8 0 -8 -8 0 -8 2 0 4124.18
          68081612 1 2 36 2 1 0 0 -8 0 -8 -8 0 -8 3 2 468
          68095212 1 2 78 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2268.17
          68102012 1 2 79 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68108812 1 2 43 0 0 0 0 -8 0 -8 -8 0 -8 4 0 0
          68115612 1 2 24 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3414.67
          68122412 1 2 43 1 1 0 0 -8 0 -8 -8 0 -8 4 1 2289.79
          68122412 3 2 18 0 0 0 0 68029927 1 2 68029931 2 1 4 1 2289.79
          68129212 1 2 67 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68136012 2 2 28 1 0 0 0 -8 0 -8 -8 0 -8 3 1 3630
          68142812 2 2 35 0 0 0 0 -8 0 -8 -8 0 -8 2 0 9000
          68156412 1 2 46 0 0 0 1 -8 0 -8 -8 0 -8 2 0 450.67
          68156412 2 2 17 0 0 0 0 68037407 1 2 -8 0 -8 2 0 450.67
          68170012 1 2 45 2 2 0 0 -8 0 -8 -8 0 -8 4 2 6933.33
          68176812 2 2 44 1 0 0 0 -8 0 -8 -8 0 -8 3 1 7263
          68190412 1 2 39 3 1 1 0 -8 0 -8 -8 0 -8 5 3 1192.3
          68197212 1 2 68 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68204012 1 2 53 0 0 0 1 -8 0 -8 -8 0 -8 3 0 5888.33
          68210812 2 2 70 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68217612 1 2 42 2 2 0 0 -8 0 -8 -8 0 -8 3 2 1416
          68238012 2 2 55 0 0 0 0 -8 0 -8 -8 0 -8 3 0 9081.17
          68251612 2 2 47 0 0 0 0 -8 0 -8 -8 0 -8 2 0 7535
          68265212 2 2 50 2 2 0 0 -8 0 -8 -8 0 -8 4 2 787
          68265892 1 2 20 0 0 0 0 -8 0 -8 -8 0 -8 1 0 206.67
          68272012 2 2 66 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68285612 1 2 25 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2663.56
          68292412 2 2 42 1 0 0 0 -8 0 -8 -8 0 -8 3 1 5268.08
          68306012 1 2 48 1 1 0 0 -8 0 -8 -8 0 -8 3 1 913
          68312812 1 2 45 1 1 0 0 -8 0 -8 -8 0 -8 3 1 5100.58
          68326412 2 2 56 0 0 0 0 -8 0 -8 -8 0 -8 2 0 1900
          68333212 1 2 28 2 0 0 0 -8 0 -8 -8 0 -8 4 2 3040.16
          68340012 2 2 48 0 0 0 0 -8 0 -8 -8 0 -8 4 0 7403.33
          68340012 4 2 23 0 0 0 0 68068007 1 1 68068011 2 2 4 0 7403.33
          68360412 2 2 56 -7 -7 -7 -7 -8 0 -8 -8 0 -8 2 0 8300.94
          68380812 1 2 66 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68394412 1 2 26 0 0 0 0 -8 0 -8 -8 0 -8 4 2 0
          68408012 1 2 65 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68414812 1 2 62 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68428412 1 2 18 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68428412 2 2 19 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68435212 1 2 20 0 0 0 0 -8 0 -8 -8 0 -8 4 0 1342.33
          68435212 3 2 51 0 0 0 0 -8 0 -8 -8 0 -8 4 0 1342.33
          68435212 4 2 19 0 0 0 0 68442808 3 2 -8 0 -8 4 0 1342.33
          68455612 1 2 66 0 0 0 0 -8 0 -8 -8 0 -8 1 0 750
          68469212 2 2 61 0 0 0 0 -8 0 -8 -8 0 -8 4 0 1621.08
          68476012 1 2 48 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2360.19
          68489612 1 2 34 1 0 0 0 -8 0 -8 -8 0 -8 3 1 6633.33
          68496412 2 2 29 1 0 0 0 -8 0 -8 -8 0 -8 4 2 5304.89
          68503212 1 2 32 2 1 0 0 -8 0 -8 -8 0 -8 4 2 9183
          68510012 1 2 75 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68516812 1 2 65 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2166.67
          68516812 2 2 35 0 0 0 0 68120367 1 2 -8 0 -8 2 0 2166.67
          68530412 1 2 43 0 0 0 0 -8 0 -8 -8 0 -8 4 1 4788.2
          68530412 3 2 19 0 0 0 0 68121047 1 2 68121051 2 1 4 1 4788.2
          68537212 1 2 62 0 0 0 0 -8 0 -8 -8 0 -8 1 0 990
          68544012 1 2 52 0 0 0 1 -8 0 -8 -8 0 -8 2 0 4392
          68557612 1 2 49 0 0 0 1 -8 0 -8 -8 0 -8 3 0 2933.33
          68557612 3 2 19 0 0 0 0 68125127 1 2 -8 0 -8 3 0 2933.33
          68571212 2 2 57 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2826.67
          68578012 2 2 47 0 0 0 0 -8 0 -8 -8 0 -8 3 0 6619
          68578012 3 2 19 0 0 0 0 68132607 1 1 68132611 2 2 3 0 6619
          68584812 1 2 40 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68591612 2 2 61 0 0 0 0 -8 0 -8 -8 0 -8 2 0 800
          68598412 1 2 78 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
          68605212 1 2 54 1 1 0 0 -8 0 -8 -8 0 -8 4 2 5552.59
          68612012 1 2 23 0 0 0 0 -8 0 -8 -8 0 -8 3 2 78
          68618812 1 2 39 2 2 0 0 68142135 3 1 -8 0 -8 5 2 3502.11
          68625612 2 2 39 1 1 1 0 -8 0 -8 -8 0 -8 3 1 9750
          68632412 2 2 68 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68639212 2 2 56 0 0 0 0 -8 0 -8 -8 0 -8 3 0 4666.67
          68652812 1 2 63 0 0 0 0 -8 0 -8 -8 0 -8 3 0 1297.49
          68659612 1 2 58 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3833.33
          68680012 1 2 46 0 0 0 0 -8 0 -8 -8 0 -8 1 0 5906.67
          68693612 1 2 50 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3953
          68693612 2 2 21 0 0 0 0 68157767 1 2 -8 0 -8 2 0 3953
          68700412 2 2 35 1 1 1 0 -8 0 -8 -8 0 -8 3 1 4800
          68707212 2 2 32 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3030.83
          68714012 1 2 72 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
          68720812 1 2 60 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2097.14
          68727612 1 2 37 2 1 0 0 -8 0 -8 -8 0 -8 4 2 3125
          68734412 1 2 58 0 0 0 0 -8 0 -8 -8 0 -8 2 0 4104
          68748012 1 2 30 0 0 0 0 -8 0 -8 -8 0 -8 4 2 1001.32
          68768412 1 2 59 0 0 0 0 -8 0 -8 -8 0 -8 3 0 1461.06
          68782012 1 2 45 3 1 0 0 -8 0 -8 -8 0 -8 5 3 5037
          68802412 2 2 40 1 0 0 0 -8 0 -8 -8 0 -8 4 2 6833.33
          68809212 1 2 53 0 0 0 1 -8 0 -8 -8 0 -8 2 0 3500
          68816012 2 2 56 0 0 0 0 -8 0 -8 -8 0 -8 3 0 8350
          68822812 1 2 46 0 0 0 0 -8 0 -8 -8 0 -8 3 0 5450
          68836412 1 2 53 0 0 0 0 -8 0 -8 -8 0 -8 3 0 1065.55
          68836412 3 2 25 0 0 0 0 68191087 1 2 -8 0 -8 3 0 1065.55
          68843212 1 2 43 2 1 0 0 -8 0 -8 -8 0 -8 3 2 1605
          68856812 1 2 62 0 0 0 0 -8 0 -8 -8 0 -8 3 0 499.68
          end
          label values g_pno g_pno
          label values g_sex g_sex
          label def g_sex 2 "female", modify
          label values g_dvage g_dvage
          label values g_nch5to15 g_nch5to15
          label def g_nch5to15 -7 "proxy", modify
          label values g_nch10to15 g_nch10to15
          label def g_nch10to15 -7 "proxy", modify
          label values g_nch10 g_nch10
          label def g_nch10 -7 "proxy", modify
          label values g_n1619abs g_n1619abs
          label def g_n1619abs -7 "proxy", modify
          label values g_pns1pid g_pns1pid
          label def g_pns1pid -8 "inapplicable", modify
          label values g_pns1pno g_pns1pno
          label def g_pns1pno 0 "not in hh", modify
          label values g_pns1sex g_pns1sex
          label def g_pns1sex -8 "inapplicable", modify
          label def g_pns1sex 1 "male", modify
          label def g_pns1sex 2 "female", modify
          label values g_pns2pid g_pns2pid
          label def g_pns2pid -8 "inapplicable", modify
          label values g_pns2pno g_pns2pno
          label def g_pns2pno 0 "not in hh", modify
          label values g_pns2sex g_pns2sex
          label def g_pns2sex -8 "inapplicable", modify
          label def g_pns2sex 1 "male", modify
          label def g_pns2sex 2 "female", modify
          label values g_hhsize g_hhsize
          label values g_nkids_dv g_nkids_dv
          label def g_nkids_dv 0 "none", modify
          label values g_fihhmnlabgrs_dv g_fihhmnlabgrs_dvt
          [/CODE]

          The above is the dataex output. Here g_pns1pno is the first parent's person number, g_pns1sex is the first parent's sex, and g_pns1pid is the first parent's personal identifier.

          I am trying to find a way to remove household members who are neither parents nor children aged 0 to 15.



          • #6
            You will need to explain what these variables are. While I can infer that g_hidp and g_pno are household and person identifiers, and g_sex is probably the sex of the person the observation refers to, I cannot interpret the others. I might guess that g_dvage is the person's age, but I'm not sure what the dv means. Moreover, if you are talking about keeping children age 0 to 15 in the data set, at least in the example you show, there aren't any such: the lowest value of g_dvage is 17.

            Then there are these variables g_nch5to15, g_nch10to15, g_nch10, and g_n1619abs that I could imagine being relevant to, perhaps the number of, children in those age groups. But there are some with value -7, labeled "proxy", that I am baffled by. Is this "magic number" encoding of missing values? If so, what does "proxy" mean? And there are other problems with these. I notice that the g_nch* variables can take on different values for different people in the same household. So I guess g_nch* really doesn't tell us the number of children in that household in these age groups. So what does it tell us?

            Then there are these g_pns#* variables (# = 1 or 2, * = pid, pno or sex): what are these about?

            There is a variable g_hhsize, which I might guess is the household size. And it is consistent within households. But I would expect it to be equal to the number of g_pno's in the household. It isn't. In fact, it hardly ever is the same.

            Finally, I can't even begin to guess what g_fihhmnlabgrs_dv is supposed to be.

            Apart from clearing up these mysteries, to approach this particular problem of removing the family members that are not the parents or children age 0 to 15, you need to tell which variable gives the person's age, and which variable tells us who is a parent. (I could imagine that g_dvage gives the person's age, but I don't see anything that says who is a parent.)



            • #7
              I renamed some of the variables so that they make a little more sense. The dataex is
              [CODE]
              clear
              input long g_hidp byte(g_pno g_sex) int age byte(g_nch5to15 g_nch10to15 g_nch10 g_n1619abs) long parent1_pid byte(parent1_pno parent1_sex) long parent2_pid byte(parent2_pno parent2_sex g_hhsize g_nkids_dv) float grosshousehold_income
              68020412 1 2 45 0 0 0 0 -8 0 -8 -8 0 -8 4 0 990.38
              68020412 2 2 20 0 0 0 0 68006127 1 2 -8 0 -8 4 0 990.38
              68020412 3 2 23 -7 -7 -7 -7 68006127 1 2 68020564 4 1 4 0 990.38
              68027212 1 2 78 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68040812 1 2 57 0 0 0 0 -8 0 -8 -8 0 -8 1 0 1300
              68047612 2 2 29 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3750
              68054412 1 2 51 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3328.36
              68068012 1 2 45 2 2 1 0 -8 0 -8 -8 0 -8 3 2 0
              68074812 1 2 22 0 0 0 0 -8 0 -8 -8 0 -8 2 0 4124.18
              68081612 1 2 36 2 1 0 0 -8 0 -8 -8 0 -8 3 2 468
              68095212 1 2 78 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2268.17
              68102012 1 2 79 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68108812 1 2 43 0 0 0 0 -8 0 -8 -8 0 -8 4 0 0
              68115612 1 2 24 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3414.67
              68122412 1 2 43 1 1 0 0 -8 0 -8 -8 0 -8 4 1 2289.79
              68122412 3 2 18 0 0 0 0 68029927 1 2 68029931 2 1 4 1 2289.79
              68129212 1 2 67 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68136012 2 2 28 1 0 0 0 -8 0 -8 -8 0 -8 3 1 3630
              68142812 2 2 35 0 0 0 0 -8 0 -8 -8 0 -8 2 0 9000
              68156412 1 2 46 0 0 0 1 -8 0 -8 -8 0 -8 2 0 450.67
              68156412 2 2 17 0 0 0 0 68037407 1 2 -8 0 -8 2 0 450.67
              68170012 1 2 45 2 2 0 0 -8 0 -8 -8 0 -8 4 2 6933.33
              68176812 2 2 44 1 0 0 0 -8 0 -8 -8 0 -8 3 1 7263
              68190412 1 2 39 3 1 1 0 -8 0 -8 -8 0 -8 5 3 1192.3
              68197212 1 2 68 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68204012 1 2 53 0 0 0 1 -8 0 -8 -8 0 -8 3 0 5888.33
              68210812 2 2 70 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68217612 1 2 42 2 2 0 0 -8 0 -8 -8 0 -8 3 2 1416
              68238012 2 2 55 0 0 0 0 -8 0 -8 -8 0 -8 3 0 9081.17
              68251612 2 2 47 0 0 0 0 -8 0 -8 -8 0 -8 2 0 7535
              68265212 2 2 50 2 2 0 0 -8 0 -8 -8 0 -8 4 2 787
              68265892 1 2 20 0 0 0 0 -8 0 -8 -8 0 -8 1 0 206.67
              68272012 2 2 66 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68285612 1 2 25 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2663.56
              68292412 2 2 42 1 0 0 0 -8 0 -8 -8 0 -8 3 1 5268.08
              68306012 1 2 48 1 1 0 0 -8 0 -8 -8 0 -8 3 1 913
              68312812 1 2 45 1 1 0 0 -8 0 -8 -8 0 -8 3 1 5100.58
              68326412 2 2 56 0 0 0 0 -8 0 -8 -8 0 -8 2 0 1900
              68333212 1 2 28 2 0 0 0 -8 0 -8 -8 0 -8 4 2 3040.16
              68340012 2 2 48 0 0 0 0 -8 0 -8 -8 0 -8 4 0 7403.33
              68340012 4 2 23 0 0 0 0 68068007 1 1 68068011 2 2 4 0 7403.33
              68360412 2 2 56 -7 -7 -7 -7 -8 0 -8 -8 0 -8 2 0 8300.94
              68380812 1 2 66 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68394412 1 2 26 0 0 0 0 -8 0 -8 -8 0 -8 4 2 0
              68408012 1 2 65 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68414812 1 2 62 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68428412 1 2 18 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68428412 2 2 19 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68435212 1 2 20 0 0 0 0 -8 0 -8 -8 0 -8 4 0 1342.33
              68435212 3 2 51 0 0 0 0 -8 0 -8 -8 0 -8 4 0 1342.33
              68435212 4 2 19 0 0 0 0 68442808 3 2 -8 0 -8 4 0 1342.33
              68455612 1 2 66 0 0 0 0 -8 0 -8 -8 0 -8 1 0 750
              68469212 2 2 61 0 0 0 0 -8 0 -8 -8 0 -8 4 0 1621.08
              68476012 1 2 48 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2360.19
              68489612 1 2 34 1 0 0 0 -8 0 -8 -8 0 -8 3 1 6633.33
              68496412 2 2 29 1 0 0 0 -8 0 -8 -8 0 -8 4 2 5304.89
              68503212 1 2 32 2 1 0 0 -8 0 -8 -8 0 -8 4 2 9183
              68510012 1 2 75 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68516812 1 2 65 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2166.67
              68516812 2 2 35 0 0 0 0 68120367 1 2 -8 0 -8 2 0 2166.67
              68530412 1 2 43 0 0 0 0 -8 0 -8 -8 0 -8 4 1 4788.2
              68530412 3 2 19 0 0 0 0 68121047 1 2 68121051 2 1 4 1 4788.2
              68537212 1 2 62 0 0 0 0 -8 0 -8 -8 0 -8 1 0 990
              68544012 1 2 52 0 0 0 1 -8 0 -8 -8 0 -8 2 0 4392
              68557612 1 2 49 0 0 0 1 -8 0 -8 -8 0 -8 3 0 2933.33
              68557612 3 2 19 0 0 0 0 68125127 1 2 -8 0 -8 3 0 2933.33
              68571212 2 2 57 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2826.67
              68578012 2 2 47 0 0 0 0 -8 0 -8 -8 0 -8 3 0 6619
              68578012 3 2 19 0 0 0 0 68132607 1 1 68132611 2 2 3 0 6619
              68584812 1 2 40 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68591612 2 2 61 0 0 0 0 -8 0 -8 -8 0 -8 2 0 800
              68598412 1 2 78 0 0 0 0 -8 0 -8 -8 0 -8 1 0 0
              68605212 1 2 54 1 1 0 0 -8 0 -8 -8 0 -8 4 2 5552.59
              68612012 1 2 23 0 0 0 0 -8 0 -8 -8 0 -8 3 2 78
              68618812 1 2 39 2 2 0 0 68142135 3 1 -8 0 -8 5 2 3502.11
              68625612 2 2 39 1 1 1 0 -8 0 -8 -8 0 -8 3 1 9750
              68632412 2 2 68 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68639212 2 2 56 0 0 0 0 -8 0 -8 -8 0 -8 3 0 4666.67
              68652812 1 2 63 0 0 0 0 -8 0 -8 -8 0 -8 3 0 1297.49
              68659612 1 2 58 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3833.33
              68680012 1 2 46 0 0 0 0 -8 0 -8 -8 0 -8 1 0 5906.67
              68693612 1 2 50 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3953
              68693612 2 2 21 0 0 0 0 68157767 1 2 -8 0 -8 2 0 3953
              68700412 2 2 35 1 1 1 0 -8 0 -8 -8 0 -8 3 1 4800
              68707212 2 2 32 0 0 0 0 -8 0 -8 -8 0 -8 2 0 3030.83
              68714012 1 2 72 0 0 0 0 -8 0 -8 -8 0 -8 2 0 0
              68720812 1 2 60 0 0 0 0 -8 0 -8 -8 0 -8 2 0 2097.14
              68727612 1 2 37 2 1 0 0 -8 0 -8 -8 0 -8 4 2 3125
              68734412 1 2 58 0 0 0 0 -8 0 -8 -8 0 -8 2 0 4104
              68748012 1 2 30 0 0 0 0 -8 0 -8 -8 0 -8 4 2 1001.32
              68768412 1 2 59 0 0 0 0 -8 0 -8 -8 0 -8 3 0 1461.06
              68782012 1 2 45 3 1 0 0 -8 0 -8 -8 0 -8 5 3 5037
              68802412 2 2 40 1 0 0 0 -8 0 -8 -8 0 -8 4 2 6833.33
              68809212 1 2 53 0 0 0 1 -8 0 -8 -8 0 -8 2 0 3500
              68816012 2 2 56 0 0 0 0 -8 0 -8 -8 0 -8 3 0 8350
              68822812 1 2 46 0 0 0 0 -8 0 -8 -8 0 -8 3 0 5450
              68836412 1 2 53 0 0 0 0 -8 0 -8 -8 0 -8 3 0 1065.55
              68836412 3 2 25 0 0 0 0 68191087 1 2 -8 0 -8 3 0 1065.55
              68843212 1 2 43 2 1 0 0 -8 0 -8 -8 0 -8 3 2 1605
              68856812 1 2 62 0 0 0 0 -8 0 -8 -8 0 -8 3 0 499.68
              end
              label values g_pno g_pno
              label values g_sex g_sex
              label def g_sex 2 "female", modify
              label values age g_dvage
              label values g_nch5to15 g_nch5to15
              label def g_nch5to15 -7 "proxy", modify
              label values g_nch10to15 g_nch10to15
              label def g_nch10to15 -7 "proxy", modify
              label values g_nch10 g_nch10
              label def g_nch10 -7 "proxy", modify
              label values g_n1619abs g_n1619abs
              label def g_n1619abs -7 "proxy", modify
              label values parent1_pid g_pns1pid
              label def g_pns1pid -8 "inapplicable", modify
              label values parent1_pno g_pns1pno
              label def g_pns1pno 0 "not in hh", modify
              label values parent1_sex g_pns1sex
              label def g_pns1sex -8 "inapplicable", modify
              label def g_pns1sex 1 "male", modify
              label def g_pns1sex 2 "female", modify
              label values parent2_pid g_pns2pid
              label def g_pns2pid -8 "inapplicable", modify
              label values parent2_pno g_pns2pno
              label def g_pns2pno 0 "not in hh", modify
              label values parent2_sex g_pns2sex
              label def g_pns2sex -8 "inapplicable", modify
              label def g_pns2sex 1 "male", modify
              label def g_pns2sex 2 "female", modify
              label values g_hhsize g_hhsize
              label values g_nkids_dv g_nkids_dv
              label def g_nkids_dv 0 "none", modify
              label values grosshousehold_income g_fihhmnlabgrs_dvt
              [/CODE]

              g_hidp is the household ID, g_pno is the person number, age is the person's age, and g_nch5to15, g_nch10to15, g_nch10, and g_n1619abs are the different child age bands; g_hhsize is the household size. In addition, parent1_sex, parent1_pno, and parent1_pid are the identifiers for the first parent. The same goes for the second parent.

              Values labeled proxy, and other negative values, need to be removed, and I will do that next. But I want to keep children aged 0 to 15 in the dataset. However, there may be children who are older than 15 but are considered as individuals. I hope that makes more sense.



              • #8
                Thank you. As noted earlier, the example data doesn't contain any children. But it does have a few parents. The following code reduces the data set to the parents (and it would also include children if there were any).

                Code:
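                * Copy the identifiers and the parent pointers into a separate frame
                frame put g_hidp g_pno parent*_pno, into(parents)
                frame parents {
                    * One row per person and parent slot; parent_pno holds the parent's pno
                    reshape long parent@_pno, i(g_hidp g_pno)
                    * A value of 0 is labeled "not in hh"
                    drop if parent_pno == 0
                    drop g_pno _j
                    * Keep one row per parent per household
                    duplicates drop
                }
                * Flag people whose pno is listed as a parent by someone in the same household
                frlink m:1 g_hidp g_pno, frame(parents g_hidp parent_pno)
                gen byte is_parent = !missing(parents)
                drop parents
                frame drop parents
                
                gen byte is_child_0_15 = inrange(age, 0, 15)
                
                keep if is_parent | is_child_0_15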



                • #9
                  Thank you, I'll see if it runs. I had a question about the above code: was parent*_pno supposed to be parent1_pno or parent2_pno? I wasn't sure whether this was a mistake.



                  • #10
                    It's not a mistake. parent*_pno is "wildcard" notation which Stata allows in lists of variable names. Stata expands it to mean both parent1_pno and parent2_pno. For a full explanation of this and other things one can do in lists of variable names, do read -help varlist-.
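A quick way to see what a wildcard expands to, sketched on a three-variable toy dataset (the variable named other is made up for illustration):

```stata
* Toy data: two variables match the pattern parent*_pno, one does not
clear
input byte(parent1_pno parent2_pno other)
1 2 0
end

* describe accepts a varlist, so the wildcard is expanded here too
describe parent*_pno

* unab stores the expanded variable list in a local macro
unab plist : parent*_pno
display "`plist'"

* Self-check: the pattern matches exactly the two parent variables
assert "`plist'" == "parent1_pno parent2_pno"
```

Running describe or unab on a pattern before using it in a command such as frame put is a cheap check that it picks up the intended variables and nothing else.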



                    • #11
                      Ah, I see. It makes sense, and it did work. Thank you.

                      I was also wondering: if I had first separated the sample into females and males and then run exactly the same code, it should work as well, right?



                      • #12
                        I was also wondering: if I had first separated the sample into females and males and then run exactly the same code, it should work as well, right?
                        Not necessarily. Suppose there is a mother whose only children are sons. When you split the data into males and females, this mother will never be found. She will not be listed as a parent in the female data set, and she won't appear at all in the male data set.



                        • #13
                          Sorry, I meant splitting the sample into mothers and fathers, since a mother's income may have a different impact on child health than a father's income.



                          • #14
                            I wouldn't split the data set beforehand. I would run the code as shown on the full data set. Then, to distinguish mothers from fathers:
                            Code:
                            gen byte is_mother = is_parent & g_sex == "female":g_sex
                            Then if you want to do separate analyses of mothers and fathers, you can create two data sets: one is children and mothers, the other is children and fathers.

                            Or, if you are specifically interested in contrasting the effects of mothers' and fathers' income on children's outcomes, use an interaction between is_mother and the income variable.
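The interaction approach might look like the sketch below. Everything in it is an assumption for illustration: the data are simulated, child_health and income are hypothetical variable names, and each observation is taken to be one parent-child pair with that parent's income attached. It only shows the factor-variable syntax, not a recommended model.

```stata
* Simulated parent-child pairs (purely illustrative, not the poster's data)
clear
set seed 12345
set obs 200
gen byte is_mother = runiform() < 0.5
gen double income = 1000 + 4000*runiform()
gen double child_health = 50 + 0.002*income + 0.001*is_mother*income + rnormal(0, 5)

* ## includes both main effects and the is_mother-by-income interaction
regress child_health i.is_mother##c.income

* Slope of child_health on income, separately for fathers (0) and mothers (1)
margins is_mother, dydx(income)
```

The coefficient on 1.is_mother#c.income estimates how much the income slope differs for mothers relative to fathers.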



                            • #15
                              That makes sense, thank you. If I do look at the contrasting effects of mothers' and fathers' income on children's outcomes, would I need to create variables at an aggregate level? For example, if there are two children in a household, should I look at the average health of the children in that household?

                              If so, I could take the sum of the health variable and divide it by the number of children aged 0 to 15 in the household.
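A minimal sketch of that household-level average, on made-up data. The variable health and its values are hypothetical; g_hidp, g_pno, and age follow the names used earlier in the thread. Wrapping the variable in cond() restricts the mean to children aged 0 to 15 while still assigning the result to every member of the household.

```stata
* Toy data (hypothetical): two households, three people each
clear
input long g_hidp byte g_pno int age float health
1 1 40 3
1 2 10 4
1 3  6 2
2 1 35 5
2 2 17 1
2 3 12 3
end

* Number of children aged 0 to 15 in each household
egen n_children = total(inrange(age, 0, 15)), by(g_hidp)

* Mean health over those children only; other members contribute missing
egen mean_child_health = mean(cond(inrange(age, 0, 15), health, .)), by(g_hidp)

list, sepby(g_hidp)

* Self-checks: household 1 averages 4 and 2; household 2 has a single child with 3
assert mean_child_health == 3 if g_hidp == 1
assert mean_child_health == 3 if g_hidp == 2
```

Unlike dividing a sum by n_children, mean() skips children whose health value is missing, so the two calculations agree only when health is observed for every child.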
