
  • Replacing a variable with missing values and creating an additional variable by group.

    I have DHS data in which hv001, hv002 and hvidx are the unique identifiers. hv112 is the mother's line number, recorded against each child (there may be more than one mother in a household), and hvidx is the line number of each household member. hv108 is the education variable. I want to create a dummy variable for the mother's education status (0 = not educated, 1 = educated) at the household level.

  • #2
    A data example using dataex would be helpful. You have done that in another post of yours, so I believe you already know how.

    Second, the rule needs to be stated more clearly. Given that there can be more than one mother, if one is educated and the other is not, what would the household-level indicator be? That needs to be explained.
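
To make the ambiguity concrete, here is a minimal sketch of two possible household-level rules ("any mother educated" versus "all mothers educated"). It assumes the DHS variable names given in the question, that hv108 is 0 for no schooling and positive for years completed, and that a mother is any member whose line number appears in hv112 within her household. The names is_mother, educated, hh_any_mother_educ and hh_all_mothers_educ are hypothetical, and special codes in hv108 are not handled.

```stata
* Build a list of (household, line number) pairs that appear as someone's mother
preserve
keep hv001 hv002 hv112
drop if missing(hv112)
duplicates drop
rename hv112 hvidx              // mother's line number becomes the member key
tempfile mothers
save `mothers'
restore

* Flag members who are mothers of at least one listed child
merge 1:1 hv001 hv002 hvidx using `mothers', keep(master match)
gen byte is_mother = _merge == 3
drop _merge

* Member-level indicator; left missing when hv108 is missing
gen byte educated = hv108 > 0 if !missing(hv108)

* Rule A: household coded 1 if at least one mother is educated
bysort hv001 hv002: egen byte hh_any_mother_educ = max(cond(is_mother, educated, .))

* Rule B: household coded 1 only if every mother is educated
bysort hv001 hv002: egen byte hh_all_mothers_educ = min(cond(is_mother, educated, .))
```

Households with no identified mother get missing values under both rules. The two indicators differ exactly in the households this post asks about, where one mother is educated and another is not.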



    • #3
      Ken Chui I completely missed dataex. Here is the data example for your perusal.

      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long hv001 byte(hv002 hvidx hv112 hv108)
      104 38 4 2 2
      104 38 2 . 8
      104 38 1 . 10
      104 42 5 2 0
      104 42 3 2 5
      104 42 4 2 0
      104 42 1 . 9
      104 42 2 . 9
      104 45 6 2 6
      104 45 7 2 6
      104 45 9 2 0
      104 45 5 2 6
      104 45 8 2 1
      104 45 3 . 5
      104 45 2 . 0
      104 45 4 . 9
      104 45 1 . 0
      104 46 3 2 3
      104 46 5 2 0
      104 46 6 2 0
      104 46 4 2 1
      104 46 7 2 0
      104 46 2 . 8
      104 46 1 . 9
      104 55 5 2 3
      104 55 4 2 5
      104 55 7 2 0
      104 55 6 2 2
      104 55 3 2 7
      104 55 1 . 2
      104 55 2 . 0
      104 60 11 2 2
      104 60 9 2 8
      104 60 10 2 3
      104 60 5 4 0
      104 60 6 4 0
      104 60 4 . 0
      104 60 2 . 0
      104 60 1 . 0
      104 60 8 . 9
      104 60 7 . 8
      104 60 3 . 0
      104 62 1 . 0
      104 62 6 . 9
      104 62 4 . 12
      104 62 2 . 0
      104 62 5 . 9
      104 62 3 . 9
      104 64 4 2 0
      104 64 3 2 0
      104 64 2 . 0
      104 64 1 . 8
      104 79 4 2 1
      104 79 5 2 0
      104 79 6 2 0
      104 79 3 2 3
      104 79 1 . 5
      104 79 2 . 8
      104 94 5 2 10
      104 94 3 . 0
      104 94 4 . 8
      104 94 2 . 0
      104 94 1 . 0
      104 96 7 2 0
      104 96 4 2 0
      104 96 5 2 4
      104 96 6 2 3
      104 96 3 . 8
      104 96 1 . 0
      104 96 2 . 0
      105 14 6 . 15
      105 14 7 . 11
      105 14 2 . 0
      105 14 1 . 0
      105 14 4 . 9
      105 14 5 . 0
      105 14 3 . 11
      105 21 5 2 6
      105 21 6 2 4
      105 21 3 . 15
      105 21 2 . 0
      105 21 1 . 0
      105 21 4 . 11
      105 22 3 2 2
      105 22 5 2 0
      105 22 4 2 0
      105 22 6 2 0
      105 22 2 . 0
      105 22 7 . 0
      105 22 1 . 8
      105 22 8 . 0
      105 27 3 2 0
      105 27 4 2 0
      105 27 5 7 0
      105 27 6 . 0
      105 27 7 . 0
      105 27 1 . 9
      105 27 2 . 5
      105 42 4 2 0
      105 42 3 2 0
      end
      label values hv112 HV112
      label values hv108 HV108

      I need a variable that identifies the mother's education for each child (below 5 years of age). The mother of a child can be identified through hv112. Because a household may contain more than one mother, we need to distinguish between them, as my analysis is at the child level.

      Thanks



      • #4
        Cross-posted at https://www.reddit.com/r/stata/comme...y_identifiers/

        Please note our policy on cross-posting, which is that you should tell us about it. r/stata has a similar policy: for them it's a rule.

        You didn't get a full answer here, and my guess is that it's because you haven't (obviously) given a rule for distinguishing different mothers.



        • #5
          Dear Nick Cox Thanks. I have deleted my post from reddit.

          hv112 is an identifier for mothers. The variable descriptions are as follows.


          hv001 - cluster number
          hv002 - household number
          hvidx - line number of the household member
          hv112 - line number of the child's mother in the household
          hv108 - education of the household member in single years

          I want to create a variable holding the mother's education for each child in their respective household.

          For example, in household hv001 = 104, hv002 = 42 there are 5 members, and the mother (hvidx 2) has 9 years of education. Looking at hv112 for this household, there are three children, so the mother's line number is repeated 3 times. For line numbers 3, 4 and 5 (hvidx), I want a new variable holding the mother's years of education, i.e. 9.
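
The lookup described here can be sketched with explicit subscripting inside by-groups. This is only an illustration, assuming hvidx runs 1, 2, ..., N within each household, so that after sorting, row k of a household is the member with line number k. The variable name mother_years is hypothetical.

```stata
* Within each household, sorted by line number, pick up hv108 from the row
* whose position equals the mother's line number (hv112).
* The -if- guard leaves the result missing where row hv112 is not member hv112,
* e.g. when line numbers have gaps.
bysort hv001 hv002 (hvidx): gen mother_years = hv108[hv112] if hvidx[hv112] == hv112

* Inspect the household described above
list hvidx hv112 hv108 mother_years if hv001 == 104 & hv002 == 42, noobs
```

With the example data, line numbers 3, 4 and 5 in that household get mother_years = 9, and members with missing hv112 get a missing value.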

          Thanks



          • #6
            Code:
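            * Give the variables descriptive names (optional)
            rename hv001 cluster
            rename hv002 household
            rename hvidx member_num
            rename hv112 mother_num
            rename hv108 member_edu
            
            * Verify that cluster, household and member_num uniquely identify observations
            isid cluster household member_num, sort
            
            * Copy each member's education into a separate frame to use as a lookup table
            frame put cluster household member_num member_edu, into(education)
            * Link each child's mother_num to the matching member_num in the lookup frame
            frlink m:1 cluster household mother_num, frame(education cluster household member_num)
            * Bring the linked member's education back as the mother's education
            frget mother_educ = member_edu, from(education)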
            Note: I find it difficult and frustrating to work with variable names like hv112, etc., so I have renamed them to something descriptive. You don't have to do that for this approach to work, so feel free to remove the -rename- commands here and restore the original variable names in the rest of the code.
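
The original question asked for a 0/1 indicator rather than years of schooling. A minimal sketch of that last step follows, assuming mother_educ was created by the code above and that 0 means no schooling. The variable name mother_educated, the label name mother_educated_lbl and the cutoff of 90 for special codes are assumptions; check the value label of hv108 in your own file before relying on them.

```stata
* 1 = mother has at least one year of schooling, 0 = none.
* The -if- qualifier keeps missing values missing (in Stata, missing > 0 is true).
gen byte mother_educated = mother_educ > 0 if !missing(mother_educ)

* Assumption: values of 90 or above, if present, are special codes
* (such as "don't know") rather than years of schooling.
replace mother_educated = . if mother_educ >= 90 & !missing(mother_educ)

label define mother_educated_lbl 0 "Not educated" 1 "Educated"
label values mother_educated mother_educated_lbl

* Quick check of the result, including missing values
tabulate mother_educated, missing
```

Observations without a linked mother, such as adults with missing mother_num, keep a missing value, so any child-level analysis should restrict the sample accordingly.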



            • #7
              Clyde Schechter Thanks Sir, it worked really well.
