  • Coding_column_row

    Dear Statalist community, I need to do the following exercise for my research. Can anyone please offer help?


    Code:
    tab Household_composition

    Relationship to head
    ---------------------
    Head 1
    Wife/Husband 2
    Son/Daughter 3
    Son-in-Law 4
    Grandchild 5
    Father/Mother 6
    Brother/Sister 7
    Parent-in-Law 8
    Nephew/Niece 9
    Sib-in-Law 10
    Other rel 11
    Servant/Others 12

    A household can contain members of any of the types in the table above. If the household is a nuclear family (i.e., "Head 1" with "Wife/Husband 2", with or without "Son/Daughter 3"), then the DUMMY variable should be 1. If the household has any other members in addition to these, then the DUMMY variable should be 0.

    I have a unique identifier for the household in which members live.

    I need to generate this DUMMY variable.
    regards,
    ajay
    Last edited by ajay pasi; 04 Dec 2022, 02:45.

  • #2
    You don't present a data example: after 3+ years on the forum and 72 posts you should know about dataex! If not, then please have a look at https://www.statalist.org/forums/help#stata

    The rules appear to be

    1. If 1 and 2 are present, but not any 4 to 12, assign 1.

    2. Otherwise if 1 and 2 are present, assign 0.

    3. Presence or absence of 3 is immaterial.

    4. Otherwise (presumably), assign missing.


    See https://www.stata.com/support/faqs/d...ble-recording/ for basic technique,

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(hh_id role)
    1 1
    1 2
    2 1
    2 2
    2 3
    3 1
    4 1
    4 2
    4 4
    end
    
    egen any1 = total(role == 1), by(hh_id)
    
    egen any2 = total(role == 2), by(hh_id)
    
    egen anyhigh = total(inrange(role, 4, .)), by(hh_id)
    
    gen wanted = cond(any1 & any2 & anyhigh, 0, cond(any1 & any2, 1, .))
    (1 missing value generated)
    
    list, sepby(hh_id)
    
         +-----------------------------------------------+
         | hh_id   role   any1   any2   anyhigh   wanted |
         |-----------------------------------------------|
      1. |     1      1      1      1         0        1 |
      2. |     1      2      1      1         0        1 |
         |-----------------------------------------------|
      3. |     2      1      1      1         0        1 |
      4. |     2      2      1      1         0        1 |
      5. |     2      3      1      1         0        1 |
         |-----------------------------------------------|
      6. |     3      1      1      0         0        . |
         |-----------------------------------------------|
      7. |     4      1      1      1         1        0 |
      8. |     4      2      1      1         1        0 |
      9. |     4      4      1      1         1        0 |
         +-----------------------------------------------+
    See https://www.stata-journal.com/articl...article=dm0099 for a survey, including some advocacy for the term indicator variable over the all-too-common term dummy variable.



    • #3
      Dear Nick, sorry for not providing a dataex example. I will keep that in mind in future.

      After running the code you provided, I am encountering a problem in the fourth line of the code:


      Code:
       egen any1 = total(role == 1), by(hh_id)
      
      . 
      . egen any2 = total(role == 2), by(hh_id)
      
      . 
      . egen anyhigh = total(inrange(role, 4, .)), by(hh_id)
      
      . 
      . gen wanted = cond(any1 & any2 & anyhigh, 0, cond(any1 & any2, 1, .))
      amp;any2 invalid name
      r(198);



      • #4
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input double hh_id float role
        102010101  1
        102010101  2
        102010101  3
        102010101  3
        102010101  3
        102010101  4
        102010101  5
        102010101  5
        102010101  5
        102010201  1
        102010201  3
        102010201  4
        102010201  5
        102010201  5
        102010201  5
        102010201  3
        102010201  4
        102010201  5
        102010201  5
        102010201  5
        102010201  3
        102010201  4
        102010201  5
        102010201  5
        102010201 12
        102010301  1
        102010301  2
        102010301  3
        102010301  3
        102010301  3
        102010301  3
        102010301  3
        102010401  1
        102010401  2
        102010401  3
        102010401  3
        102010401  3
        102010501  1
        102010501  2
        102010501  3
        102010501  4
        102010501  5
        102010501  3
        102010501  4
        102010501  3
        102010501  3
        102010501  3
        102010601  1
        102010601  3
        102010601  4
        102010601  5
        102010601  5
        102010601  3
        102010601  4
        102010601  5
        102010601  5
        102010601  3
        102010601  4
        102010701  1
        102010701  2
        102010701  3
        102010701  3
        102010701  3
        102010701  3
        102010701  3
        102010701  3
        102010801  1
        102010801  2
        102010801  3
        102010801  3
        102010801  3
        102010801  3
        102010801  3
        102010801  3
        102010901  1
        102010901  3
        102010901  4
        102010901  3
        102010901  3
        102010901  3
        102010901  3
        102010901  3
        102010902  1
        102010902  2
        102010902  3
        102010902  3
        102011001  1
        102011001  2
        102011001  3
        102011001  7
        102011001  7
        102011001  7
        102011001  7
        102011001  7
        102011001 10
        102011001  9
        102011001  9
        102011001  9
        102011001  6
        102011101  1
        end



        • #5
          Your problem is a consequence of the forum software sometimes showing you HTML rather than the results of rendering HTML.

          &amp;

          is just HTML code for the ampersand -- namely & -- which is what Stata needs in 3 places. Sorry for not spotting that.
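
          For reference, here is a minimal sketch of the corrected sequence, assuming the toy data and the variable names (any1, any2, anyhigh, wanted) from #2. The only point is that the fourth line must contain literal ampersands in all three places.

```stata
* Toy data from #2
clear
input float(hh_id role)
1 1
1 2
2 1
2 2
2 3
3 1
4 1
4 2
4 4
end

* Household-level counts of heads, spouses, and roles 4 and above
egen any1 = total(role == 1), by(hh_id)
egen any2 = total(role == 2), by(hh_id)
egen anyhigh = total(inrange(role, 4, .)), by(hh_id)

* Literal ampersands (&), not the HTML entity
gen wanted = cond(any1 & any2 & anyhigh, 0, cond(any1 & any2, 1, .))

list, sepby(hh_id)
```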



          • #6

            What should the fourth line of the code be so that it executes? Is it the following?

            gen nuclear_more = cond(any1 & any2 & anyhigh, 0, cond(any1 & any2, 1, .))
            Last edited by ajay pasi; 04 Dec 2022, 04:59.



            • #7
              Sorry, what is the question in #6? If you are unfamiliar with cond(), check out https://www.stata-journal.com/articl...article=pr0016
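
              As a small illustration of how the nested cond() call from #2 is evaluated (toy values made up for this example, not the poster's data): cond(x, a, b) returns a when x is true (nonzero) and b when x is zero, so the inner cond() is reached only when the outer condition fails.

```stata
* Three hypothetical households, already reduced to household-level counts
clear
input byte(any1 any2 anyhigh)
1 1 0
1 1 2
1 0 0
end

* Outer cond(): head, spouse, and someone with role >= 4  -> 0
* Inner cond(): head and spouse only                      -> 1
* Otherwise                                               -> missing
gen wanted = cond(any1 & any2 & anyhigh, 0, cond(any1 & any2, 1, .))

list
```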



              • #8
                Thank you so much sir, I figured it out.



                • #9



                  Follow-up question ( related to Nick's advice-- https://www.statalist.org/forums/for...56#post1691956)

                  Dear Nick, suppose I also have the age of each household member (variable age). I would then like to generate a Dummy that takes the value 1 if the household is a nuclear family (i.e., "Head 1" and "Wife/Husband 2", with or without "Son/Daughter 3"). On the other hand, the Dummy takes the value 0 if the head and spouse, with or without children, live with other adults who are older than both the wife and the husband.


                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input double hh_id int(role age)
                  102010101  1 57
                  102010101  2 49
                  102010101  3 19
                  102010101  3 14
                  102010101  3 29
                  102010101  4 26
                  102010101  5  7
                  102010101  5  4
                  102010101  5  2
                  102010201  1 76
                  102010201  3 42
                  102010201  4 40
                  102010201  5 20
                  102010201  5 19
                  102010201  5 13
                  102010201  3 37
                  102010201  4 33
                  102010201  5 13
                  102010201  5  8
                  102010201  5  6
                  102010201  3 31
                  102010201  4 26
                  102010201  5  8
                  102010201  5  3
                  102010301  1 45
                  102010301  2 43
                  102010301  3 23
                  102010301  3 20
                  102010301  3 17
                  102010301  3 14
                  102010301  3 12
                  102010401  1 57
                  102010401  2 47
                  102010401  3 20
                  102010401  3 14
                  102010401  3 13
                  102010501  1 50
                  102010501  2 38
                  102010501  3 27
                  102010501  4 25
                  102010501  5  1
                  102010501  3 23
                  102010501  4 21
                  102010501  3 18
                  102010501  3 16
                  102010501  3 12
                  102010601  1 72
                  102010601  3 39
                  102010601  4 29
                  102010601  5  4
                  102010601  5  1
                  102010601  3 37
                  102010601  4 33
                  102010601  5  7
                  102010601  5  3
                  102010601  3 27
                  102010601  4 24
                  102010701  1 52
                  102010701  2 42
                  102010701  3 26
                  102010701  3 24
                  102010701  3 20
                  102010701  3 18
                  102010701  3 16
                  102010701  3 15
                  102010801  1 42
                  102010801  2 37
                  102010801  3 18
                  102010801  3 14
                  102010801  3 13
                  102010801  3 12
                  102010801  3  3
                  102010801  3  0
                  102010901  1 48
                  102010901  3 21
                  102010901  4 19
                  102010901  3 20
                  102010901  3 17
                  102010901  3 14
                  102010901  3 14
                  102010901  3 12
                  102010902  1 27
                  102010902  2 25
                  102010902  3  4
                  102010902  3  1
                  102011001  1 37
                  102011001  2 30
                  102011001  3  3
                  102011001  7 33
                  102011001  7 31
                  102011001  7 29
                  102011001  7 24
                  102011001  7 22
                  102011001 11 70
                  102011001  9 25
                  102011001  9 19
                  102011001  9 18
                  102011001  6 67
                  102011101  1 31
                  102011101  2 28
                  end
                  label values role role
                  label def role 1 "Head 1", modify
                  label def role 2 "Wife/Husband 2", modify
                  label def role 3 "Son/Daughter 3", modify
                  label def role 4 "Child-in-Law 4", modify
                  label def role 5 "Grandchild 5", modify
                  label def role 6 "Father/Mother 6", modify
                  label def role 7 "Brother/Sister 7", modify
                  label def role 9 "Nephew/Niece 9", modify
                  label def role 10 "Sib-in-Law 10", modify
                  label def role 11 "Other-relatives 11", modify
                  regards
                  ajay
                  Last edited by ajay pasi; 06 Dec 2022, 02:51.



                  • #10
                    I think this should suffice:
                    Code:
                    sort hh_id
                    by hh_id: egen int max_head_age = max(cond(inlist(role,1,2),age,.))
                    by hh_id: egen int max_overall_age = max(age)
                    gen byte nuclear = (max_head_age==max_overall_age)
                    which produces:
                    Code:
                    . format hh_id %9.0f
                    . egen byte tag = tag(hh_id)
                    . li hh_id max_head_age max_overall_age nuclear if tag, noobs sep(0) ab(20)
                      +------------------------------------------------------+
                      |     hh_id   max_head_age   max_overall_age   nuclear |
                      |------------------------------------------------------|
                      | 102010101             57                57         1 |
                      | 102010201             76                76         1 |
                      | 102010301             45                45         1 |
                      | 102010401             57                57         1 |
                      | 102010501             50                50         1 |
                      | 102010601             72                72         1 |
                      | 102010701             52                52         1 |
                      | 102010801             42                42         1 |
                      | 102010901             48                48         1 |
                      | 102010902             27                27         1 |
                      | 102011001             37                70         0 |
                      | 102011101             31                31         1 |
                      +------------------------------------------------------+
                    Last edited by Hemanshu Kumar; 06 Dec 2022, 04:18.



                    • #11
                      So you need the maximum of husband age and wife age:


                      Code:
                      egen hw_age = max(cond(inlist(role, 1, 2), age, .)), by(hh_id)
                      and to check that both husband and wife are present

                      Code:
                      egen present1 = total(role == 1), by(hh_id)
                      egen present2 = total(role == 2), by(hh_id)
                      and to check whether there is anybody older

                      Code:
                      egen count_older = total(role > 3 & age > hw_age), by(hh_id)
                      and then the indicator (not dummy please (*)) is I think

                      Code:
                      gen wanted = cond(present1 & present2 & count_older, 0, cond(present1 & present2, 1, .))

                      (*) https://www.stata-journal.com/articl...article=dm0099 Section 2.



                      • #12
                        #10 is a good route. It seems safe to assume that parents are older than their children. But in my code, and to some extent in Hemanshu Kumar's code, we are being a little careless about the possibility of missing values. Paranoid code could use the condition


                        Code:
                         
                         role > 3 & role < . & age > hw_age & !missing(age, hw_age) 
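
                        A minimal sketch of how that condition could slot into the code from #11, using small made-up data (three hypothetical households, one with a relative whose age is missing). Note the assumption built into the guard: a member with unknown age is counted as not older.

```stata
* Hypothetical toy data: household 1 is nuclear, household 2 has an older
* parent, household 3 has a sibling whose age is missing
clear
input double hh_id int(role age)
1 1 40
1 2 38
1 3 10
2 1 40
2 2 38
2 6 70
3 1 40
3 2 38
3 7 .
end

* Maximum age of head and spouse, and presence of each
egen hw_age = max(cond(inlist(role, 1, 2), age, .)), by(hh_id)
egen present1 = total(role == 1), by(hh_id)
egen present2 = total(role == 2), by(hh_id)

* Count older members only where role, age, and hw_age are all observed
egen count_older = total(role > 3 & role < . & age > hw_age & !missing(age, hw_age)), by(hh_id)

gen wanted = cond(present1 & present2 & count_older, 0, cond(present1 & present2, 1, .))

list, sepby(hh_id)
```

                        Without the guard, the missing age in household 3 would count as larger than hw_age, because Stata treats missing values as larger than any number.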



                        • #13
                          Thanks, sirs (Hemanshu and Nick), appreciate the help.
