  • Looping over many variables to create new variable "if any" ==0

    Dear Statalist:
    I am struggling with something that I know should be easy. My data is in long form, where I have multiple year observations for each respondent. I previously reshaped and merged data so that I have information about the respondent's children. I've pasted some of the data for one respondent below as an example. hhidpn is the id for the respondent. Each "kwork" variable corresponds to a child. That is, kwork1 is the first child's work status, kwork2 is the second child, etc. I'd like to create a variable that indicates "any child is not working" for each respondent for each year. Therefore, I need a way to indicate if any kwork* variables==0. I only went up to 8 in the table below, but I actually go up to kwork33, so it would be very helpful to write this as a loop rather than manually. Here, kwork4-kwork8 are missing because this respondent only has 3 children.
    hhidpn year kwork1 kwork2 kwork3 kwork4 kwork5 kwork6 kwork7 kwork8
    10059020 1998 1.work p 2.work f 0.not wo . . . . .
    10059020 2000 2.work f 2.work f 2.work f . . . . .
    10059020 2002 2.work f 2.work f 2.work f . . . . .
    10059020 2004 2.work f 2.work f 0.not wo . . . . .
    10059020 2006 2.work f 2.work f 0.not wo . . . . .
    10059020 2008 2.work f 2.work f 2.work f . . . . .
    10059020 2010 2.work f 2.work f 0.not wo . . . . .
    10059020 2012 2.work f 2.work f 0.not wo . . . . .
    10059020 2014 2.work f 2.work f 0.not wo . . . . .
    So far I have tried:

    Code:
    bysort hhidpn year: gen anyunemployedkid=.
    replace anyunemployedkid=1 if kwork*==0
    replace anyunemployedkid=0 if kwork*==1|kwork*==2
    and I get an error that "kwork* invalid name" (r198).

    I appreciate any help you can provide! Thank you!
    Last edited by Emily Ellis; 19 Jul 2022, 13:36.

  • #2
    Check out egen functions like anycount() and anymatch().
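
    A minimal sketch of that route, assuming the kwork variables are numeric with value labels, code "not working" as 0, and sit next to each other in the dataset so that kwork1-kwork33 picks up exactly those 33 variables. The name n_unemployedkids is made up for illustration. Both functions work row by row, so no bysort is needed when each observation is one respondent-year.

    ```stata
    * 1 if any of kwork1-kwork33 equals 0 in this row, 0 otherwise
    egen byte anyunemployedkid = anymatch(kwork1-kwork33), values(0)

    * number of children coded 0 in this row
    egen byte n_unemployedkids = anycount(kwork1-kwork33), values(0)
    ```

    Note that anymatch() returns 0, not missing, when every kwork variable is missing, so respondents with no children recorded end up with 0 unless that case is handled separately.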



    • #3
      There are lots of ways to go about this, as Nick has already alluded to. Your data example is insufficient to work with, and although you say it is in long format, the children's information is actually in wide format. Long format would make many operations like this easier, but it is not necessary for this task. I have created my own minimal example to show a few variations on the same technique, and it is straightforward to adapt to your own problem.

      The key idea is that if employment status is coded with 0=unemployed, and all other forms of employment are coded with a higher number, then the minimum value among the set of children is all you need to flag the existence of at least one unemployed child. The logic is nice because it works symmetrically with the max (where the max may indicate the presence of some characteristic). It also extends to the logical negation: in your example, not having any unemployed children means all children are employed.

      You don't say what you want to happen to households without children, but presumably you want the indicator to remain missing. If that is not the behaviour you intend, then you can modify the code easily enough.

      Code:
      clear
      input byte(id w1 w2 w3 w4)
      1 0 0 1 .
      2 1 2 1 2
      3 0 0 . .
      4 . . . .
      end
      
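      * Approach 1: row minimum across the children (rowmin() ignores missing values)
      egen byte min_work = rowmin(w1-w4)
      gen byte want1 = min_work==0 if !mi(min_work)
      
      * Approach 2: the same minimum built up in a loop over the variables
      gen byte want2 = .
      foreach v of varlist w1-w4 {
        qui replace want2 = min(want2, `v') if !mi(`v')
      }
      * turn the minimum into a 0/1 flag, leaving childless rows missing
      replace want2 = want2==0 if !mi(want2)
      
      * both approaches should agree
      assert want1==want2
      drop min_work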
      list
      Result

      Code:
      . list
      
           +----------------------------------------+
           | id   w1   w2   w3   w4   want1   want2 |
           |----------------------------------------|
        1. |  1    0    0    1    .       1       1 |
        2. |  2    1    2    1    2       0       0 |
        3. |  3    0    0    .    .       1       1 |
        4. |  4    .    .    .    .       .       . |
           +----------------------------------------+
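
      Continuing from the listing above, here is a sketch of the variations mentioned earlier: the logical negation (all children employed), the symmetric use of the max, and one way to give households without children a 0 instead of missing. The names all_employed, any_code2 and want_zero are made up for illustration, and treating 2 as the highest status code is an assumption taken from the example data.

      ```stata
      * run after the example above, so that w1-w4 and want1 exist
      egen byte min_work = rowmin(w1-w4)
      egen byte max_work = rowmax(w1-w4)

      * negation: every recorded child has a code above 0
      gen byte all_employed = min_work > 0 if !mi(min_work)

      * symmetric case: at least one child has the highest code (2 in this example)
      gen byte any_code2 = max_work == 2 if !mi(max_work)

      * households without children: 0 instead of missing
      gen byte want_zero = cond(mi(want1), 0, want1)

      drop min_work max_work
      ```

      To adapt this to the original question, w1-w4 would become kwork1-kwork33, assuming those variables are adjacent in the dataset.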



      • #4
        Thank you very much, Leonardo! Your explanation was very helpful and my data is now in the form I needed.
