  • Counting the number of times a variable is 'present' in a household for the panel years

    Hello everyone.

    I have panel data covering four survey years, collected at 3-year intervals. I have a dummy variable equal to 1 if the characteristic of interest (growing maize) is present and 0 otherwise. In my data, some households grew maize in some but not all of the four years, some did not grow any maize at all, and some grew it in all four years. I want to keep only the households that grew maize at least once in the four years. Please help me with the command for this.
    Last edited by Glory Sibale; 01 Jun 2022, 04:45.

  • #2
    Glory:
    do you mean something along the following lines?
    Code:
    bysort panelid (timevar): egen wanted=count(growing_maize)
    keep if wanted>=1
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Originally posted by Carlo Lazzaro
      Glory:
      do you mean something along the following lines?
      Code:
      bysort panelid (timevar): egen wanted=count(growing_maize)
      keep if wanted>=1
      The -count()- function of egen counts nonmissing values. Either -total()- or -max()- will do here, assuming the maize variable is a 0/1 indicator. See this FAQ for more: https://www.stata.com/support/faqs/d...ble-recording/

      Code:
      clear
      set obs 5
      gen id=_n
      expand 5
      bys id: gen year= 1990+_n
      set seed 06012022
      gen maize= runiformint(0,1) & inlist(id, 2,4)
      l, sepby(id)
      *START HERE
      bys id: egen tokeep= max(maize)
      keep if tokeep
      l, sepby(id)
      Results:

      Code:
      . l, sepby(id)
      
           +-------------------+
           | id   year   maize |
           |-------------------|
        1. |  1   1991       0 |
        2. |  1   1992       0 |
        3. |  1   1993       0 |
        4. |  1   1994       0 |
        5. |  1   1995       0 |
           |-------------------|
        6. |  2   1991       1 |
        7. |  2   1992       1 |
        8. |  2   1993       1 |
        9. |  2   1994       0 |
       10. |  2   1995       1 |
           |-------------------|
       11. |  3   1991       0 |
       12. |  3   1992       0 |
       13. |  3   1993       0 |
       14. |  3   1994       0 |
       15. |  3   1995       0 |
           |-------------------|
       16. |  4   1991       1 |
       17. |  4   1992       1 |
       18. |  4   1993       1 |
       19. |  4   1994       1 |
       20. |  4   1995       0 |
           |-------------------|
       21. |  5   1991       0 |
       22. |  5   1992       0 |
       23. |  5   1993       0 |
       24. |  5   1994       0 |
       25. |  5   1995       0 |
           +-------------------+
      
      .
      . *START HERE
      
      .
      . bys id: egen tokeep= max(maize)
      
      .
      . keep if tokeep
      (15 observations deleted)
      
      .
      . l, sepby(id)
      
           +----------------------------+
           | id   year   maize   tokeep |
           |----------------------------|
        1. |  2   1991       1        1 |
        2. |  2   1992       1        1 |
        3. |  2   1993       1        1 |
        4. |  2   1994       0        1 |
        5. |  2   1995       1        1 |
           |----------------------------|
        6. |  4   1991       1        1 |
        7. |  4   1992       1        1 |
        8. |  4   1993       1        1 |
        9. |  4   1994       1        1 |
       10. |  4   1995       0        1 |
           +----------------------------+
      
      .
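
      To make the difference between the three -egen- functions concrete, here is a minimal sketch on made-up data. The variable names (id, wave, maize, n_nonmiss, n_grown, ever) are illustrative assumptions, not taken from the original poster's dataset.

      ```stata
      * Sketch on made-up data: 3 households, 4 waves each (all names are illustrative)
      clear
      input id wave maize
      1 1 0
      1 2 0
      1 3 0
      1 4 0
      2 1 0
      2 2 1
      2 3 0
      2 4 1
      3 1 1
      3 2 1
      3 3 1
      3 4 1
      end

      bysort id: egen n_nonmiss = count(maize)   // nonmissing observations: 4 for every household
      bysort id: egen n_grown   = total(maize)   // number of waves with maize: 0, 2, 4
      bysort id: egen ever      = max(maize)     // 1 if maize was grown in any wave: 0, 1, 1

      list, sepby(id)

      * count() cannot separate growers from non-growers; total() and max() can
      keep if ever == 1
      ```

      Because n_nonmiss is 4 for every household, a condition such as "keep if n_nonmiss >= 1" would drop nothing, whereas "keep if ever == 1" (or "keep if n_grown >= 1") drops household 1 only.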


      • #4
        Originally posted by Carlo Lazzaro
        Glory:
        do you mean something along the following lines?
        Code:
        bysort panelid (timevar): egen wanted=count(growing_maize)
        keep if wanted>=1
        Yes Carlo,

        But I did not necessarily want to use -count()-, since counting would then give me the total number of times they have grown maize. Instead, I have used -max()-.
        Thank you very much for your help.


        • #5
          Originally posted by Andrew Musau

          The -count()- function of egen counts nonmissing values. Either -total()- or -max()- will do here, assuming the maize variable is a 0/1 indicator. See this FAQ for more: https://www.stata.com/support/faqs/d...ble-recording/

          Yes, Andrew. I followed the link you attached to understand more, and it has worked. I have used the code below and have finally kept only the households with nonzero values. Thank you very much for your help.


          Code:
          egen farmer = max(maize), by(HHno)
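
          The line above only creates the indicator; the filtering step is not shown. A minimal sketch of the full sequence follows, assuming the variable names from this post (HHno, maize, farmer) and made-up toy values. One caveat worth hedging against: if maize is missing in every year for a household, -max()- returns missing, and a bare "keep if farmer" would keep that household, because Stata treats missing as nonzero (true).

          ```stata
          * Sketch: toy data using the thread's variable names; the values are made up
          clear
          input HHno year maize
          1 1 0
          1 2 1
          2 1 0
          2 2 0
          3 1 .
          3 2 .
          end

          egen farmer = max(maize), by(HHno)   // 1, 0, and missing for households 1, 2, 3

          * "keep if farmer" would also keep household 3 (missing evaluates as true),
          * so test for equality with 1 explicitly
          keep if farmer == 1

          * Check: every remaining observation belongs to a household that grew maize
          assert farmer == 1
          list, sepby(HHno)
          ```

          With a clean 0/1 indicator and no missing values, "keep if farmer" and "keep if farmer == 1" give the same result; the explicit comparison only matters when missing values are possible.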
