Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to count the number of transition status (on and off) in panel data?

    Dear all,
    I have a four-wave panel data with an interval of two years. The variable of interest is work and what I want is to count the transition of working status across years. For example, year 1 has five working people and that is eight in year 2, so I want to know how many working people in year 1 who stop working in year 2 and how many people who do not work in year 1 but start working in year 2. I want to do similar things for year2-year3 and year3-year4. My understanding is that my data has a total of 26 working people in all years, so the working status transition across years should match the total of working people. Any help is much appreciated. Thank you.

    Data
    Code:
    clear
    input int id float(year work)
      1 3 1
      1 5 1
      1 7 0
      5 1 1
      5 3 0
      5 5 1
      5 7 0
     15 1 1
     15 3 0
     15 5 0
     15 7 1
     16 1 0
     16 3 1
     16 5 0
     16 7 1
     19 1 0
     19 3 1
     19 5 0
     19 7 0
     22 1 0
     22 3 1
     22 5 0
     25 1 0
     25 5 1
     25 7 0
     27 1 0
     27 3 1
     27 5 1
     27 7 0
     36 1 0
     43 1 0
     43 3 0
     43 5 0
     43 7 1
     45 1 0
     45 3 1
     45 5 1
     45 7 0
     47 1 1
     47 3 1
     47 5 1
     47 7 1
     48 1 0
     48 3 1
     51 1 1
     51 3 0
     51 5 1
     51 7 1
     55 1 0
     55 3 0
     60 5 0
     60 7 0
     63 1 0
     63 3 0
     63 5 0
     63 7 0
     71 1 0
     71 3 0
     71 5 0
     71 7 0
     73 1 0
     73 3 0
     73 5 0
     79 1 0
     79 3 0
     79 5 0
     79 7 0
     83 1 0
     83 3 0
     83 5 0
     83 7 0
     89 1 0
     89 3 0
     89 5 0
     89 7 0
    107 1 0
    107 3 0
    107 5 0
    107 7 0
    108 1 0
    108 3 0
    108 5 0
    108 7 0
    109 1 0
    109 3 0
    111 1 0
    111 3 0
    112 1 0
    112 3 0
    112 5 0
    113 1 0
    113 3 0
    113 5 0
    113 7 0
    117 1 1
    117 3 0
    117 5 1
    117 7 0
    131 1 0
    131 3 0
    end

  • #2
    Code:
    reshape wide work, i(id) j(year)
    
    egen combo = concat(work1-work7)
    
    * I want to know how many working people in year 1 who stop working in year 2
    
    tab combo if substr(combo, 1, 2) == "10"
    
    * and how many people who do not work in year 1 but start working in year 2
    
    tab combo if substr(combo, 1, 2) == "01"
    Results:

    Code:
    . * I want to know how many working people in year 1 who stop working in year 2
    .
    . tab combo if substr(combo, 1, 2) == "10"
    
          combo |      Freq.     Percent        Cum.
    ------------+-----------------------------------
           1001 |          1       25.00       25.00
           1010 |          2       50.00       75.00
           1011 |          1       25.00      100.00
    ------------+-----------------------------------
          Total |          4      100.00
    
    .
    . * and how many people who do not work in year 1 but start working in year 2
    .
    . tab combo if substr(combo, 1, 2) == "01"
    
          combo |      Freq.     Percent        Cum.
    ------------+-----------------------------------
           01.. |          1       16.67       16.67
           010. |          1       16.67       33.33
           0100 |          1       16.67       50.00
           0101 |          1       16.67       66.67
           0110 |          2       33.33      100.00
    ------------+-----------------------------------
          Total |          6      100.00

    Comment


    • #3
      See also https://www.stata-journal.com/articl...article=dm0034 for @Ken Chui's technique.

      But you don't have to reshape. See also https://journals.sagepub.com/doi/10....36867X20909698
      Last edited by Nick Cox; 26 Oct 2022, 10:16.

      Comment


      • #4
        The following approach using the LONG file format gives a different result than Ken's solution for the number of people not working to working (0-1). I get 7 such individuals where Ken's output shows 6. I cannot immediately see the cause of the discrepancy. Maybe someone else can?

        Code:
        . * Number the records within id
        . bysort id: generate byte record=_n
        
        .
        . generate byte combo = work[_n-1]*10 + work if record==2
        (71 missing values generated)
        
        . label define combolabs 0 "0-0" 1 "0-1" 10 "1-0" 11 "1-1"
        
        . label values combo combolabs
        
        . *ssc install fre // Uncomment line if you need to install -fre-
        . fre combo
        
        combo
        ------------------------------------------------------------
                       |      Freq.    Percent      Valid       Cum.
        ---------------+--------------------------------------------
        Valid   0  0-0 |         16      16.00      55.17      55.17
                1  0-1 |          7       7.00      24.14      79.31
                10 1-0 |          4       4.00      13.79      93.10
                11 1-1 |          2       2.00       6.90     100.00
                Total  |         29      29.00     100.00           
        Missing .      |         71      71.00                      
        Total          |        100     100.00                      
        ------------------------------------------------------------
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 19.5 (Windows)

        Comment


        • #5
          I guess the answer to different results lies in incomplete panels. The second year of the panel in the example data is most often 3, but sometimes 5 or even 7 and for one panel not even defined.

          Code:
          clear
          input int id float(year work)
            1 3 1
            1 5 1
            1 7 0
            5 1 1
            5 3 0
            5 5 1
            5 7 0
           15 1 1
           15 3 0
           15 5 0
           15 7 1
           16 1 0
           16 3 1
           16 5 0
           16 7 1
           19 1 0
           19 3 1
           19 5 0
           19 7 0
           22 1 0
           22 3 1
           22 5 0
           25 1 0
           25 5 1
           25 7 0
           27 1 0
           27 3 1
           27 5 1
           27 7 0
           36 1 0
           43 1 0
           43 3 0
           43 5 0
           43 7 1
           45 1 0
           45 3 1
           45 5 1
           45 7 0
           47 1 1
           47 3 1
           47 5 1
           47 7 1
           48 1 0
           48 3 1
           51 1 1
           51 3 0
           51 5 1
           51 7 1
           55 1 0
           55 3 0
           60 5 0
           60 7 0
           63 1 0
           63 3 0
           63 5 0
           63 7 0
           71 1 0
           71 3 0
           71 5 0
           71 7 0
           73 1 0
           73 3 0
           73 5 0
           79 1 0
           79 3 0
           79 5 0
           79 7 0
           83 1 0
           83 3 0
           83 5 0
           83 7 0
           89 1 0
           89 3 0
           89 5 0
           89 7 0
          107 1 0
          107 3 0
          107 5 0
          107 7 0
          108 1 0
          108 3 0
          108 5 0
          108 7 0
          109 1 0
          109 3 0
          111 1 0
          111 3 0
          112 1 0
          112 3 0
          112 5 0
          113 1 0
          113 3 0
          113 5 0
          113 7 0
          117 1 1
          117 3 0
          117 5 1
          117 7 0
          131 1 0
          131 3 0
          end
          
          bysort id (year): gen WORK = strofreal(work) if _n == 1 
          by id: replace WORK = WORK[_n-1] + strofreal(work) if missing(WORK)
          by id: replace WORK = WORK[_N]
          
          by id : gen YEAR = strofreal(year) if _n == 1 
          by id: replace YEAR = YEAR[_n-1] + strofreal(year) if missing(YEAR)
          by id: replace YEAR = YEAR[_N]
          
          egen tag = tag(id)
          
          * from Stata Journal 
          groups WORK YEAR if tag  
          
            +-------------------------------+
            | WORK   YEAR   Freq.   Percent |
            |-------------------------------|
            |    0      1       1      3.33 |
            |   00     13       4     13.33 |
            |   00     57       1      3.33 |
            |  000    135       2      6.67 |
            | 0000   1357       8     26.67 |
            |-------------------------------|
            | 0001   1357       1      3.33 |
            |   01     13       1      3.33 |
            |  010    135       1      3.33 |
            |  010    157       1      3.33 |
            | 0100   1357       1      3.33 |
            |-------------------------------|
            | 0101   1357       1      3.33 |
            | 0110   1357       2      6.67 |
            | 1001   1357       1      3.33 |
            | 1010   1357       2      6.67 |
            | 1011   1357       1      3.33 |
            |-------------------------------|
            |  110    357       1      3.33 |
            | 1111   1357       1      3.33 |
            +-------------------------------+

          Comment


          • #6
            Many thanks Ken, Nick and Bruce. Your solutions are extremely helpful to me. I have an additional question and let's take code from #4 as an example. From the results, I see that 4 people in year1 moved from work --> non-work in year3; 7 people in year1 moved from non-work--> work in year3. So the working transition in year1 and year3 can be summarized as: 5 - 4 + 7 = 8. This results match the information I mentioned in post #1 (5 working people in year1 and 8 working people in year3). However, I find inconsistent results for year3-year5 and year5-year7. I did the following code for year3-year5 and year5-year7
            Code:
            * Year3 - year5
            generate byte combo35 = work[_n-1]*10 + work if record==3
            label values combo35 combolabs
            fre combo35
            
            combo35
            ------------------------------------------------------------
                           |      Freq.    Percent      Valid       Cum.
            ---------------+--------------------------------------------
            Valid   0  0-0 |         12      12.00      52.17      52.17
                    1  0-1 |          3       3.00      13.04      65.22
                    10 1-0 |          5       5.00      21.74      86.96
                    11 1-1 |          3       3.00      13.04     100.00
                    Total  |         23      23.00     100.00          
            Missing .      |         77      77.00                      
            Total          |        100     100.00                      
            ------------------------------------------------------------
            
            
            * Year5 - year7
            generate byte combo57 = work[_n-1]*10 + work if record==4
            label values combo57 combolabs
            fre combo57
            
            combo57
            ------------------------------------------------------------
                           |      Freq.    Percent      Valid       Cum.
            ---------------+--------------------------------------------
            Valid   0  0-0 |          9       9.00      50.00      50.00
                    1  0-1 |          3       3.00      16.67      66.67
                    10 1-0 |          4       4.00      22.22      88.89
                    11 1-1 |          2       2.00      11.11     100.00
                    Total  |         18      18.00     100.00          
            Missing .      |         82      82.00                      
            Total          |        100     100.00                      
            ------------------------------------------------------------
            
            * Summary
            . tab work year
            
                       |                    year
                  work |         1          3          5          7 |     Total
            -----------+--------------------------------------------+----------
                     0 |        23         19         16         16 |        74
                     1 |         5          8          8          5 |        26
            -----------+--------------------------------------------+----------
                 Total |        28         27         24         21 |       100
            You can see that in combo35, the summary of transition could be: 8 - 5 + 3 = 6 and this result does match the total working people in year 5, which is 8 (please see the cross-tab between work and year). Similar inconsistence happens for year5-year7. Is there any advice for this issue? Thank you.
            Last edited by Matthew Williams; 26 Oct 2022, 20:19.

            Comment


            • #7
              I guess I would fillin id year and then let the code show up missings (absent data) whenever they occur.

              Comment

              Working...
              X