Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to extract individual unit sub-datasets from a Panel Dataset.

    Hi,

    I have a panel dataset of around 500 units observed 10 times each. That it to mean that while in long format, my data has 5000 observations (rows). I want to generate individual unit datasets from this panel i.e. to end up with 500 data files each for every unit. Look at this MWE data;

    Code:
    id    year    income    age      tax
    1     2000   3453        23      573
    1     2001   7476        24      453
    1     2002   7346        25      846
    2     2000   9898        32      234
    2     2001   6253        33      254
    2     2002   6272        34      745
    3     2000   9032        17      234
    3     2001   8829        18      746
    3     2002   2738        19      234
    I am unable to conceive a loop (for which I have been hoping it can work) which picks up data for unit 1 and saves it as say data1 and then does the same for unit 2 and saving it as data2 etc such that in the end, I have 3 dataset files in my working directory of the form;

    Data 1

    Code:
    id    year    income    age      tax
    1     2000   3453        23      573
    1     2001   7476        24      453
    1     2002   7346        25      846
    Data 2

    Code:
    id    year    income    age      tax
    2     2000   9898        32      234
    2     2001   6253        33      254
    2     2002   6272        34      745
    Data 3

    Code:
    id    year    income    age      tax
    3     2000   9032        17      234
    3     2001   8829        18      746
    3     2002   2738        19      234

    Could anyone please assist me?

    Thanks

  • #2
    This will do it

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id year income age tax)
    1 2000 3453 23 573
    1 2001 7476 24 453
    1 2002 7346 25 846
    2 2000 9898 32 234
    2 2001 6253 33 254
    2 2002 6272 34 745
    3 2000 9032 17 234
    3 2001 8829 18 746
    3 2002 2738 19 234
    end
    
    
    tempfile data
    save `data'
    forval i=1/3{
    use `data'
    keep if id==`i'
    save dataset`i'
    clear
    }

    Comment


    • #3
      Thank you Andrew,

      This works. However, what would I do if the lname does not follow a logical sequence? My group variable in the loop is the case unique identifier. This number is random and so including a range such as i = 1/3 in the forval loop generates empty dataset files for which the lname is not a valid unique identifier. Here is my exact code:

      Code:
      preserve 
       forval i = 1/1056{
               keep if pid == `i'
               save pid`i', replace
               restore, preserve 
       }

      Comment


      • #4
        You can generate an alternative identifier using egen(group) and use this for the loop. Here is an example

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float(id year)
          29 2013
          29 2014
          29 2015
          29 2016
         562 2013
         562 2014
         562 2015
         562 2016
        1477 2013
        1477 2014
        1477 2015
        1477 2016
        2192 2013
        2192 2014
        2192 2015
        2192 2016
        end
        egen id2= group(id)
        The result

        Code:
        . list, sepby(id)
        
             +-------------------+
             |   id   year   id2 |
             |-------------------|
          1. |   29   2013     1 |
          2. |   29   2014     1 |
          3. |   29   2015     1 |
          4. |   29   2016     1 |
             |-------------------|
          5. |  562   2013     2 |
          6. |  562   2014     2 |
          7. |  562   2015     2 |
          8. |  562   2016     2 |
             |-------------------|
          9. | 1477   2013     3 |
         10. | 1477   2014     3 |
         11. | 1477   2015     3 |
         12. | 1477   2016     3 |
             |-------------------|
         13. | 2192   2013     4 |
         14. | 2192   2014     4 |
         15. | 2192   2015     4 |
         16. | 2192   2016     4 |
             +-------------------+

        Comment


        • #5
          Thanks a lot Andrew Musau . This worked so well.

          Comment

          Working...
          X