Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Panel data with multiple observations per time-point and company

    Hello everyone,

    I have a panel data set (shown below) with quarterly data for a number of companies, identified by id. Each firm at every time points has a number of owners, identified by owner, each holding a number of shares, variable: shares_held. In other words, I have multiple observations per firm-year.

    I try to make my dataset panel data in Stata and make date1 my time variable and id my group id:

    I am using xtset id date1

    However, because I have mutiple observations for each id and date1, I have many duplicates and get:

    repeated time values within panel

    Anybody has recommendations on how to solve this issue?

    Thank you in advance


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(date owner shares_held) str4 company_id float(share_price date1) long id
    20000630 110 13662 "A"     48.94 161   8
    20000930 110  5944 "A"     48.94 162   8
    20001231 110  5716 "A"     54.75 163   8
    20000930 110 10800 "AES"    68.5 162  70
    20001231 110 10800 "AES"   55.38 163  70
    20010331 110 10800 "AES"   49.96 164  70
    20010630 110 10800 "AES"   43.05 165  70
    19981231 110   300 "AGE"   37.25 155  83
    19990331 110 30011 "ALLY"    4.5 156 126
    19990630 110 30011 "ALLY"   3.75 157 126
    19981231 110  6200 "AOL"     156 155 162
    19990630 110 13300 "AOL"   110.5 157 162
    19990930 110 30000 "AOL"   75.88 158 162
    19991231 110 30000 "AOL"   75.88 159 162
    20000331 110  3600 "AOL"    67.5 160 162
    20000630 110  3600 "AOL"   53.75 161 162
    20010331 110 20400 "AOL"    39.9 164 162
    20010630 110 20400 "AOL"      53 165 162
    20010930 110 35500 "AOL"    33.1 166 162
    20011231 110 35500 "AOL"    32.1 167 162
    20020331 110 49825 "AOL"   23.65 168 162
    20020630 110 56325 "AOL"   14.71 169 162
    20020930 110 56625 "AOL"    11.7 170 162
    20021231 110 56625 "AOL"    13.1 171 162
    20030331 110 56625 "AOL"   10.86 172 162
    20000331 110  3050 "ARBA" 209.63 160 178
    20000630 110  6100 "ARBA" 143.27 161 178
    20000930 110 16000 "ARBA" 143.27 162 178
    20001231 110 10600 "ARBA"  53.63 163 178
    20010331 110 18800 "ARBA"   7.91 164 178
    19981231 110  1764 "ATI"   72.44 155 217
    19981231 110 33256 "BKB"   38.94 155 335
    19990331 110 33256 "BKB"   43.31 156 335
    19990630 110 33256 "BKB"   51.13 157 335
    19981231 110 46748 "BMY"  133.81 155 365
    19990331 110 93296 "BMY"   64.13 156 365
    19990630 110 93296 "BMY"   70.44 157 365
    19990930 110 93296 "BMY"   64.19 158 365
    19991231 110 91716 "BMY"   64.19 159 365
    20000331 110 91716 "BMY"      58 160 365
    20000630 110 91716 "BMY"   57.13 161 365
    20000930 110 92984 "BMY"   57.13 162 365
    20001231 110 91184 "BMY"   73.94 163 365
    20010331 110 91184 "BMY"    59.4 164 365
    20010630 110 91184 "BMY"    52.3 165 365
    20010930 110 91184 "BMY"   55.56 166 365
    20011231 110 91184 "BMY"      51 167 365
    20020331 110 91184 "BMY"   40.49 168 365
    20020630 110 91468 "BMY"    25.7 169 365
    20020930 110 91468 "BMY"    23.8 170 365
    20021231 110 90684 "BMY"   23.15 171 365
    20030331 110 90684 "BMY"   21.13 172 365
    20030630 110 90684 "BMY"   27.15 173 365
    20030930 110 90684 "BMY"   25.66 174 365
    20031231 110 90684 "BMY"    28.6 175 365
    20040331 110 90684 "BMY"   24.23 176 365
    20040630 110 92184 "BMY"    24.5 177 365
    20040930 110 92044 "BMY"   23.67 178 365
    20041231 110 91978 "BMY"   25.62 179 365
    20050331 110 90478 "BMY"   25.46 180 365
    20050630 110 90344 "BMY"   24.98 181 365
    20050930 110 84544 "BMY"   24.06 182 365
    20051231 110 84544 "BMY"   22.98 183 365
    20060331 110 84544 "BMY"   24.61 184 365
    20060630 110 84544 "BMY"   25.86 185 365
    20060930 110 84544 "BMY"   24.92 186 365
    20061231 110 84544 "BMY"   26.32 187 365
    20070331 110 84544 "BMY"   27.76 188 365
    20070630 110 84544 "BMY"   31.56 189 365
    20070930 110 84544 "BMY"   28.82 190 365
    20071231 110 84544 "BMY"   26.52 191 365
    20080331 110 84544 "BMY"    21.3 192 365
    20080630 110 84544 "BMY"   20.53 193 365
    20080930 110 84544 "BMY"   20.85 194 365
    20081231 110 84544 "BMY"   23.25 195 365
    20090331 110 84544 "BMY"   21.92 196 365
    20090630 110 84544 "BMY"   20.31 197 365
    20090930 110 84544 "BMY"   22.52 198 365
    20091231 110 84544 "BMY"   25.25 199 365
    20100331 110 83544 "BMY"    26.7 200 365
    20100630 110 83544 "BMY"   24.94 201 365
    20100930 110 83544 "BMY"   27.11 202 365
    20101231 110 83544 "BMY"   26.48 203 365
    20110331 110 83544 "BMY"   26.43 204 365
    20110630 110 83544 "BMY"   28.96 205 365
    20110930 110 83544 "BMY"   31.38 206 365
    20111231 110 83544 "BMY"   35.24 207 365
    20120331 110 83544 "BMY"   33.75 208 365
    20120630 110 82844 "BMY"   35.95 209 365
    20120930 110 82844 "BMY"   33.75 210 365
    20121231 110 82844 "BMY"   32.59 211 365
    19981231 110  5400 "BNI"   34.25 155 368
    20040930 110  5550 "BNI"   38.31 178 368
    20041231 110  5550 "BNI"   47.31 179 368
    20050331 110  5550 "BNI"   53.93 180 368
    20050630 110  5550 "BNI"   47.08 181 368
    20050930 110  5550 "BNI"    59.8 182 368
    20051231 110  5550 "BNI"   70.82 183 368
    20060331 110  5550 "BNI"   83.33 184 368
    20060630 110  5550 "BNI"   79.25 185 368
    end
    format %tq date1
    label values id id
    label def id 8 "A", modify
    label def id 70 "AES", modify
    label def id 83 "AGE", modify
    label def id 126 "ALLY", modify
    label def id 162 "AOL", modify
    label def id 178 "ARBA", modify
    label def id 217 "ATI", modify
    label def id 335 "BKB", modify
    label def id 365 "BMY", modify
    label def id 368 "BNI", modify
    Last edited by Ralph Klijn; 11 Apr 2018, 03:24.

  • #2
    I found it somewhere else:

    I am using egen panelid = group (id owner), label

    And next i use xtset panelid date1

    And I get quarterly panel data

    Thanks

    Comment


    • #3
      Does anybody know how I am able to generate a lagged variable from this data set?

      I seems I am unable to do it when I sort using xtset panelid_1 date1

      Thanks in advance

      Comment


      • #4
        I suspect that you can generate such a variable. It's just that the results are useless.

        Your "dates" are not fit for purpose. For example, as you go from 20000630 to 20000701 there is a gap of 71, not 1, and as you go from 20001231 to 20010101 there is a gap of 770. You, a human being, can look at these "dates" and make sense of them, but Stata just sees integers.

        It's worse than that. You evidently have quarterly dates, which just happen to be given as the last day of each quarter.

        This is fixable. One fix is to install numdate from SSC and use two of its commands.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input long date
        19981231
        19990331
        19990630
        19990930
        19991231
        20000331
        20000630
        20000930
        20001231
        20010331
        20010630
        20010930
        20011231
        20020331
        20020630
        20020930
        20021231
        20030331
        20030630
        20030930
        20031231
        20040331
        20040630
        20040930
        20041231
        20050331
        20050630
        20050930
        20051231
        20060331
        20060630
        20060930
        20061231
        20070331
        20070630
        20070930
        20071231
        20080331
        20080630
        20080930
        20081231
        20090331
        20090630
        20090930
        20091231
        20100331
        20100630
        20100930
        20101231
        20110331
        20110630
        20110930
        20111231
        20120331
        20120630
        20120930
        20121231
        end
        .
        . numdate daily ddate = date, pattern(YMD)
        note: date is numeric; converting contents to string
        
        . convdate quarterly qdate = ddate
        
        .
        . list
        
             +-------------------------------+
             |     date       ddate    qdate |
             |-------------------------------|
          1. | 19981231   31dec1998   1998q4 |
          2. | 19990331   31mar1999   1999q1 |
          3. | 19990630   30jun1999   1999q2 |
          4. | 19990930   30sep1999   1999q3 |
          5. | 19991231   31dec1999   1999q4 |
             |-------------------------------|
          6. | 20000331   31mar2000   2000q1 |
          7. | 20000630   30jun2000   2000q2 |
          8. | 20000930   30sep2000   2000q3 |
          9. | 20001231   31dec2000   2000q4 |
         10. | 20010331   31mar2001   2001q1 |
             |-------------------------------|
         11. | 20010630   30jun2001   2001q2 |
         12. | 20010930   30sep2001   2001q3 |
         13. | 20011231   31dec2001   2001q4 |
         14. | 20020331   31mar2002   2002q1 |
         15. | 20020630   30jun2002   2002q2 |
             |-------------------------------|
         16. | 20020930   30sep2002   2002q3 |
         17. | 20021231   31dec2002   2002q4 |
         18. | 20030331   31mar2003   2003q1 |
         19. | 20030630   30jun2003   2003q2 |
         20. | 20030930   30sep2003   2003q3 |
             |-------------------------------|
         21. | 20031231   31dec2003   2003q4 |
         22. | 20040331   31mar2004   2004q1 |
         23. | 20040630   30jun2004   2004q2 |
         24. | 20040930   30sep2004   2004q3 |
         25. | 20041231   31dec2004   2004q4 |
             |-------------------------------|
         26. | 20050331   31mar2005   2005q1 |
         27. | 20050630   30jun2005   2005q2 |
         28. | 20050930   30sep2005   2005q3 |
         29. | 20051231   31dec2005   2005q4 |
         30. | 20060331   31mar2006   2006q1 |
             |-------------------------------|
         31. | 20060630   30jun2006   2006q2 |
         32. | 20060930   30sep2006   2006q3 |
         33. | 20061231   31dec2006   2006q4 |
         34. | 20070331   31mar2007   2007q1 |
         35. | 20070630   30jun2007   2007q2 |
             |-------------------------------|
         36. | 20070930   30sep2007   2007q3 |
         37. | 20071231   31dec2007   2007q4 |
         38. | 20080331   31mar2008   2008q1 |
         39. | 20080630   30jun2008   2008q2 |
         40. | 20080930   30sep2008   2008q3 |
             |-------------------------------|
         41. | 20081231   31dec2008   2008q4 |
         42. | 20090331   31mar2009   2009q1 |
         43. | 20090630   30jun2009   2009q2 |
         44. | 20090930   30sep2009   2009q3 |
         45. | 20091231   31dec2009   2009q4 |
             |-------------------------------|
         46. | 20100331   31mar2010   2010q1 |
         47. | 20100630   30jun2010   2010q2 |
         48. | 20100930   30sep2010   2010q3 |
         49. | 20101231   31dec2010   2010q4 |
         50. | 20110331   31mar2011   2011q1 |
             |-------------------------------|
         51. | 20110630   30jun2011   2011q2 |
         52. | 20110930   30sep2011   2011q3 |
         53. | 20111231   31dec2011   2011q4 |
         54. | 20120331   31mar2012   2012q1 |
         55. | 20120630   30jun2012   2012q2 |
             |-------------------------------|
         56. | 20120930   30sep2012   2012q3 |
         57. | 20121231   31dec2012   2012q4 |
             +-------------------------------+
        See https://www.statalist.org/forums/for...date-variables for more on numdate and how to install it.

        As I wrote just yesterday in another thread:

        If you have dates, you need to keep thinking of how Stata handles them. No other way to do it!
        You need now to xtset in terms of quarterly dates.
        Last edited by Nick Cox; 11 Apr 2018, 06:26.

        Comment


        • #5
          Thank you for your reply. Now I am usuing quarterly data (date1), I still get a lot of missing value when creating the lagged variable. Do you know how I overcome this issue? Thank you.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input long(date owner shares_held) str4 company_id float(share_price date1) long id float(panelid_1 lag_share_price)
          20000630 110   13662 "A" 48.94 161 1 1     .
          20000930 110    5944 "A" 48.94 162 1 1 48.94
          20001231 110    5716 "A" 54.75 163 1 1 48.94
          20001231 115   75000 "A" 54.75 163 1 2     .
          20000630 120  122242 "A" 48.94 161 1 3     .
          20000930 120  122842 "A" 48.94 162 1 3 48.94
          20001231 120  126542 "A" 54.75 163 1 3 48.94
          20010331 120  128442 "A" 30.73 164 1 3 54.75
          20010630 120  128342 "A"  32.5 165 1 3 30.73
          20010930 120  128142 "A" 19.55 166 1 3  32.5
          20011231 120  128142 "A" 28.51 167 1 3 19.55
          20020630 120   15877 "A" 23.65 169 1 3     .
          20020930 120   16677 "A" 13.06 170 1 3 23.65
          20021231 120   89677 "A" 17.96 171 1 3 13.06
          20030331 120  155877 "A" 13.15 172 1 3 17.96
          20030630 120  153851 "A" 19.55 173 1 3 13.15
          20030930 120   74651 "A" 22.11 174 1 3 19.55
          20031231 120   76751 "A" 29.24 175 1 3 22.11
          20040331 120   87151 "A" 31.63 176 1 3 29.24
          20040630 120  107101 "A" 29.28 177 1 3 31.63
          20040930 120  133151 "A" 21.57 178 1 3 29.28
          20041231 120   32651 "A"  24.1 179 1 3 21.57
          20050331 120   22451 "A"  22.2 180 1 3  24.1
          20050630 120   22651 "A" 23.02 181 1 3  22.2
          20050930 120   16351 "A" 32.75 182 1 3 23.02
          20051231 120   41151 "A" 33.29 183 1 3 32.75
          20060331 120  232951 "A" 37.55 184 1 3 33.29
          20060630 120  232951 "A" 31.56 185 1 3 37.55
          20060930 120  232951 "A" 32.69 186 1 3 31.56
          20061231 120  232951 "A" 34.85 187 1 3 32.69
          20010630 135    8500 "A"  32.5 165 1 4     .
          20010930 135   12400 "A" 19.55 166 1 4  32.5
          20011231 135   12400 "A" 28.51 167 1 4 19.55
          20040331 135  409100 "A" 31.63 176 1 4     .
          20040930 135   30145 "A" 21.57 178 1 4     .
          20050930 135  100800 "A" 32.75 182 1 4     .
          19991231 140  546600 "A" 77.31 159 1 5     .
          20000331 140  990500 "A"   104 160 1 5 77.31
          20000630 140 1138500 "A" 48.94 161 1 5   104
          20001231 140 4814900 "A" 54.75 163 1 5     .
          20010630 140    7000 "A"  32.5 165 1 5     .
          20020331 140 1888200 "A" 34.96 168 1 5     .
          20020630 140 2686500 "A" 23.65 169 1 5 34.96
          20020930 140   43800 "A" 13.06 170 1 5 23.65
          20021231 140   43800 "A" 17.96 171 1 5 13.06
          20030930 140 2873900 "A" 22.11 174 1 5     .
          20031231 140 5513300 "A" 29.24 175 1 5 22.11
          20040331 140 6565400 "A" 31.63 176 1 5 29.24
          20040630 140 7435022 "A" 29.28 177 1 5 31.63
          20040930 140 9173269 "A" 21.57 178 1 5 29.28
          20041231 140 9173269 "A"  24.1 179 1 5 21.57
          20050331 140 9173269 "A"  22.2 180 1 5  24.1
          20050630 140 9173269 "A" 23.02 181 1 5  22.2
          20050930 155  150400 "A" 32.75 182 1 6     .
          20070331 155  116218 "A" 33.69 188 1 6     .
          20110331 155    1507 "A" 44.78 204 1 6     .
          19991231 165  429500 "A" 77.31 159 1 7     .
          20000331 165  429500 "A"   104 160 1 7 77.31
          20000930 165  477360 "A" 48.94 162 1 7     .
          20000630 185    6163 "A" 48.94 161 1 8     .
          20001231 185   24603 "A" 54.75 163 1 8     .
          20010331 185   22296 "A" 30.73 164 1 8 54.75
          20010630 185  164013 "A"  32.5 165 1 8 30.73
          20010930 185  321279 "A" 19.55 166 1 8  32.5
          20011231 185  158996 "A" 28.51 167 1 8 19.55
          20020331 185  329154 "A" 34.96 168 1 8 28.51
          20020630 185  347614 "A" 23.65 169 1 8 34.96
          20020930 185  347614 "A" 13.06 170 1 8 23.65
          20021231 185  359277 "A" 17.96 171 1 8 13.06
          20030331 185  358763 "A" 13.15 172 1 8 17.96
          20030630 185  370227 "A" 19.55 173 1 8 13.15
          20030930 185  372689 "A" 22.11 174 1 8 19.55
          20031231 185  383589 "A" 29.24 175 1 8 22.11
          20040331 185  378299 "A" 31.63 176 1 8 29.24
          20040630 185  386060 "A" 29.28 177 1 8 31.63
          20040930 185  385582 "A" 21.57 178 1 8 29.28
          20041231 185  384867 "A"  24.1 179 1 8 21.57
          20050331 185  348142 "A"  22.2 180 1 8  24.1
          20050630 185  353284 "A" 23.02 181 1 8  22.2
          20050930 185  430634 "A" 32.75 182 1 8 23.02
          20051231 185  367191 "A" 33.29 183 1 8 32.75
          20060331 185  563745 "A" 37.55 184 1 8 33.29
          20060630 185  380299 "A" 31.56 185 1 8 37.55
          20060930 185  365574 "A" 32.69 186 1 8 31.56
          20061231 185  366674 "A" 34.85 187 1 8 32.69
          20070331 185  360474 "A" 33.69 188 1 8 34.85
          20070630 185  356474 "A" 38.44 189 1 8 33.69
          20070930 185  352274 "A" 36.88 190 1 8 38.44
          20071231 185  357474 "A" 36.74 191 1 8 36.88
          20080331 185  344349 "A" 29.83 192 1 8 36.74
          20080630 185  339604 "A" 35.54 193 1 8 29.83
          20080930 185  340249 "A" 29.66 194 1 8 35.54
          20081231 185  343970 "A" 15.63 195 1 8 29.66
          20090331 185  350308 "A" 15.37 196 1 8 15.63
          20090630 185  338539 "A" 20.31 197 1 8 15.37
          20090930 185  330016 "A" 27.83 198 1 8 20.31
          20091231 185  328071 "A" 31.07 199 1 8 27.83
          20100331 185  331550 "A" 34.39 200 1 8 31.07
          20100630 185  332620 "A" 28.43 201 1 8 34.39
          20100930 185  332008 "A" 33.37 202 1 8 28.43
          end
          format %tq date1
          label values id id
          label def id 1 "A", modify

          Comment


          • #6
            I can't see that you have studied my post carefully.

            You will not get valid quarterly dates by just applying a different display format.

            You must convert your numbers that purport to be dates to quarterly dates that Stata understands. There are various ways to do it: I have given you one.

            The method in #5 is nonsense. For example, 20100930 treated as a quarterly date is a quarter about 5 million years into the future and your quarterly dates still have enormous gaps between them and are irregularly spaced.

            Comment


            • #7
              I'm sorry, it's very weird. I have used earlier:

              Code:
              gen date2=date(string(date, "%8.0f"), "YMD")
              gen date1=qofd(date2)
              format date1 %tq
              And now when I view my data in Stata the values for date1 are displayed as: 2001q1, 2002q2, etc. However, above they are shown as numbers: i.e. 161, 162, etc.

              I don't know what went wrong..

              Comment


              • #8
                I see what it is going on, and -- sorry, my bad mistake -- I misread #5 as if you were trying to format date with format %tq. But you weren't. You were applying that to date1.

                Your date1 is fine as a quarterly date. Unformatted you have numbers like 160 -- which is 160 quarters or 40 years after the first quarter of 1960 which is evident when displayed with an appropriate format:

                Code:
                . di %tq 160
                2000q1
                The only way to understand Stata dates is to read

                Code:
                help datetime
                every time you don't understand something. To start with,

                1. A date in Stata is always a number.

                2. A date in Stata makes sense to people only when displayed with a pertinent date display format.

                However, #2 is not always the case as years such as 2000 are numbers any way and don't need a specific date display format

                Comment

                Working...
                X