  • Problems with an unbalanced panel

    While declaring my panel and time variables in order to use the synthetic control method (synth), my panel shows up as unbalanced, but I do not know how or where. I have expanded a dataset by adding years and their quarters, but to my knowledge each naic (panel id) has data for the same years and quarters. For the SCM to work the panel needs to be balanced, yet when I try to xtset it I get this:
    xtset naic ts
    panel variable: naic (unbalanced)
    time variable: ts, 1997q1 to 2012q3, but with a gap
    delta: 1 quarter

    Is there any way for me to find out how the panel is unbalanced and how to fix this?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double naic float(year yq) double(ts l)
    337121 1997  971 148  95.57533264160156
    337121 1997  972 149  96.86366271972656
    337121 1997  973 150  96.51200103759766
    337121 1997  974 151   97.7453384399414
    337121 1998  981 152  97.62833404541016
    337121 1998  982 153  97.94366455078125
    337121 1998  983 154  97.80799865722656
    337121 1998  984 155  99.89533233642578
    337121 1999  991 156 100.65899658203125
    337121 1999  992 157 102.07233428955078
    337121 1999  993 158 102.20899963378906
    337121 1999  994 159 103.28866577148438
    337121 2000    1 160 103.92900085449219
    337121 2000    2 161 104.42400360107422
    337121 2000    3 162 103.45933532714844
    337121 2000    4 163 101.83000183105469
    337121 2001  101 164  98.24366760253906
    337121 2001  102 165  95.09600067138672
    337121 2001  103 166  91.13400268554688
    337121 2001  104 167  89.53900146484375
    337121 2002  201 168  91.61033630371094
    337121 2002  202 169   92.4786605834961
    337121 2002  203 170  90.79966735839844
    337121 2002  204 171  89.50199890136719
    337121 2003  301 172  87.90033721923828
    337121 2003  302 173  86.07633972167969
    337121 2003  303 174  84.18333435058594
    337121 2003  304 175  85.17766571044922
    337121 2004  401 176  87.84566497802734
    337121 2004  402 177  88.04000091552734
    337121 2004  403 178  86.95866394042969
    337121 2004  404 179   87.6709976196289
    337121 2005  501 180   87.3776626586914
    337121 2005  502 181  86.74933624267578
    337121 2005  503 182  84.38966369628906
    337121 2005  504 183  83.93866729736328
    337121 2006  601 184  83.03633880615234
    337121 2006  602 185  81.24566650390625
    337121 2006  603 186  78.73533630371094
    337121 2006  604 187  77.35800170898438
    337121 2007  701 188  76.13633728027344
    337121 2007  702 189  74.23566436767578
    337121 2007  703 190   71.7699966430664
    337121 2007  704 191  70.56500244140625
    337121 2008  801 192  70.28633880615234
    337121 2008  802 193   67.6163330078125
    337121 2008  803 194  64.13133239746094
    337121 2008  804 195  60.36433029174805
    337121 2009  901 196  55.15999984741211
    337121 2009  902 197 52.931331634521484
    337121 2009  903 198 51.308998107910156
    337121 2009  904 199  51.02899932861328
    337121 2010 1001 200   51.3466682434082
    337121 2010 1002 201  52.03066635131836
    337121 2010 1003 202  51.62099838256836
    337121 2010 1004 203   50.3503303527832
    337121 2011 1101 204  50.68266677856445
    337121 2011 1102 205  50.77033233642578
    337121 2011 1103 206  50.39666748046875
    337121 2011 1104 207 50.749000549316406
    337121 2012 1201 208  52.44300079345703
    337121 2012 1202 209 52.964332580566406
    337121 2012 1203 210   52.4119987487793
    337211 1997  971 148 31.117666244506836
    337211 1997  972 149 31.645666122436523
    337211 1997  973 150 32.244998931884766
    337211 1997  974 151  32.36433410644531
    337211 1998  981 152  33.00899887084961
    337211 1998  982 153  33.72066879272461
    337211 1998  983 154   34.0706672668457
    337211 1998  984 155 34.548667907714844
    337211 1999  991 156  34.12300109863281
    337211 1999  992 157 33.927669525146484
    337211 1999  993 158  33.86466979980469
    337211 1999  994 159   34.3390007019043
    337211 2000    1 160 34.244998931884766
    337211 2000    2 161 34.870330810546875
    337211 2000    3 162  35.48733139038086
    337211 2000    4 163 35.792667388916016
    337211 2001  101 164 35.465999603271484
    337211 2001  102 165 34.478668212890625
    337211 2001  103 166  33.31700134277344
    337211 2001  104 167 31.731000900268555
    337211 2002  201 168 30.307334899902344
    337211 2002  202 169  29.11400032043457
    337211 2002  203 170  28.66900062561035
    337211 2002  204 171 28.119665145874023
    337211 2003  301 172   26.6113338470459
    337211 2003  302 173 25.580665588378906
    337211 2003  303 174 25.306333541870117
    337211 2003  304 175 25.185333251953125
    337211 2004  401 176 24.277334213256836
    337211 2004  402 177 23.963666915893555
    337211 2004  403 178  23.93199920654297
    337211 2004  404 179 23.805999755859375
    337211 2005  501 180 23.669334411621094
    337211 2005  502 181  23.89466667175293
    337211 2005  503 182  24.40566635131836
    337211 2005  504 183 24.472999572753906
    337211 2006  601 184 24.952665328979492
    end
    format %tq ts

  • #2
    Gail:
    Quoting an excerpt of the Technical note reported under the -xt- entry, Stata .pdf manual, page 483, we can read:
    "If a dataset does not contain a time variable, then panels are considered balanced if each panel contains the same number of observations; otherwise, the panels are unbalanced. When the dataset contains a time variable, panels are said to be strongly balanced if each panel contains the same time points, weakly balanced if each panel contains the same number of observations but not the same time points, and unbalanced otherwise."
    In your case, your panels account for a different number of observations and different points in time:
    Code:
    . tab naic
    
           naic |      Freq.     Percent        Cum.
    ------------+-----------------------------------
         337121 |         63       63.00       63.00
         337211 |         37       37.00      100.00
    ------------+-----------------------------------
          Total |        100      100.00
    . bysort naic: tab ts

    ------------------------------------------------------------------------------------------------------------------------
    -> naic = 337121

             ts |      Freq.     Percent        Cum.
    ------------+-----------------------------------
         1997q1 |          1        1.59        1.59
         1997q2 |          1        1.59        3.17
         1997q3 |          1        1.59        4.76
         1997q4 |          1        1.59        6.35
         1998q1 |          1        1.59        7.94
         1998q2 |          1        1.59        9.52
         1998q3 |          1        1.59       11.11
         1998q4 |          1        1.59       12.70
         1999q1 |          1        1.59       14.29
         1999q2 |          1        1.59       15.87
         1999q3 |          1        1.59       17.46
         1999q4 |          1        1.59       19.05
         2000q1 |          1        1.59       20.63
         2000q2 |          1        1.59       22.22
         2000q3 |          1        1.59       23.81
         2000q4 |          1        1.59       25.40
         2001q1 |          1        1.59       26.98
         2001q2 |          1        1.59       28.57
         2001q3 |          1        1.59       30.16
         2001q4 |          1        1.59       31.75
         2002q1 |          1        1.59       33.33
         2002q2 |          1        1.59       34.92
         2002q3 |          1        1.59       36.51
         2002q4 |          1        1.59       38.10
         2003q1 |          1        1.59       39.68
         2003q2 |          1        1.59       41.27
         2003q3 |          1        1.59       42.86
         2003q4 |          1        1.59       44.44
         2004q1 |          1        1.59       46.03
         2004q2 |          1        1.59       47.62
         2004q3 |          1        1.59       49.21
         2004q4 |          1        1.59       50.79
         2005q1 |          1        1.59       52.38
         2005q2 |          1        1.59       53.97
         2005q3 |          1        1.59       55.56
         2005q4 |          1        1.59       57.14
         2006q1 |          1        1.59       58.73
         2006q2 |          1        1.59       60.32
         2006q3 |          1        1.59       61.90
         2006q4 |          1        1.59       63.49
         2007q1 |          1        1.59       65.08
         2007q2 |          1        1.59       66.67
         2007q3 |          1        1.59       68.25
         2007q4 |          1        1.59       69.84
         2008q1 |          1        1.59       71.43
         2008q2 |          1        1.59       73.02
         2008q3 |          1        1.59       74.60
         2008q4 |          1        1.59       76.19
         2009q1 |          1        1.59       77.78
         2009q2 |          1        1.59       79.37
         2009q3 |          1        1.59       80.95
         2009q4 |          1        1.59       82.54
         2010q1 |          1        1.59       84.13
         2010q2 |          1        1.59       85.71
         2010q3 |          1        1.59       87.30
         2010q4 |          1        1.59       88.89
         2011q1 |          1        1.59       90.48
         2011q2 |          1        1.59       92.06
         2011q3 |          1        1.59       93.65
         2011q4 |          1        1.59       95.24
         2012q1 |          1        1.59       96.83
         2012q2 |          1        1.59       98.41
         2012q3 |          1        1.59      100.00
    ------------+-----------------------------------
          Total |         63      100.00

    ------------------------------------------------------------------------------------------------------------------------
    -> naic = 337211

             ts |      Freq.     Percent        Cum.
    ------------+-----------------------------------
         1997q1 |          1        2.70        2.70
         1997q2 |          1        2.70        5.41
         1997q3 |          1        2.70        8.11
         1997q4 |          1        2.70       10.81
         1998q1 |          1        2.70       13.51
         1998q2 |          1        2.70       16.22
         1998q3 |          1        2.70       18.92
         1998q4 |          1        2.70       21.62
         1999q1 |          1        2.70       24.32
         1999q2 |          1        2.70       27.03
         1999q3 |          1        2.70       29.73
         1999q4 |          1        2.70       32.43
         2000q1 |          1        2.70       35.14
         2000q2 |          1        2.70       37.84
         2000q3 |          1        2.70       40.54
         2000q4 |          1        2.70       43.24
         2001q1 |          1        2.70       45.95
         2001q2 |          1        2.70       48.65
         2001q3 |          1        2.70       51.35
         2001q4 |          1        2.70       54.05
         2002q1 |          1        2.70       56.76
         2002q2 |          1        2.70       59.46
         2002q3 |          1        2.70       62.16
         2002q4 |          1        2.70       64.86
         2003q1 |          1        2.70       67.57
         2003q2 |          1        2.70       70.27
         2003q3 |          1        2.70       72.97
         2003q4 |          1        2.70       75.68
         2004q1 |          1        2.70       78.38
         2004q2 |          1        2.70       81.08
         2004q3 |          1        2.70       83.78
         2004q4 |          1        2.70       86.49
         2005q1 |          1        2.70       89.19
         2005q2 |          1        2.70       91.89
         2005q3 |          1        2.70       94.59
         2005q4 |          1        2.70       97.30
         2006q1 |          1        2.70      100.00
    ------------+-----------------------------------
          Total |         37      100.00
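
    A more compact check than tabulating every quarter, sketched here under the assumption that the data are in memory as posted in #1: xtdescribe summarizes the span and participation pattern of each panel, and a per-panel count shows at a glance which panels are short. The variable names nobs and tag are hypothetical, chosen only for this illustration.

    ```stata
    * declare the panel structure, as in #1
    xtset naic ts
    * show the span of each panel and the patterns of observed periods
    xtdescribe
    * nobs = number of observations in each panel (hypothetical name)
    bysort naic: gen nobs = _N
    * tag one observation per panel so that each panel is counted once
    egen tag = tag(naic)
    tabulate nobs if tag
    * remove the helper variables again
    drop nobs tag
    ```

    In a strongly balanced panel, nobs takes a single value and xtdescribe reports a single pattern with no gaps.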
    Last edited by Carlo Lazzaro; 09 Mar 2019, 10:40.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Carlo,
      I think this is only the case because dataex cut off my data; my main dataset does contain the same points in time:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double naic float(year yq) double(ts l)
      337211 1997  971 148 31.117666244506836
      337211 1997  972 149 31.645666122436523
      337211 1997  973 150 32.244998931884766
      337211 1997  974 151  32.36433410644531
      337211 1998  981 152  33.00899887084961
      337211 1998  982 153  33.72066879272461
      337211 1998  983 154   34.0706672668457
      337211 1998  984 155 34.548667907714844
      337211 1999  991 156  34.12300109863281
      337211 1999  992 157 33.927669525146484
      337211 1999  993 158  33.86466979980469
      337211 1999  994 159   34.3390007019043
      337211 2000    1 160 34.244998931884766
      337211 2000    2 161 34.870330810546875
      337211 2000    3 162  35.48733139038086
      337211 2000    4 163 35.792667388916016
      337211 2001  101 164 35.465999603271484
      337211 2001  102 165 34.478668212890625
      337211 2001  103 166  33.31700134277344
      337211 2001  104 167 31.731000900268555
      337211 2002  201 168 30.307334899902344
      337211 2002  202 169  29.11400032043457
      337211 2002  203 170  28.66900062561035
      337211 2002  204 171 28.119665145874023
      337211 2003  301 172   26.6113338470459
      337211 2003  302 173 25.580665588378906
      337211 2003  303 174 25.306333541870117
      337211 2003  304 175 25.185333251953125
      337211 2004  401 176 24.277334213256836
      337211 2004  402 177 23.963666915893555
      337211 2004  403 178  23.93199920654297
      337211 2004  404 179 23.805999755859375
      337211 2005  501 180 23.669334411621094
      337211 2005  502 181  23.89466667175293
      337211 2005  503 182  24.40566635131836
      337211 2005  504 183 24.472999572753906
      337211 2006  601 184 24.952665328979492
      337211 2006  602 185 24.794334411621094
      337211 2006  603 186  24.88166618347168
      337211 2006  604 187 24.785999298095703
      337211 2007  701 188  24.41466522216797
      337211 2007  702 189 24.242334365844727
      337211 2007  703 190 24.422666549682617
      337211 2007  704 191 24.266000747680664
      337211 2008  801 192 24.023666381835938
      337211 2008  802 193 23.604333877563477
      337211 2008  803 194 23.258333206176758
      337211 2008  804 195 22.608333587646484
      337211 2009  901 196 20.220666885375977
      337211 2009  902 197   18.2983341217041
      337211 2009  903 198 17.699665069580078
      337211 2009  904 199 17.432666778564453
      337211 2010 1001 200 16.056333541870117
      337211 2010 1002 201  16.36566734313965
      337211 2010 1003 202 16.672000885009766
      337211 2010 1004 203  16.69300079345703
      337211 2011 1101 204 16.405000686645508
      337211 2011 1102 205 16.663000106811523
      337211 2011 1103 206 16.906999588012695
      337211 2011 1104 207 16.942665100097656
      337211 2012 1201 208  16.87933349609375
      337211 2012 1202 209 16.884000778198242
      337211 2012 1203 210 17.148000717163086
      end
      format %tq ts


      • #4
        So, ideally panels run from 1997q1 to 2012q3 and are 16 x 4 - 1 = 63 long. Hence you're looking for a panel shorter than that, most likely 62 long. One way to find that:

        Code:
        bysort naic : gen N = _N
        list if N < 63
        and optionally drop N later.
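
        Once the short panel is identified, the position of the gap can be located by comparing each quarter with the previous one within the same panel. This is only a sketch: the variable name gap is hypothetical, and whether padding the panel with tsfill is appropriate depends on how the resulting missing values are handled afterwards, since tsfill adds observations whose other variables are missing.

        ```stata
        xtset naic ts
        * gap = quarters elapsed since the previous observation in the same panel
        bysort naic (ts): gen gap = ts - ts[_n-1]
        * a value above 1 marks the first observation after a gap
        list naic ts gap if gap > 1 & !missing(gap)
        drop gap
        * optionally add the absent quarters as observations with missing values
        tsfill, full
        ```

        The !missing(gap) condition matters because Stata treats missing as larger than any number, so the first observation of every panel would otherwise be listed too.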


        • #5
          That solved the problem, thanks Nick.
