  • XTSET and TTSET Error Message

    Hello,

    I have a panel dataset that I believe is balanced, but when I run the xtset (or ttset) command to set the panel variable (region) and the time variable (year), I get a message that the panel has gaps. As far as I can tell from listing the id and time variables, there isn't a gap, so I'm not sure how to get past this message and run an ITSA analysis. Please advise; I've never had this happen before.
    Code:
    input long region int year
     1 2006
     1 2007
     1 2008
     1 2009
     1 2010
     1 2011
     1 2012
     1 2013
     1 2014
     1 2015
     1 2016
     2 2006
     2 2007
     2 2008
     2 2009
     2 2010
     2 2011
     2 2012
     2 2013
     2 2014
     2 2015
     2 2016
     3 2006
     3 2007
     3 2008
     3 2009
     3 2010
     3 2011
     3 2012
     3 2013
     3 2014
     3 2015
     3 2016
     4 2006
     4 2007
     4 2008
     4 2009
     4 2010
     4 2011
     4 2012
     4 2013
     4 2014
     4 2015
     4 2016
     5 2006
     5 2007
     5 2008
     5 2009
     5 2010
     5 2011
     5 2012
     5 2013
     5 2014
     5 2015
     5 2016
     7 2006
     7 2007
     7 2008
     7 2009
     7 2010
     7 2011
     7 2012
     7 2013
     7 2014
     7 2015
     7 2016
     8 2006
     8 2007
     8 2008
     8 2009
     8 2010
     8 2011
     8 2012
     8 2013
     8 2014
     8 2015
     8 2016
     9 2006
     9 2007
     9 2008
     9 2010
     9 2011
     9 2012
     9 2013
     9 2014
     9 2015
     9 2016
    10 2006
    10 2007
    10 2008
    10 2009
    10 2010
    10 2011
    10 2012
    10 2013
    10 2014
    10 2015
    10 2016
    11 2006
    11 2007
    end
    label values region region2
    label def region2 1 "Chubu", modify
    label def region2 2 "Chugoku", modify
    label def region2 3 "Fukushima", modify
    label def region2 4 "Hokkaido", modify
    label def region2 5 "Iwate", modify
    label def region2 7 "Kansai", modify
    label def region2 8 "Kyushu_Okinawa", modify
    label def region2 9 "Miyagi", modify
    label def region2 10 "Northern_Kanto_Koshin", modify
    label def region2 11 "Shikoku", modify
    Code:
    xtset region year
           panel variable:  region (unbalanced)
            time variable:  year, 2006 to 2016, but with a gap
                    delta:  1 unit

  • #2
    Look again at Miyagi.

    Code:
    . list if region == 9
    
         +---------------+
         | region   year |
         |---------------|
     78. | Miyagi   2006 |
     79. | Miyagi   2007 |
     80. | Miyagi   2008 |
     81. | Miyagi   2010 |
     82. | Miyagi   2011 |
         |---------------|
     83. | Miyagi   2012 |
     84. | Miyagi   2013 |
     85. | Miyagi   2014 |
     86. | Miyagi   2015 |
     87. | Miyagi   2016 |
         +---------------+
    Notes:
    1. I spotted this from scatter region year.
    2. I presume that you mean tsset, not ttset.

    • #3
      Thank you, Nick. Apparently my eyes missed it even after I reviewed the data several times. And yes, I meant tsset, not ttset.

      • #4
        Here are a few other ways of finding the gaps in panel data that is believed not to have any.
        Code:
        . xtset region year
               panel variable:  region (unbalanced)
                time variable:  year, 2006 to 2016, but with a gap
                        delta:  1 unit
        
        . xtdescribe
        
          region:  1, 2, ..., 11                                     n =         10
            year:  2006, 2007, ..., 2016                             T =         11
                   Delta(year) = 1 unit
                   Span(year)  = 11 periods
                   (region*year uniquely identifies each observation)
        
        Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                                 2       2      11        11        11      11      11
        
             Freq.  Percent    Cum. |  Pattern
         ---------------------------+-------------
                8     80.00   80.00 |  11111111111
                1     10.00   90.00 |  11.........
                1     10.00  100.00 |  111.1111111
         ---------------------------+-------------
               10    100.00         |  XXXXXXXXXXX
        
        . by region: generate count = _N
        
        . list if count<11, noobs sepby(region)
        
          +------------------------+
          |  region   year   count |
          |------------------------|
          |  Miyagi   2006      10 |
          |  Miyagi   2007      10 |
          |  Miyagi   2008      10 |
          |  Miyagi   2010      10 |
          |  Miyagi   2011      10 |
          |  Miyagi   2012      10 |
          |  Miyagi   2013      10 |
          |  Miyagi   2014      10 |
          |  Miyagi   2015      10 |
          |  Miyagi   2016      10 |
          |------------------------|
          | Shikoku   2006       2 |
          | Shikoku   2007       2 |
          +------------------------+
        
        . fillin region year
        
        . list region year if _fillin, noobs sepby(region)
        
          +----------------+
          |  region   year |
          |----------------|
          |  Miyagi   2009 |
          |----------------|
          | Shikoku   2008 |
          | Shikoku   2009 |
          | Shikoku   2010 |
          | Shikoku   2011 |
          | Shikoku   2012 |
          | Shikoku   2013 |
          | Shikoku   2014 |
          | Shikoku   2015 |
          | Shikoku   2016 |
          +----------------+

        • #5
          Thanks, William!

          • #6
            I have nothing to add to Nick Cox's and William Lisowski's lovely solutions here. But I'll just seize the moment to comment on an approach to programming.

            One of the things I like about Stata is that it takes a skeptical approach to data and frequently checks for things that could easily escape the user's attention. The messages that -xtset- and -tsset- issue about things like gaps in the data are an example of that. Where Stata doesn't do that for you, it is good programming practice to use -assert- (or, for some things, -confirm-) liberally in your programs to verify that your data really are what you think they should be, both when you first create your data sets and as you proceed through data management and analysis. It is fine to use -list- and -browse- to inspect your data along the way. They can assure you that your data are not obviously wrong. But the human eye is very fallible. And for production work, you need the higher level of assurance provided by -assert- that your data are obviously not wrong.
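
            As a minimal sketch of what such a check might look like for a panel that is supposed to have no gaps: the toy data below are made up for illustration (they are not the data from #1), and the local macro name gap_rc is likewise just an assumption of this example.

```stata
* Toy panel with made-up values: region 2 has no observation for 2008.
clear
input long region int year
1 2006
1 2007
1 2008
1 2009
2 2006
2 2007
2 2009
end

* Each region-year pair must identify exactly one observation.
isid region year

* Within each region, every year must follow the previous one by exactly 1.
* -capture noisily- keeps this demonstration running after the failure;
* in a real do-file, drop it so that a failed assertion halts execution.
capture noisily bysort region (year): assert year == year[_n-1] + 1 if _n > 1
local gap_rc = _rc
display "assert return code: `gap_rc'"
```

            Here -isid- passes, but the -assert- fails for region 2 because 2009 follows 2007, so the return code is nonzero. Without the -capture-, the do-file would stop at that line, which is usually what you want before any analysis runs.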

            • #7
              To recycle something Clyde wrote about a recent post of mine, my only regret is that I can upvote his post #6 only once.

              I cannot begin to count the number of times assert has called my attention to my mistaken assumptions. I work with complicated multi-year longitudinal data, and there are assumptions I've made about the consistency of data - you'd think this year's age wouldn't differ from last year's age by 5 years, but that sort of thing creeps in - that would have driven me crazy looking at my results had I not found them well before that stage. And there are assumptions about the life course in general that I find unduly limited by my personal perspective and history.
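
              A rough sketch of the kind of age check described above, on made-up data: the variable names id, age, and badage, and the tolerance of one year either way (to allow for interview dates falling before or after a birthday), are all assumptions of this example rather than anything from a real study.

```stata
* Toy longitudinal data with made-up values: person 1 "ages" 5 years
* between 2015 and 2016.
clear
input long id int year byte age
1 2014 30
1 2015 31
1 2016 36
2 2014 45
2 2015 46
2 2016 47
end

* Within person, the change in age should match the number of years
* elapsed, give or take 1. badage is 1 for violations, 0 otherwise,
* and missing for each person's first observation.
bysort id (year): generate byte badage = ///
    abs((age - age[_n-1]) - (year - year[_n-1])) > 1 if _n > 1

* Show the offending observations, then assert that there are none.
* In production, drop -capture noisily- so that the do-file halts here.
list id year age if badage == 1, noobs
capture noisily assert badage != 1
```

              Listing the violations before asserting is a design choice: the -assert- tells you that something is wrong, and the -list- tells you where to look.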

              Also, to be totally honest, there were even more foolish programming errors I'd made that were caught as well. But even if you're a perfect programmer, you can't be sure you have perfect knowledge of your data, and too few of us have enough empathy to infer possibilities we have no knowledge or experience of.

              If I had to give up Stata, it's the assert command I'd find hardest to do without.
