  • tsfill does not work

    Hello members,

    After reshaping my dataset from wide to long, I am trying to fill in the years using tsfill the following way:

    reshape long year, i( RowId ) j(Year)
    tsset RowId year
    tsfill

    The tsfill command does not work (the gaps in the year variable are not filled). I tried tsfill with the full option as a test, and that works. However, I do not want a balanced panel.

    Could you please suggest what I might be doing wrong?

    Rgds,
    Archita

  • #2
    So, I assume you started out with a wide data set that had variables that looked something like this:

    Code:
        RowID   year2005   year2006   year2007   year2008   year2009   year2010 
            1         51         58         52         44         61         67 
            2         50         60         57         50         53         62 
            3         51         50         50         60         52         46 
            4         51         47         56         50         43         46 
            5         47         54         48         42         45         64
    Note: The values of the year2005-year2010 variables are just random numbers used to illustrate the problem here.

    Then you do a rather unusual reshape:

    Code:
    . reshape long year, i(RowID) j(Year)
    (note: j = 2005 2006 2007 2008 2009 2010)
    
    Data                               wide   ->   long
    -----------------------------------------------------------------------------
    Number of obs.                       25   ->     150
    Number of variables                   7   ->       3
    j variable (6 values)                     ->   Year
    xij variables:
             year2005 year2006 ... year2010   ->   year
    -----------------------------------------------------------------------------
    
    . list in 1/15, noobs clean
    
        RowID   Year   year 
            1   2005     51 
            1   2006     58 
            1   2007     52 
            1   2008     44 
            1   2009     61 
            1   2010     67 
            2   2005     50 
            2   2006     60 
            2   2007     57 
            2   2008     50 
            2   2009     53 
            2   2010     62 
            3   2005     51 
            3   2006     50 
            3   2007     50
    Nothing wrong with that, though it is asking for trouble to have two variables named year and Year. Especially when the one called year has (at least in this example, and I'll bet in your data too) nothing to do with time.

    And then comes your fatal mistake:

    Code:
    // WHAT YOU DID--WRONG
    tsset RowID year
    
    // WHAT YOU SHOULD HAVE DONE
    tsset RowID Year
    So you set yourself up for this confusion and then you fell for it. You -tsset- your data with the wrong time variable.

    Now, I think also that either your data are quite unusual, or you didn't tell us the full story. I say that because -tsset- is rather particular about a few things: the time variable in -tsset- must contain only integer values, and there can be no repeated values of that time variable within any level of the panel variable. You don't tell us what the original year2005-year2010 variables represent, but unless they are integers and there are no repeated values within a single observation, -tsset- should have given you an error message and you would never have gotten to -tsfill-.

    Be that as it may, I'm pretty sure that fixing your -tsset- command will solve the problem. But, really, it's terrible programming practice to have two variables whose names are so similar, unless they are very closely related to each other conceptually and there is some clear paradigm for which variable begins in lower case and which in upper case. So, to keep yourself from further confusing these variables, I suggest that after the reshape you rename year to something else, preferably something descriptive of what the variable actually is.
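
    The two -tsset- requirements mentioned above (integer time values, no repeated time values within a panel) can be checked before running -tsfill-. What follows is only a minimal sketch on made-up data: the variable name value and the numbers are hypothetical, not taken from the original poster's dataset.

    ```stata
    * Toy panel with gaps (hypothetical data, for illustration only)
    clear
    input RowID Year value
    1 2005 51
    1 2007 52
    2 2006 60
    2 2009 53
    end

    * The time variable must hold only integers
    assert Year == floor(Year)

    * RowID and Year together must uniquely identify observations;
    * isid stops with an error if they do not
    isid RowID Year

    * Declare the panel with the real time variable, then fill interior gaps
    tsset RowID Year
    tsfill

    list, sepby(RowID) noobs
    ```

    Without the full option, -tsfill- adds observations only between each panel's own first and last time value, so here RowID 1 gains 2006 and RowID 2 gains 2007 and 2008, and the panel is not forced to be balanced.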



    • #3
      Hello Clyde,

      Thanks for your kind reply.

      I agree that naming two variables similarly is not a good practice. I will be mindful going forward.

      The 'year' variable is the correct time variable. I should have described the dataset in my previous post. In the input dataset, I had year1 and year2, which stood for the start year and the end year. After the reshape, the 'Year' variable was equal to 1 for the observations corresponding to year1 and 2 for the observations corresponding to year2.

      The problem appears to be solved for the moment. In my dataset, I had instances where year1=year2. When I removed the rows corresponding to those observations and ran tsfill again, it worked. So, I assume that tsfill works when there are gaps in all the values of the time variable. As you rightly pointed out, the issue was that there were repeated values of the time variable within a level of the panel variable.

      Rgds,
      Archita
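
      For readers with a similar start-year/end-year layout, the situation described in #3 can be sketched as follows. This is an illustration on made-up numbers, and it makes one assumption that differs from the post: rather than removing every row where year1 equals year2, it keeps a single copy of each repeated (RowId, year) pair with -duplicates drop-.

      ```stata
      * Hypothetical wide data: year1 = start year, year2 = end year
      clear
      input RowId year1 year2
      1 2001 2004
      2 2003 2003
      3 2005 2007
      end

      * Year takes the values 1 and 2; year holds the actual calendar year
      reshape long year, i(RowId) j(Year)

      * When year1 == year2, the same RowId appears twice with the same year,
      * which -tsset- rejects; keep one copy of each (RowId, year) pair
      duplicates drop RowId year, force

      tsset RowId year
      tsfill

      list, sepby(RowId) noobs
      ```

      After -tsfill-, RowId 1 spans 2001 to 2004, RowId 2 keeps its single 2003 observation, and RowId 3 spans 2005 to 2007.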
