
  • Xtlogit: moving my 1's to last available data

    Hello Statalist community,

    I have panel data in which the groups are firms: there are 1s in the years in which there was a bankruptcy, and 0s everywhere else.

    Quite often, the year in which I have my one (=bankruptcy) is a year where, naturally enough, there is no longer any data. For example, I will have a series for my X1....Xn explanatory variables from 1995 to 1999, for a bankruptcy in 2000. This is obviously problematic, since it will tend to be dropped from many estimations.

    One logic that I have tried to explore is replace bankruptcy[_n-1]=F.bankruptcy if bankruptcy=1 & missing(x1). I then planned to make some sort of loop out of that.

    Stata does not like this syntax, however. I confess that it's pretty counterintuitive to replace the previous value, although that'd be the same as imposing conditions on future values (which I also tried).

    Probably the answer is a whole different kind of code. Any suggestions?

    Thank you so much for your help. I really appreciate it.

    John

  • #2
    So, your problem is that you want to replace the current value of bankruptcy with 1 in all cases where the next period's bankruptcy indicator is 1 and the next period's x1 is missing?

    You can try something like this:
    Code:
    sort id year
    by id: replace bankruptcy = 1 if bankruptcy[_n+1] == 1 & x1[_n+1] ==. & year[_n+1]==year + 1
    This should replace the current bankruptcy value with '1', as long as next year's bankruptcy is '1', next year's x1 is missing and there's no gap in the year variable.

    The 'by' part ensures that we only do this for the same firm. So if there's a '1' for bankruptcy in the next row in your data, but the next row belongs to a different firm, we skip that one. This code snippet assumes your panel variable is ID and time variable is year.
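    To see what the line does, here is a minimal sketch on made-up data (the firms, years and values are invented purely for illustration). Firm 1 goes bankrupt in 2000 with no x1 for that year; firm 2 never goes bankrupt.

    ```stata
    clear
    input id year x1 bankruptcy
    1 1998 10 0
    1 1999 12 0
    1 2000 .  1
    2 1998 25 0
    2 1999 15 0
    2 2000 8  0
    end

    sort id year
    * Copy the 1 back one year when next year's x1 is missing and the years are consecutive
    by id: replace bankruptcy = 1 if bankruptcy[_n+1] == 1 & x1[_n+1] == . & year[_n+1] == year + 1

    * Firm 1's 1999 row should now carry a 1; its 2000 row keeps the original 1
    list, sepby(id)
    ```

    Within each firm's last row, bankruptcy[_n+1] points past the end of the by-group and evaluates to missing, so the condition is false there and nothing leaks across firms.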
    Last edited by Jesse Tielens; 26 Jul 2018, 08:39. Reason: I wrote x1[_n+1] == 1. Should be: x1[_n+1]==. of course.



    • #3
      Thanks so much for your answer.

      If I understand correctly, your line would add a "1" in the right place (assuming it only needs to be moved 1 year) but would not change the initial 1. Come to think of it, this wouldn't be the end of the world, aside from messing with some descriptive statistics. Perhaps I can create a duplicate "bankruptcy" variable (which would have both 1s) and only use that for estimations.
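      A rough sketch of that duplicate-variable idea, on invented data; the name bankruptcy_est is hypothetical and is not used elsewhere in this thread. The original bankruptcy variable is left untouched for descriptive statistics, and only the copy receives the extra 1.

      ```stata
      clear
      input id year x1 bankruptcy
      1 1998 10 0
      1 1999 12 0
      1 2000 .  1
      end

      * Work on a copy so the original indicator stays as recorded
      generate byte bankruptcy_est = bankruptcy

      sort id year
      by id: replace bankruptcy_est = 1 if bankruptcy[_n+1] == 1 & x1[_n+1] == . & year[_n+1] == year + 1

      list, sepby(id)
      ```

      The copy would then be the dependent variable in the estimations, while summaries of actual bankruptcies keep using the original.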

      I'll give it a go and see how it works!



      • #4
        If you want the 'original' bankruptcy variable to be changed as well, this isn't too hard. In that case, I'd create a new variable 'bankruptcy_dummy' that marks all observations that need changing. Next, replace the bankruptcy variable with '1' wherever the dummy is 1, and set it to '0' in the observation following that dummy. Finally, we can drop the dummy.

        Code:
        sort id year
        by id: generate bankruptcy_dummy = 1 if bankruptcy[_n+1] == 1 & x1[_n+1] ==. & year[_n+1]==year + 1
        replace bankruptcy = 1 if bankruptcy_dummy == 1
        
        //Now set all the original 'bankruptcies' to zero.
        by id: replace bankruptcy = 0 if bankruptcy_dummy[_n-1] == 1 & year == year[_n-1] + 1
        drop bankruptcy_dummy
        Let me know if that works!



        • #5
          You are the best! I think we're in business!

          A question though: what's the purpose of the last condition (...& year[_n+1]==year + 1)? I left it out, since I think both Stata and I were a bit confused by it.



          • #6
            If you have an unbalanced panel dataset, this might sometimes give problems. Imagine your data looks like this:
            id  year  X1
            1   2010  10
            1   2011  12
            1   2012  20
            1   2013  50
            2   2010  25
            2   2011  15
            2   2012   8
            2   2014  12
            If you're going over all the observations step-wise using the 'by' command, this could produce an error in your results at the last row.
            Notice how the year 2013 is missing? The manual said to always add a check to see if the next observation's year is equal to current observation's year +1.

            So when we reach firm 2's 2012 row and look ahead to the last row: 2012 + 1 != 2014. Therefore, don't execute the code there.
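
            As a possible alternative, and only a sketch: once the panel is declared with xtset, Stata's time-series operators do this gap check themselves, because F.x means "x one year ahead" rather than "x in the next row" and is missing when that year is absent. The bankruptcy column, the missing x1 values, firm 3 and the variable name flag are all invented for the illustration.

            ```stata
            clear
            input id year x1 bankruptcy
            1 2010 10 0
            1 2011 12 0
            1 2012 20 0
            1 2013 50 0
            2 2010 25 0
            2 2011 15 0
            2 2012 8  0
            2 2014 .  1
            3 2012 5  0
            3 2013 .  1
            end

            xtset id year

            * F. looks at year + 1 within the same firm; it is missing if that year is not in the data
            generate byte flag = F.bankruptcy == 1 & missing(F.x1)
            replace bankruptcy = 1 if flag

            * Firm 3's 2012 row is flagged; firm 2's 2012 row is not, because 2013 is absent
            list, sepby(id)
            ```

            Here firm 2's bankruptcy in 2014 is not moved back to 2012, which is the same outcome the explicit year[_n+1]==year + 1 condition is meant to guarantee.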

            At least, that was my intention. But I must admit I'm new to this myself as well. If this is not the way to do it, I'm sure a more experienced commenter will correct me



            • #7
              Ah, gotcha! Ok. Well, you fixed my problem, so a million thanks. Huge relief!



              • #8
                Pleased you got a solution. For future posts please note:

                1. This was cross-posted at https://www.reddit.com/r/stata/comme...vailable_data/ Please note our policy on cross-posting, which is that you are asked to tell us about it. I don't know what Reddit's policy is, but telling folks there about this thread would surely do no harm.

                2. Acting on several items in the FAQ Advice would have helped this thread. See https://www.statalist.org/forums/help for that. #8 is the point above. #18 and especially #12 are also relevant.
