Xtlogit: moving my 1's to last available data

John Rivers

Join Date: May 2016

Posts: 18
#1

26 Jul 2018, 07:49

Hello Statalist community,

I have panel data in which the groups are firms: there are 1s in the years in which there was a bankruptcy, and 0s everywhere else.

Quite often, the year in which I have my one (=bankruptcy) is a year where, naturally enough, there is no longer any data. For example, I will have a series for my X1....Xn explanatory variables from 1995 to 1999, for a bankruptcy in 2000. This is obviously problematic, since it will tend to be dropped from many estimations.

One logic that I have tried to explore is replace bankruptcy[_n-1]=F.bankruptcy if bankruptcy=1 & missing(x1). I then planned to make some sort of loop out of that.

Stata does not like this syntax, however. I confess that it's pretty counterintuitive to replace the previous value, although that'd be the same as imposing conditions on future values (which I also tried).

Probably the answer is a whole different kind of code. Any suggestions?

Thank you so much for your help. I really appreciate it.

John
Jesse Tielens

Join Date: Jul 2018

Posts: 46
#2

26 Jul 2018, 08:30

So, your problem is that you want to replace current value for bankruptcy with 1, in all cases where the next period's bankruptcy indicator is 1 and the variable x1 is missing?

You can try something like this:

Code:

sort id year
by id: replace bankruptcy = 1 if bankruptcy[_n+1] == 1 & x1[_n+1] == . & year[_n+1] == year + 1

This should set the current bankruptcy indicator to 1 whenever the next year's indicator is 1, x1 is missing in that next year, and there's no gap in the year variable.

The 'by' part ensures that we only do this within the same firm. So if there's a '1' for bankruptcy in the next row in your data, but the next row belongs to a different firm, we skip that one. This code snippet assumes your panel variable is id and your time variable is year.
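An alternative sketch, assuming the panel can be declared with xtset using the same id and year variables (the variable names are taken from the snippet above, not from your actual data): once the panel is declared, the lead operator F. looks one year ahead within the same firm and returns missing when that year is absent, so neither the 'by' prefix nor the explicit year check should be needed.

```stata
* Declare the panel structure (assumes id and year uniquely identify rows)
xtset id year

* F.bankruptcy and F.x1 are next year's values for the same firm;
* both are missing if next year's observation does not exist
replace bankruptcy = 1 if F.bankruptcy == 1 & missing(F.x1)
```

Like the by-based version, this adds the earlier 1 but leaves the original 1 in place.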

Last edited by Jesse Tielens; 26 Jul 2018, 08:39. Reason: I wrote x1[_n+1] == 1. Should be: x1[_n+1]==. of course.
John Rivers

Join Date: May 2016

Posts: 18
#3

26 Jul 2018, 12:01

Thanks so much for your answer.

If I understand correctly, your line would add a "1" in the right place (assuming it only needs to be moved 1 year) but would not change the initial 1. Come to think of it, this wouldn't be the end of the world, aside from messing with some descriptive statistics. Perhaps I can create a duplicate "bankruptcy" variable (which would have both 1s) and only use that for estimations.

I'll give it a go and see how it works!
Jesse Tielens

Join Date: Jul 2018

Posts: 46
#4

26 Jul 2018, 12:20

If you want the 'original' bankruptcy variable to be changed as well, this isn't too hard. In that case, I'd create a new variable 'bankruptcy_dummy' that marks all observations that need changing. Next, set the bankruptcy variable to '1' wherever the dummy is 1, and reset it to '0' in the observation following that dummy. Finally, we can drop the dummy.

Code:

sort id year
// Mark the last observation with data before a consecutive bankruptcy year.
by id: generate bankruptcy_dummy = 1 if bankruptcy[_n+1] == 1 & x1[_n+1] == . & year[_n+1] == year + 1
replace bankruptcy = 1 if bankruptcy_dummy == 1
// Now set all the original 'bankruptcies' to zero.
by id: replace bankruptcy = 0 if bankruptcy_dummy[_n-1] == 1 & year == year[_n-1] + 1
drop bankruptcy_dummy

Let me know if that works!
John Rivers

Join Date: May 2016

Posts: 18
#5

26 Jul 2018, 13:57

You are the best! I think we're in business!

A question though: what's the purpose of the last condition (...& year[_n+1]==year + 1)? I left it out, since I think both Stata and I were a bit confused by it.
Jesse Tielens

Join Date: Jul 2018

Posts: 46
#6

26 Jul 2018, 14:06

If you have an unbalanced panel dataset, leaving that condition out might sometimes give problems. Imagine your data looks like this:

id  year  X1
1   2010  10
1   2011  12
1   2012  20
1   2013  50
2   2010  25
2   2011  15
2   2012   8
2   2014  12

If you're going over all the observations step-wise using the 'by' prefix, this could produce an error in your results at firm 2's 2012 row.
Notice how the year 2013 is missing for firm 2? That's why I added a check to see if the next observation's year is equal to the current observation's year + 1.

So at that row: 2012 + 1 != 2014. Therefore, the code is not executed there.

At least, that was my intention. But I must admit I'm new to this myself as well. If this is not the way to do it, I'm sure a more experienced commenter will correct me.
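To see which rows pass the check, here is a minimal sketch that enters the example data above by hand and flags the rows whose next observation is exactly one year later. The flag name next_is_consecutive is made up for illustration, and X1 is entered as x1 to match the earlier code.

```stata
clear
input id year x1
1 2010 10
1 2011 12
1 2012 20
1 2013 50
2 2010 25
2 2011 15
2 2012 8
2 2014 12
end

sort id year
* 1 if the next row of the same firm is exactly one year later, 0 otherwise
* (within by id:, year[_n+1] is missing on each firm's last row, so that row gets 0)
by id: generate byte next_is_consecutive = (year[_n+1] == year + 1)
list, sepby(id)
```

Firm 1 should show the flag as 1 for 2010 to 2012 and 0 for 2013, while firm 2 should show 1 for 2010 and 2011 and 0 for both 2012 and 2014, so the jump from 2012 to 2014 is never treated as "next year".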
John Rivers

Join Date: May 2016

Posts: 18
#7

26 Jul 2018, 14:14

Ah, gotcha! Ok. Well, you fixed my problem, so a million thanks. Huge relief!
Nick Cox

Join Date: Mar 2014

Posts: 35632
#8

28 Jul 2018, 02:30

Pleased you got a solution. For future posts please note:

1. This was cross-posted at https://www.reddit.com/r/stata/comme...vailable_data/ Please note our policy on cross-posting, which is that you are asked to tell us about it. I don't know what Reddit's policy is, but telling folks there about this thread would surely do no harm.

2. Acting on several items in the FAQ Advice would have helped this thread. See https://www.statalist.org/forums/help for that. #8 is the point above. #18 and especially #12 are also relevant.