Greetings,
I'm working with a dataset covering 2010-2022. It is a panel identified by firm and year. I'm cleaning the data, in which the dependent variable has some missing values, and I used the following code to keep only the firms with at least 5 consecutive years:
gen run = .
by id: replace run = cond(L.run == ., 1, L.run + 1)
by id: egen maxrun = max(run)
by id: drop if maxrun <5
Now, some firms have gaps in the years for which the dependent variable is available. For example, a firm may have 6 consecutive years (2010-2015) and then 2 consecutive years (2018-2019).
Within each firm, I want to drop any run of fewer than 5 consecutive years, but I can't find code that does this (in the example above, I just want to drop the second run, 2018-2019).
Any suggestions?
Thank you,
Nuno
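
One way the request above might be approached is sketched below. It is not a tested solution for this particular dataset. It assumes the firm identifier is id and the year variable is year, as in the code above, and that the dependent variable is called y (a hypothetical name, since the post does not give one). It also assumes each id-year pair appears at most once, and that rows with a missing dependent variable can be dropped outright.

```stata
* Assumed names: id (firm), year (calendar year), y (dependent variable).
* y is a placeholder; substitute the actual variable name.

* Keep only firm-years where the dependent variable is observed.
drop if missing(y)

* Flag the first year of each run: the firm's first observation,
* or any year that does not directly follow the previous one.
bysort id (year): gen byte newspell = (_n == 1) | (year - year[_n-1] > 1)

* Number the runs within each firm with a running sum of the flags.
by id: gen spell = sum(newspell)

* Count the years in each run and drop runs shorter than 5 years.
bysort id spell: gen spell_len = _N
drop if spell_len < 5

drop newspell spell spell_len
```

In the example from the post, the 2010-2015 run would be kept (length 6) and the 2018-2019 run dropped (length 2). A firm whose runs are all shorter than 5 years disappears entirely, so this step would also cover the earlier maxrun filter. The community-contributed tsspell command from SSC is another option for identifying runs of this kind.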