Hello Statalists,
I have a question about dropping observations in Stata.
My dataset is a panel with many firms, each observed over several years. The panel is unbalanced, though: many firms don't have observations for all years (2003-2012). When a firm lacks a year, there is no row showing "." for it; the observation simply doesn't exist in the data. An example of the dataset is below.
id year VAR
1 2009 6
1 2010 7
1 2011 8
1 2012 9
2 2003 0
2 2004 1
2 2005 2
2 2006 3
2 2007 4
2 2008 5
2 2009 6
2 2010 7
2 2011 8
2 2012 9
In the example above, id=2 has the full range of years (2003-2012), whereas id=1 has no observations before 2009. I would like to drop every id that has no observations before 2009. I tried the following code, but it didn't work.
bysort id (VAR): drop if missing(VAR<=5)
May I have your advice on how to deal with this issue?
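For reference, here is a minimal sketch of one possible approach, not a definitive answer. In the attempt above, VAR<=5 evaluates to 0 or 1 and is never missing, so missing() is always false and nothing gets dropped; the condition also looks at VAR rather than year. The sketch instead flags each id that has at least one year before 2009. The variable name has_pre2009 is made up for illustration, and the toy data are the example rows from this post.

```stata
* Toy data copied from the example in this post
clear
input id year VAR
1 2009 6
1 2010 7
1 2011 8
1 2012 9
2 2003 0
2 2004 1
2 2005 2
2 2006 3
2 2007 4
2 2008 5
2 2009 6
2 2010 7
2 2011 8
2 2012 9
end

* (year < 2009) is 1 or 0 for each row; the maximum within id is 1
* if the firm has at least one observation before 2009
* has_pre2009 is a hypothetical helper variable
bysort id: egen byte has_pre2009 = max(year < 2009)

* Drop every firm with no pre-2009 observation
drop if has_pre2009 == 0
drop has_pre2009

* In this toy data only firm 2 should remain
assert id == 2
```

The flag is constant within id, so the drop removes whole firms rather than individual rows.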
I would also like to try running regressions only on the firms that have the full range of years. How can I keep these firms and drop all the others?
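A rough sketch of how the balanced subsample might be selected, under two assumptions: each id-year pair appears at most once, and "full range" means all ten years 2003-2012. The variable name n_years is made up for illustration, and the toy data again come from the example in this post.

```stata
* Toy data copied from the example in this post
clear
input id year VAR
1 2009 6
1 2010 7
1 2011 8
1 2012 9
2 2003 0
2 2004 1
2 2005 2
2 2006 3
2 2007 4
2 2008 5
2 2009 6
2 2010 7
2 2011 8
2 2012 9
end

* Assumption check: stops with an error if id-year is not unique
isid id year

* Count each firm's observations that fall in 2003-2012
* n_years is a hypothetical helper variable
bysort id: egen n_years = total(inrange(year, 2003, 2012))

* Keep only firms observed in all 10 years
keep if n_years == 10
drop n_years

* In this toy data only firm 2 has the full range
assert id == 2
```

If the real data contain duplicate id-year rows, the isid line will fail, and the count would need to be based on distinct years instead.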
Thank you very much,
Chenli