Panel data: Computing stuff over every 5 years.

Sanjana Goswami

Join Date: May 2016

Posts: 17
#1

10 Jul 2016, 17:36

Hi Stata folks

Here is an example of my data.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(firm_id year) float high
1005 1974      .
1005 1975   49.5
1005 1976      .
1005 1977   58.8
1005 1979     43
1005 1980      .
1005 1981      .
1007 1974      .
1007 1975 -249.6
1007 1976  -43.6
1007 1977 -186.6
1007 1978 -498.9
1007 1979  101.9
1007 1980 -175.7
1007 1981  -59.1
1007 1984 -213.1
end

"high" is a variable I constructed: it holds only the observations that exceed, in absolute value, a threshold I picked, and is missing otherwise.

I want to count the number of occurrences (nonmissing values) of high for each firm. This is fairly easy to do with:

Code:

bysort firm_id: egen highfreq = count(high) if high != .

However, what I really want is to count the occurrences of high within each block of five available years. I say "available" because there are gaps in the years, and I want to treat the next available year as the next year. There may also be fewer than 5 years left after a given block, as for firm 1005, in which case I want to count over the remaining years.

So instead of a single value of 8 for firm 1007 over 1974-1984, I want 4 for 1974-1978 and then 4 for 1979-1984.

How do I do this? Please help.

This will also help me in doing similar things for five year blocks.

Nick Cox

Join Date: Mar 2014
Posts: 35639

10 Jul 2016, 18:10

I don't quite follow your example. In particular, note that 1979 to 1984 is 6 years, not 5.

rangestat (SSC) will work with overlapping blocks. Search this forum for other examples.

panelthin (SSC) will select disjoint blocks.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(firm_id year) float high
1005 1974      .
1005 1975   49.5
1005 1976      .
1005 1977   58.8
1005 1979     43
1005 1980      .
1005 1981      .
1007 1974      .
1007 1975 -249.6
1007 1976  -43.6
1007 1977 -186.6
1007 1978 -498.9
1007 1979  101.9
1007 1980 -175.7
1007 1981  -59.1
1007 1984 -213.1
end
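* install rangestat from SSC if it is not already there
capture ssc install rangestat
* overlapping windows: count nonmissing high over years year to year + 4, within firm
rangestat (count) high, interval(year 0 4) by(firm_id)
tsset firm_id year
* install panelthin from SSC if it is not already there
capture ssc install panelthin
* disjoint blocks: marker is 1 for observations at least 5 years after the previous marked one
panelthin, min(5) gen(marker)
list, sepby(firm_id)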

     +---------------------------------------------+
     | firm_id   year     high   high_c~t   marker |
     |---------------------------------------------|
  1. |    1005   1974        .          2        1 |
  2. |    1005   1975     49.5          3        0 |
  3. |    1005   1976        .          2        0 |
  4. |    1005   1977     58.8          2        0 |
  5. |    1005   1979       43          1        1 |
  6. |    1005   1980        .          0        0 |
  7. |    1005   1981        .          0        0 |
     |---------------------------------------------|
  8. |    1007   1974        .          4        1 |
  9. |    1007   1975   -249.6          5        0 |
 10. |    1007   1976    -43.6          5        0 |
 11. |    1007   1977   -186.6          5        0 |
 12. |    1007   1978   -498.9          4        0 |
 13. |    1007   1979    101.9          3        1 |
 14. |    1007   1980   -175.7          3        0 |
 15. |    1007   1981    -59.1          2        0 |
 16. |    1007   1984   -213.1          1        1 |
     +---------------------------------------------+
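
For the disjoint blocks of five available years described in #1, here is a minimal sketch using only built-in commands. It assumes that a block is simply each run of five consecutive observations within a firm after sorting by year, whatever the gaps in calendar time. The names block and highfreq5 are illustrative and do not come from the thread.

```stata
clear
input int(firm_id year) float high
1005 1974      .
1005 1975   49.5
1005 1976      .
1005 1977   58.8
1005 1979     43
1005 1980      .
1005 1981      .
1007 1974      .
1007 1975 -249.6
1007 1976  -43.6
1007 1977 -186.6
1007 1978 -498.9
1007 1979  101.9
1007 1980 -175.7
1007 1981  -59.1
1007 1984 -213.1
end

* assumption: blocks are defined by observation order within firm, not by calendar year
* block = 1 for a firm's first five available years, 2 for the next five, and so on
bysort firm_id (year): generate block = ceil(_n / 5)

* count nonmissing values of high within each firm-block
bysort firm_id block: egen highfreq5 = count(high)

list, sepby(firm_id block)
```

On the example data this gives 4 and 4 for firm 1007 (1974-1978 and 1979-1984), as asked for in #1, and 3 and then 0 for firm 1005. If blocks should instead have a fixed calendar width, the marker from panelthin above is the better starting point.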
