Dear Statalist users,
Assume that we have firm-level data and we want to calculate the residuals from a regression by industry and year. For example, assume that we have two years of data, 2000 and 2001, and two industries, 1 and 2.
Assume that the data look like the example in the code block below.
A simple regression should be run for each group-year pair, and as long as that pair has at least 5 observations, a column should be generated with the residuals. For example, group 1 in year 2000 has only 4 complete observations (two of its six rows have a missing y or x), so the residuals should not be calculated for it.
I reckon that a double loop might be needed here, but I am a bit stuck.
How would one solve this problem?
Code:
input firm_id year group y x
1 2000 1 . 0.587681712
2 2000 1 0.220527045 0.020397402
3 2000 1 0.598266595 0.276305834
4 2000 1 0.848473642 0.503412603
5 2000 1 0.566881707 0.143577444
6 2000 1 0.69872904 .
1 2001 1 0.581496724 0.952147679
2 2001 1 0.447513514 0.547753335
3 2001 1 0.492024424 0.380500378
4 2001 1 0.913852189 0.396933955
5 2001 1 0.181215711 0.220948854
6 2001 1 0.393435702 0.974829582
7 2000 2 0.035029052 0.080399976
8 2000 2 0.552878997 0.163354383
9 2000 2 0.55373046 0.543578162
10 2000 2 0.272902519 0.870706828
11 2000 2 0.700316363 0.262667598
12 2000 2 0.485204026 0.970839238
6 2001 2 0.238687785 0.488399578
7 2001 2 0.844819818 0.849078286
8 2001 2 0.139093221 0.73683734
9 2001 2 0.397981489 0.503380686
10 2001 2 0.127906763 0.954909727
11 2001 2 0.118464559 0.656839917
12 2001 2 0.608098688 0.304986828
end
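For what it is worth, the double loop I have in mind might look roughly like the sketch below. It assumes the regression is y on x and that "observations" means rows where both y and x are non-missing. The variable names resid and tmp_res are placeholders of my own, not anything from an existing command.

```stata
* Sketch only: residuals from y on x, estimated separately for each
* group-year pair that has at least 5 complete observations.
* resid and tmp_res are hypothetical names chosen for this illustration.
gen double resid = .

levelsof group, local(groups)
levelsof year, local(years)

foreach g of local groups {
    foreach t of local years {
        * Count rows in this pair with both y and x non-missing
        quietly count if group == `g' & year == `t' & !missing(y, x)
        if r(N) >= 5 {
            quietly regress y x if group == `g' & year == `t'
            * Residuals only for the rows used in the estimation
            quietly predict double tmp_res if e(sample), residuals
            quietly replace resid = tmp_res if e(sample)
            drop tmp_res
        }
    }
}
```

With the example data, group 1 in year 2000 has only 4 complete rows, so resid should stay missing for that pair.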