Dear Statalists,
I have panel data containing people's income (in 1,000 dollars) for the years in which they were surveyed. Income is missing in some years, and the panel is not balanced.
*=================*
clear
input str5 id year income
A 2005 110
A 2006 160
A 2007 .
B 2005 60
B 2006 80
B 2007 .
B 2008 .
C 2005 200
C 2006 .
C 2007 170
C 2008 .
D 2005 70
D 2006 80
D 2007 90
D 2008 100
end
list, sepby(id)
*=================*
I want to create a variable md_income that takes the income value from the median survey year. When there are two middle years, it should take the income from whichever of them has non-missing income.
In the data above, md_income should be 160 for A, 80 for B, 170 for C, and 85 for D (the mean of 80 and 90, since both middle years are non-missing).
How can I do that? I am at my wit's end.
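For reference, here is a minimal sketch of one way this might be done. It is not a definitive solution and rests on a few assumptions: every surveyed year appears as a row (possibly with missing income), year is unique within id, and when both middle years have income the mean of the two is wanted, as the value 85 for D suggests. The helper variable is_mid is a name made up for illustration. It is meant to be run after the input block above.

```stata
* Assumption: one row per id-year, with a row for every surveyed year.
* Flag the middle row(s) within each id: one row when _N is odd,
* two rows when _N is even. is_mid is a hypothetical helper name.
bysort id (year): gen byte is_mid = inlist(_n, floor((_N + 1) / 2), ceil((_N + 1) / 2))

* Average income over the flagged rows only. egen's mean() ignores
* missing values, so a single non-missing middle year is used as is,
* and the result is copied to every row of the id.
egen md_income = mean(cond(is_mid, income, .)), by(id)

list, sepby(id)
```

On the example data this should give 160 for A, 80 for B, 170 for C, and 85 for D. If both middle years are missing, md_income stays missing for that id.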