Dear Statalist,
I am working with panel data that, for data protection reasons, I can only access by e-mail: I send my code to the institution that holds the data. (It's the German Socio-Economic Panel; I'm using their soep-remote system, in case anyone reading this has worked with it or will in the future.) I send my do-files as text in an e-mail, the institution runs them, and I receive the log file in another e-mail.
This process has some limitations, one of which is that I cannot merge data from my own computer with their data. The only way the system lets me work with external data is to include it in the e-mail itself, using the input command. My problem is that I cannot think of an easy way to work with the external data once it has been entered with input. Here's what I'm trying to do:
The institution's panel data looks like this:
Code:
clear
input long(id region income syear)
1 1001 4634 2000
1 1001 5380 2001
1 1001 3981 2002
1 1001 4574 2003
2 1001 3181 2000
2 1001 3430 2001
2 1001 2803 2002
2 1001 2349 2003
3 1001 2824 2001
4 1002 2041 2000
4 1002 3100 2001
4 1001 3300 2002
5 1057 6000 2000
end
Here id is the person ID, region identifies the region the person lives in, and syear is the survey year.
(This is a fictitious example; sorry for not being able to post the original data. For the same data protection reasons, I can use neither the list nor the dataex command.)
My additional data is structured as follows:
Code:
input long(region2 expend syear2)
1001  9000 2000
1001  8000 2001
1001  9000 2002
1001 10000 2003
1002  7000 2000
1002   200 2001
1002  5600 2002
1002   900 2003
1006 10000 2000
1006  2000 2001
end
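Here region2 is the same region ID as above, and expend is government expenditure within that region.
I'm looking for a way to create a variable that holds government expenditure for the region each individual lives in, e.g. in order to run regressions on the relationship between government expenditure and income.
I guess I would need code (probably a loop?) that runs through the data, checks the region and year for each id, and then fills expend2 with the matching expenditure for that region2/syear2. But I have no clue; maybe there is a much easier solution?
I tried to generate a new variable and replace it, but that's not the right approach, as a lot of information is lost this way.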
Code:
gen expend2 = .
* note: this only compares values that sit in the same observation
replace expend2 = expend if syear == syear2 & region == region2
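One idea that might work instead is to enter the external data first, save it to a temporary file, and then join it to the panel with merge m:1 on region and year. The following is only a minimal sketch on the fictitious data above. It assumes that soep-remote permits tempfile, save and merge, which I have not verified, and regdata is just a name picked for illustration.

```stata
* Step 1: enter the external region-year data on its own and store it
clear
input long(region2 expend syear2)
1001  9000 2000
1001  8000 2001
1001  9000 2002
1001 10000 2003
1002  7000 2000
1002   200 2001
1002  5600 2002
1002   900 2003
1006 10000 2000
1006  2000 2001
end
* give the key variables the same names as in the panel
rename (region2 syear2) (region syear)
* stop with an error if a region-year combination appears twice
isid region syear
* regdata is a hypothetical name for the temporary file
tempfile regdata
save `regdata'

* Step 2: bring the panel into memory
* (the fictitious example stands in for the real data here)
clear
input long(id region income syear)
1 1001 4634 2000
1 1001 5380 2001
1 1001 3981 2002
1 1001 4574 2003
2 1001 3181 2000
2 1001 3430 2001
2 1001 2803 2002
2 1001 2349 2003
3 1001 2824 2001
4 1002 2041 2000
4 1002 3100 2001
4 1001 3300 2002
5 1057 6000 2000
end

* Step 3: many person-years map to one region-year row
* keep(master match) drops region-years that no person lives in
merge m:1 region syear using `regdata', keep(master match)

* _merge == 1 marks person-years without an expenditure figure
tabulate _merge
```

With the real data, Step 2 would presumably be replaced by whatever command loads the panel on the remote system. Person-years whose region has no expenditure entry (region 1057 in the example) keep a missing expend and show up as _merge == 1.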
If anyone can think of a solution I would be really grateful!
Thank you in advance! I will post any further information that might be required asap.
Thanks a lot and have a nice day!
Benedikt