Hi,
I am new to Stata and hope you can help. I am writing my thesis on how country-level data affect firm-level outcomes over time. I argue that time is nested within firms, which are in turn nested within countries, and I have chosen the -mixed- command for this. The time period is 2006-2017 (variable name "t"), and I have 5816 unique firms (variable name f_id) and 33 unique countries (variable name c_id). It is clear to me that the data should be in long format, but I am unsure whether the data structure I am using is correct, and I hope you can help with this. I apologize if the format is incorrect.
Should the data be structured as:
a) I have only one set of country observations. That is, there is one observation for country 1 in 2006, one observation for country 2 in 2006, etc.
c_id f_id t
1 1 2006
1 1 2007
1 1 2008
2 2 2006
2 2 2007
2 2 2008
or
b) I repeat the country observation for every firm observation. That is, the observation for country 1 in 2006 is repeated next to every firm from country 1 in 2006.
c_id f_id t
1 1 2006
1 1 2007
1 1 2008
1 2 2006
1 2 2007
1 2 2008
... ... ...
2 10 2006
2 10 2007
2 10 2008
2 11 2006
2 11 2007
2 11 2008
Thank you in advance