Hello, I have panel data where pid identifies individuals and time is the date on which they were in the labour market. The data come from a Population Survey. There are 40,000,000 observations and approximately 400,000 individuals. Time runs from 1992/02 to 2016/12. I want to tell Stata that I have panel data in order to run a regression, but when I run >>xtset pid time<< it says
"repeated time values within panel
r(451);"
I ran >>duplicates list pid time<< and it gives me a huge list of duplicate observations, but I don't understand why I have duplicates over time.
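A minimal sketch of how the duplicates could be summarized rather than listed in full, assuming the variables are named pid and time as above. The variable name dup_n is hypothetical, and nothing here is known about why the duplicates exist.

```stata
* Summarize how many (pid, time) combinations appear more than once
duplicates report pid time

* Flag rows that share their (pid, time) pair with other rows;
* dup_n is a hypothetical variable name holding the number of extra copies
duplicates tag pid time, generate(dup_n)
count if dup_n > 0

* Check whether the repeated rows are identical on every variable,
* or only on pid and time
duplicates report
```

Comparing the two reports shows whether the repeated rows are exact copies or differ on other variables, which are two different situations.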
Moreover, if I run >>xtset pid<< it says "panel variable: pid (unbalanced)", so it seems fine, but then I am not controlling for time...
So my questions are: why does Stata not recognize my panel data? What am I doing wrong? What is the proper command?
The idea is to run the following linear regression: temporary = B0 + B1*temporary(previous period)*sex + ....
Thank you in advance for your help,
Cristina.