Dear Statalist Users,
I am fairly new to Stata and am using it (Stata 14) to write my dissertation. My dataset is quite large, about 400,000 observations, and I want to run a regression with fixed effects.
I understand that the first step is to tell Stata that the data are panel data, using the following syntax:
xtset panelvar timevar
However, when I do this with my variables, I get the following error:
repeated time values within panel
r(451);
I know that I have repeated time values, because I have different weeks and years in my dataset. But I don't know how to fix this, because the weeks and years are unique identifiers in my dataset. This is what my dataset looks like:
Product_Number | Product   | Week    | Year | Retailer   | Variable_X | Lag_Variable_X
1234           | product 1 | week 13 | 2018 | retailer x | 0.2        | .
1234           | product 1 | week 13 | 2018 | retailer y | 0.3        | 0.2
1234           | product 1 | week 13 | 2018 | retailer z | 0.4        | 0.3
1234           | product 1 | week 13 | 2019 | retailer x | 0          |
1234           | product 1 | week 13 | 2019 | retailer y | 0.3        |
4567           | product 2 | week 13 | 2018 | retailer z | 0.4        |
4567           | product 2 | week 13 | 2019 | retailer x | 0.5        | .
4567           | product 2 | week 13 | 2019 | retailer z | 0.2        | 0.5
...
5678           | product 3 | week 52 | 2019 | retailer x | 0.3        |
etc.
These are the steps I took before running xtset:
1. egen time = group(year week)
2. sort product_number week
3. by product_number: gen lag_variable_x = variable_x[_n-1] (I think the lag is not relevant to this problem)
So for the xtset command I used product_number as panelvar and time as timevar.
-> xtset product_number time
I hope I explained it clearly. What did I do wrong?
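The sample rows above suggest that the same product and week appear once per retailer, which would produce exactly this error when product_number alone is the panel variable. Below is a minimal sketch of how this could be checked and worked around. It assumes that each product-retailer pair is observed at most once per week, that time has already been created as in step 1, and that the variable names match those in the steps above; panel_id and lag_x are hypothetical names introduced only for this illustration.

```stata
* Sketch only, under the assumptions stated above.

* Count how many observations share the same product and time value.
* Any surplus copies reported here are what triggers r(451).
duplicates report product_number time

* Assumption: the panel unit is a product at a given retailer,
* so build one identifier from both variables.
egen panel_id = group(product_number retailer)

* isid stops with an error if panel_id and time do not uniquely
* identify the observations.
isid panel_id time

* Declare the panel structure with the combined identifier.
xtset panel_id time

* With the data xtset, the lag operator gives the previous period's
* value within each panel (missing where the previous period is absent).
generate lag_x = L.variable_x
```

If isid still reports a problem, the data contain more than one row per product, retailer, and week, and those rows would need to be inspected (for example with duplicates list panel_id time) before xtset can work.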