Hi! I have a panel dataset with information about school principals and students' average scores on a standardized test for the years 2012-2018.
In the example data below, “school” is the school id, “principal” is the principal’s id, and “score” is the average score of the students attending that school in each year. I also created two auxiliary variables:
- a dummy variable “change” equal to 1 when the school principal changes and 0 if not.
- a variable “change_year” equal to the year when the principal changes.
I only have schools where the principal changes one time between 2012-2018.
I want to create two lagged variables to use as controls in a regression:
1. score_L1: equal to the average score of students one year before the change of the principal.
2. score_L2: equal to the average score of students two years before the change of the principal.
Both should be constant within each school, i.e. take the same value in every year for a given school.
How can I do it?
Thank you very much!
Here is an example of the data:
Code:
input double(school year) long principal float change double score float change_year
1 2012 100 0 28 2016
1 2013 100 0 27 2016
1 2014 100 0 26 2016
1 2015 100 0 27 2016
1 2016 200 1 26 2016
1 2017 200 0 26 2016
1 2018 200 0 26 2016
2 2012 305 0 27 2018
2 2013 305 0 26 2018
2 2014 305 0 23 2018
2 2015 305 0 22 2018
2 2016 305 0 22 2018
2 2017 305 0 26 2018
2 2018 500 1 27 2018
end
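One possible approach, offered as a sketch rather than a definitive answer: pick out the score in the year that is one (or two) years before change_year and spread it to every observation of the same school with egen. It assumes each school-year pair appears only once, so that exactly one observation per school matches each condition. The example data are repeated so the block can be run on its own.

```stata
* Load the example data from the post
clear
input double(school year) long principal float change double score float change_year
1 2012 100 0 28 2016
1 2013 100 0 27 2016
1 2014 100 0 26 2016
1 2015 100 0 27 2016
1 2016 200 1 26 2016
1 2017 200 0 26 2016
1 2018 200 0 26 2016
2 2012 305 0 27 2018
2 2013 305 0 26 2018
2 2014 305 0 23 2018
2 2015 305 0 22 2018
2 2016 305 0 22 2018
2 2017 305 0 26 2018
2 2018 500 1 27 2018
end

* cond() keeps the score only in the target year and is missing elsewhere;
* egen max() ignores missings, so the single kept value is copied to all
* observations of the school
bysort school: egen score_L1 = max(cond(year == change_year - 1, score, .))
bysort school: egen score_L2 = max(cond(year == change_year - 2, score, .))

list school year score change_year score_L1 score_L2, sepby(school)
```

With the example data, school 1 (change in 2016) gets score_L1 = 27 and score_L2 = 26, and school 2 (change in 2018) gets score_L1 = 26 and score_L2 = 22. If a school's principal changes in 2012 or 2013, the year being looked up is not in the data and the corresponding variable is left missing for that school.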
