Hi,
It is my first post and I will try to be as clear as possible. The link for the main database is at the end of the post.
Context
I am using Stata/SE 12.0 under Windows 10. I started using Stata only a few weeks ago and am learning it on my own for an assignment that is now due in a few days (each table or figure has taken me days and days): replicating the paper "Does Compulsory School Attendance Affect Schooling and Earnings?":
http://web.stanford.edu/~pista/angrist.pdf
The paper shows that people born in the last quarters of the year have more education on average than those born in the first quarters, because of compulsory schooling laws. The first figures plot the average number of years of education (variable EDUC) for all people born in a given year (variable YOB, for year of birth) and a given quarter (QOB). There is a general increasing trend, and to detrend the data the authors use a moving average (figure IV), which is where I have been stuck for the last 5 days.
Problem
In the database, there are 27 variables, among which v4 is renamed EDUC, v27 is renamed YOB (year of birth), and v18 is renamed QOB (quarter of birth). What the moving average needs, for every group of people born in year c and quarter j, is the average number of years of education not for that year and quarter itself, but for the quarter just before, the quarter 2 quarters before, the quarter just after, and the quarter 2 quarters after (explained on p. 985 of the paper).
For example, if I look at the men born between 1930 and 1939 as in this figure (figure IV of the article:
https://onedrive.live.com/redir?resi...nt=photo%2cpng),
I need to start with the cohort born in 1930, 3rd quarter and compute the average number of years of education of those born in 1930, 2nd quarter (born one quarter before the given cohort), same for those born in 1930, 1st quarter (born 2 quarters before the given cohort), same for those born in 1930, 4th quarter (one quarter after the given cohort), and same for those born in 1931, 1st quarter (2 quarters after the given cohort). Then the moving average is obtained by adding these 4 values and dividing by 4. This whole process should be repeated for each cohort between 1930, 3rd quarter and 1939, 2nd quarter.
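To make the target concrete, the calculation described above can be sketched with explicit subscripts on a dataset of cohort means. This is only an illustration under assumptions: it takes EDUC, YOB and QOB as already renamed, assumes the sample has already been restricted to the cohorts of interest, and the names meaneduc and ma are made up for the example.

```stata
* Sketch only: meaneduc and ma are illustrative names, not from the paper.
* Assumes the data in memory hold only the cohorts of interest.
preserve

* One row per birth cohort (year x quarter) with its mean education
collapse (mean) meaneduc = EDUC, by(YOB QOB)

* Put the cohorts in chronological order
sort YOB QOB

* _n is the current row, so meaneduc[_n-1] is the previous quarter, etc.
* References before the first or after the last row return missing,
* so ma is missing for the first two and the last two cohorts.
gen ma = (meaneduc[_n-2] + meaneduc[_n-1] + meaneduc[_n+1] + meaneduc[_n+2]) / 4

list YOB QOB meaneduc ma, sepby(YOB)
restore
```

With ten years of data this leaves ma defined from the 3rd quarter of the first year to the 2nd quarter of the last year, which matches the range given above.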
Do-File
For the do-file
(https://onedrive.live.com/redir?resid=6919D329B3BF1EF2!3227&authkey=!AO2cxEN AGpZMgsM&ithint=file%2cdo),
I started from the code for the other figures and tried a foreach loop and many other things (I do not remember the error messages, as I did not know I was going to post here), but I still cannot figure out how to tell Stata:
"for each YOBQ[n], compute mean (EDUC) of YOBQ[n-1], YOBQ[n-2], YOBQ[n+1], YOBQ[n+2]". Summing these and dividing by 4 afterwards should be easier.
I have been given an exceptional hint from the teaching assistant: "try the tssmooth command. You will first have to create a time variable for which the egen group command will be very useful." According to my reading about "egen" and "tsset" in the Stata manuals and in the book by Cameron & Trivedi, "Microeconometrics Using Stata" (last link):
http://www.stata.com/manuals14/degen...t=folder%2cdta
http://www.stata.com/manuals14/gsw11.pdf
http://www.stata.com/manuals14/u11.p...Languagesyntax
http://www.stata.com/manuals14/u13.p...itsubscripting
https://onedrive.live.com/redir?resi...int=file%2cpdf
I should tsset the data before using tssmooth, but I did not get past this stage: apparently the notation [n] is not allowed with "egen" (error r(101), "weights not allowed"), and I am still very confused about how to combine egen, tsset and tssmooth.
It would be great if someone could help me with how to solve the "weights not allowed" error and how to combine the commands "egen", "tsset", and "tssmooth".
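For reference, one way the three commands in the hint might fit together is sketched below. It is a minimal sketch under assumptions, not a confirmed solution: it takes EDUC, YOB and QOB as already renamed, assumes the sample is restricted to the cohorts of interest, and the names meaneduc, t, ma_educ and detrended are made up for the example.

```stata
* Sketch only: meaneduc, t, ma_educ and detrended are illustrative names.
preserve

* One row per birth cohort (year x quarter) with its mean education
collapse (mean) meaneduc = EDUC, by(YOB QOB)

* egen group() numbers the cohorts 1, 2, 3, ... in sorted (YOB, QOB) order,
* which gives a gap-free time variable
egen t = group(YOB QOB)
tsset t

* window(2 0 2): two lags, exclude the current period, two leads
tssmooth ma ma_educ = meaneduc, window(2 0 2)

* Deviation of each cohort's mean education from the moving average
gen detrended = meaneduc - ma_educ

list YOB QOB meaneduc ma_educ detrended, sepby(YOB)
restore
```

The values of ma_educ for the first two and last two cohorts should be checked against the tssmooth documentation before use, since the full window of two lags and two leads is not available there.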
Thank you so much!
Postscript: here is the database by the way https://onedrive.live.com/redir?resi...t=folder%2cdta
Note: I have the do-file for the most important other figures and tables of the article, except table I but this file is probably not necessary/just for info:
https://onedrive.live.com/redir?resi...hint=file%2cdo