  • Moving Average: How to tsset and tssmooth?

    Hi,

    It is my first post and I will try to be as clear as possible. The link for the main database is at the end of the post.

    Context
    I am using Stata/SE 12.0 under Windows 10. I started using Stata only a few weeks ago and am trying to learn it on my own for an assignment that is now due in a few days (each table or figure has taken me days and days): replicating the paper "Does Compulsory School Attendance Affect Schooling and Earnings?":

    http://web.stanford.edu/~pista/angrist.pdf
    The paper shows that people born in the last quarters of the year have more education on average than those born in the first quarters, because of compulsory schooling laws. The first figures plot the average number of years of education (variable EDUC) for all people born in a given year (variable YOB, for year of birth) and a given quarter (QOB). There is a general increasing trend and, to detrend the data, the authors use a moving average (figure IV), which is where I have been stuck for the last 5 days.

    Problem
    In the database there are 27 variables, including v4 (renamed EDUC), v27 (renamed YOB, year of birth), and v18 (renamed QOB, quarter of birth). The moving average requires, for every set of people born in year c and quarter j, the average number of years of education not for that year and quarter itself, but for the quarter just before, 2 quarters before, 1 quarter later and 2 quarters later (explained on p. 985 of the paper).

    For example, if I look at the men born between 1930 and 1939 as in this figure (figure IV of the article:

    https://onedrive.live.com/redir?resi...nt=photo%2cpng),
    I need to start with the cohort born in 1930, 3rd quarter and compute the average number of years of education of those born in 1930, 2nd quarter (born one quarter before the given cohort), same for those born in 1930, 1st quarter (born 2 quarters before the given cohort), same for those born in 1930, 4th quarter (one quarter after the given cohort), and same for those born in 1931, 1st quarter (2 quarters after the given cohort). Then the moving average is obtained by adding these 4 values and dividing by 4. This whole process should be repeated for each cohort between 1930, 3rd quarter and 1939, 2nd quarter.
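    In symbols, the calculation described above can be written as follows. The notation is shorthand introduced here for illustration: E with subscript q stands for the mean of EDUC among people born q quarters before (negative q) or after (positive q) the cohort born in year c, quarter j.

```latex
\[
MA_{cj} = \frac{E_{-2} + E_{-1} + E_{+1} + E_{+2}}{4}
\]
% Example for the cohort born in 1930, 3rd quarter:
\[
MA_{1930,3} = \frac{E_{1930\,Q1} + E_{1930\,Q2} + E_{1930\,Q4} + E_{1931\,Q1}}{4}
\]
```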

    Do-File
    For the do-file
    (
    https://onedrive.live.com/redir?resid=6919D329B3BF1EF2!3227&authkey=!AO2cxEN AGpZMgsM&ithint=file%2cdo),
    I started from the model of the other figures and tried a foreach loop and many other things (I do not remember the error messages, as I did not know I was going to post here),
    but I still cannot figure out how to tell Stata:
    "for each YOBQ[n], compute mean (EDUC) of YOBQ[n-1], YOBQ[n-2], YOBQ[n+1], YOBQ[n+2]". Summing these and dividing by 4 afterwards should be easier.

    I have been given an exceptional hint from the teaching assistant: "try the tssmooth command. You will first have to create a time variable for which the egen group command will be very useful." However, according to my reading about "egen" and "tsset" in the Stata manuals and in the book by Cameron & Trivedi, "Microeconometrics Using Stata" (last link):
    http://www.stata.com/manuals14/degen...t=folder%2cdta
    http://www.stata.com/manuals14/gsw11.pdf
    http://www.stata.com/manuals14/u11.p...Languagesyntax
    http://www.stata.com/manuals14/u13.p...itsubscripting
    https://onedrive.live.com/redir?resi...int=file%2cpdf
    I should tsset the data before tssmooth, but I did not get past this stage because, apparently, the notation [n] is not allowed with "egen" (error r(101), "weights not allowed"), and I am still very confused about how to combine egen, tsset and tssmooth.

    It would be great if someone could help me with how to solve the "weights not allowed" error and how to combine the commands "egen", "tsset", and "tssmooth".


    Thank you so much!
    Postscript: here is the database by the way https://onedrive.live.com/redir?resi...t=folder%2cdta
    Note: I have the do-file for the most important other figures and tables of the article, except table I but this file is probably not necessary/just for info:
    https://onedrive.live.com/redir?resi...hint=file%2cdo
    Last edited by Amarylis Durand; 25 Mar 2016, 01:55.

  • #2
    Here is a shorter version of my question: how can I avoid the error below (r(451)), and how can I tell Stata that the moving average of "medstay1" should be calculated for every value of tps?


    /* by YOB QOB = for all those born the same year and the same quarter, sort by increasing years and quarters and calculate the average number of years of education*/
    by YOB QOB , sort: egen medstay1 = mean(EDUC)

    /*generate a new variable YOB_New because the command yq requires the first argument to be between 1000 and 9999 and our data for YOB in the 1980 Census is between 30 and 49 instead
    of 1930 and 1949*/

    gen YOB_New=YOB
    replace YOB_New = YOB+1900 if CENSUS==80

    /*generate a time variable that has the format required in help tsset */
    gen tps=yq(YOB_New,QOB)
    format %tq tps

    /*the following instruction returns r(451): repeated time values in panel, probably because there are thousands of people born in the same year and the same quarter, all with the same average number of years of education. How to avoid this error? */
    tsset medstay1 tps

    /*instruction to compute the moving average MA; I want the moving average of medstay1 to be calculated for every value of the time variable "tps". What command would allow me to do this, or is it done automatically?*/
    tssmooth ma MA = medstay1, window (2 0 2)

    I hope someone can help.

    Thanks a lot!



    • #3
      medstay1 is a variable containing means (a strange choice of name, but that's secondary). In contrast, if you feed tsset two variables, the first should always be a panel identifier. A mean won't be suitable.

      Last edited by Nick Cox; 26 Mar 2016, 12:47.
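One way to act on the point made in #3 is sketched below. It is only a sketch under stated assumptions: the variables EDUC and tps are taken to exist exactly as created in #2, the sample is taken to be already restricted to the cohorts of interest, and the names meaneduc and MA_check are hypothetical, introduced here for illustration. The idea is to reduce the data to one observation per birth quarter, so that tps on its own identifies each row and no panel variable is needed.

```stata
* Assumes EDUC and tps (quarterly date, %tq format) exist as in #2,
* and that the sample is already restricted to the cohorts of interest.
preserve

* One row per birth quarter: mean years of education in that quarter.
* meaneduc is a hypothetical name chosen for this sketch.
collapse (mean) meaneduc = EDUC, by(tps)

* With one observation per quarter, tps is a valid time variable
* by itself, so tsset takes a single argument (no panel identifier).
tsset tps

* Moving average over 2 lags and 2 leads, excluding the current quarter.
tssmooth ma MA = meaneduc, window(2 0 2)

* Cross-check with time-series operators: the same four-term average
* written out explicitly. This is missing wherever any neighbour is missing.
gen MA_check = (L2.meaneduc + L1.meaneduc + F1.meaneduc + F2.meaneduc) / 4

list tps meaneduc MA MA_check

restore
```

The two results should agree for interior quarters. They can differ at the start and end of the series, where the explicit formula returns missing; the tssmooth ma documentation describes how that command treats the endpoints.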



      • #4
        Thank you for your answer Nick, I am still trying to understand and apply it.
