
  • Dropping Variables Based on Other Variable, Student Project

    Hey there,

    I'm a student working with Stata, and my data set has a variable, gdplag, that holds each country's GDP from the previous year. However, I need a variable that holds the current year's GDP. If I run "by country: gen GDP = gdplag[_n+1]" and then "drop if GDP == .", it successfully shifts every gdplag observation by one year to give me GDP, and drops the observations that have no following gdplag. So I'm almost there.

    The problem is that some of the years in the data set are nonconsecutive. For example, I have the years 1993, 1994, 1996, and 1997. Years 1993 and 1996 end up with accurate GDPs because I can obtain them from the gdplag of 1994 and 1997 respectively. Further, the GDP observation for 1997 is dropped since I don't have gdplag for 1998. However, the gdplag for 1996 becomes the GDP for 1994 when it is actually the GDP for 1995.

    Is it possible to make sure that gdplag is moved to the appropriate year? This is my best guess, but I get an invalid syntax (r198) error: "by country: gen GDP = gdplag[_n+1] if year[_n] - year[_n-1] = 1"

  • #2
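    Your last approach is close. The syntax error is because the = just before the final 1 needs to be ==. Note also that, because the value is pulled from [_n+1], the gap check has to look forward rather than backward: the condition should be year[_n+1] - year[_n] == 1, and the data must be sorted by year within country.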
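
    For illustration, here is a minimal sketch of the hand-coded version on a toy data set. The country code "AAA" and the gdplag values are made up for the example; only the years come from the original question.

    Code:
    clear
    * Hypothetical toy data: one country with a gap at 1995
    input str3 country year gdplag
    "AAA" 1993 100
    "AAA" 1994 110
    "AAA" 1996 130
    "AAA" 1997 140
    end
    * Sort by year within country, then pull the next observation's gdplag
    * only when that observation is exactly one year ahead
    bysort country (year): gen GDP = gdplag[_n+1] if year[_n+1] - year == 1
    list, sepby(country)
    Here 1993 and 1996 pick up the gdplag of 1994 and 1997, while 1994 and 1997 are left missing because no observation exists for 1995 or 1998.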

    That said, there is a much better way to do this.

    Code:
    xtset country year
    gen GDP = F1.gdplag
    Read -help tsvarlist- for more details on time-series operators. The point is that the F1 operator will take care of all the details, including assurance that the correct (consecutive) subsequent year is used. Using the time series operator instead of hand-coding the subscripts is safer because it will not allow you to forget to include corrections for skipped years; it handles them automatically for you.
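
    To see the operator handle a gap, here is a small sketch on made-up data. The country codes, the gdplag values, and the name country_id are hypothetical. It also assumes country is stored as a string, in which case -xtset- needs a numeric identifier, hence the -encode- step; skip that step if your country variable is already numeric.

    Code:
    clear
    * Hypothetical toy panel: both countries have a skipped year
    input str3 country year gdplag
    "AAA" 1993 100
    "AAA" 1994 110
    "AAA" 1996 130
    "AAA" 1997 140
    "BBB" 2000 50
    "BBB" 2001 55
    "BBB" 2003 70
    end
    * xtset requires a numeric panel variable
    encode country, gen(country_id)
    xtset country_id year
    * F1. looks up the observation exactly one year ahead, or returns missing
    gen GDP = F1.gdplag
    list country year gdplag GDP, sepby(country)
    In this sketch GDP is filled only for 1993, 1996, and 2000; every year whose successor is absent from the data gets a missing value rather than a value from the wrong year.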

    Added: Depending on what you want to do with this new variable, you may not need to create it at all. If all you want to do is use it in regressions, you don't need a new variable; you can just:

    Code:
    regression_command dep_var ... F1.gdplag...
    All official Stata estimation commands support this, and a few non-estimation commands do as well. With user-written commands it is more variable.
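
    As a rough sketch of the two routes giving the same estimates, the following uses simulated data. The variable names y and country_id, the seed, and the panel dimensions are all made up for the illustration, and -regress- simply stands in for whatever estimation command you actually need.

    Code:
    clear
    set seed 12345
    * Simulated panel: 10 hypothetical countries, 20 years each
    set obs 200
    gen country_id = ceil(_n/20)
    bysort country_id: gen year = 1990 + _n
    gen gdplag = rnormal(100, 10)
    gen y = rnormal()
    xtset country_id year
    * Route 1: use the lead operator directly in the command
    regress y F1.gdplag
    scalar b_operator = _b[F1.gdplag]
    * Route 2: create the variable first, then use it
    gen GDP = F1.gdplag
    regress y GDP
    display b_operator - _b[GDP]
    The displayed difference should be zero, since both commands use the same observations and the same regressor.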
    Last edited by Clyde Schechter; 21 Nov 2021, 12:36.


    • #3
      Clyde Schechter Thank you so much for that!
