  • How to find lag of a variable in long format panel dataset?

    I have a dataset like this:
    User  Date        Value
    A     2012-01-01  4
    A     2012-01-02  5
    A     2012-01-03  6
    A     2012-01-04  7
    B     2012-01-01  2
    B     2012-01-02  3
    B     2012-01-03  4
    B     2012-01-04  5
    I want to create a lag of Value within each User:
    User  Date        Value  Value.lag
    A     2012-01-01  4      NA
    A     2012-01-02  5      4
    A     2012-01-03  6      5
    A     2012-01-04  7      6
    B     2012-01-01  2      NA
    B     2012-01-02  3      2
    B     2012-01-03  4      3
    B     2012-01-04  5      4
    How can I do this in Stata?
    Last edited by DILAWAR AHMAD BHAT; 21 Aug 2019, 11:42.

  • #2
    Well, you cannot call the variable Value.lag because only letters, numbers, and the underscore (_) character are legal in Stata variable names. That detail aside, Stata makes working with lags very easy. First you need to create a numeric panel identifier, and then declare your data to be panel data. Then you just apply the lag operator (see -help tsvarlist- for more information about time-series operators).

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 user float date byte value
    "A" 18993 4
    "A" 18994 5
    "A" 18995 6
    "A" 18996 7
    "B" 18993 2
    "B" 18994 3
    "B" 18995 4
    "B" 18996 5
    end
    format %td date
    
    egen n_user = group(user)
    xtset n_user date
    gen lag_value = L1.value
    
    list, noobs sepby(user)
    Note that for most purposes you do not need to create this variable at all. If you want to run a regression in which lag_value would have been one of the variables, you can use L1.value directly in the regression command. Most other Stata commands accept time-series operators in the same way.
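
As a minimal sketch of using the lag operator directly, the block below rebuilds the example data and passes L1.value straight to a few commands. The choice of -regress-, -summarize-, and -list- is illustrative only and not part of the answer above; with eight observations the regression output is meaningless beyond showing that the syntax is accepted.

```stata
* Rebuild the example data so this sketch runs on its own.
clear
input str1 user float date byte value
"A" 18993 4
"A" 18994 5
"A" 18995 6
"A" 18996 7
"B" 18993 2
"B" 18994 3
"B" 18995 4
"B" 18996 5
end
format %td date

* Numeric panel identifier, then declare the panel structure.
egen n_user = group(user)
xtset n_user date

* No lag_value variable is created: the operator is applied on the fly.
regress value L1.value
summarize L1.value
list user date value L1.value, noobs sepby(user)
```

The first observation of each user has no predecessor, so L1.value is missing there and that row drops out of the regression.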

    In the future, when showing example data, please use the -dataex- command, as I have done here. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
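
A minimal sketch of a -dataex- call, assuming -dataex- is installed as described above. The auto data set shipped with Stata is used only so the example runs on its own; the variable list and observation range are arbitrary choices for illustration.

```stata
* Load a data set that ships with Stata.
sysuse auto, clear

* Produce a postable example of three variables, first 10 observations.
dataex make price mpg in 1/10
```

The output in the Results window can then be copied and pasted into a forum post as is.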
    Last edited by Clyde Schechter; 21 Aug 2019, 11:44.



    • #3
      Thanks, Clyde. What I actually want is a new variable, EARNINGS_VARIABILITY = (EARNINGS - LAG1_of_EARNINGS) / LAG1_of_EARNINGS. My data are already in long panel format, so I needed code to do this. The code you provided above works perfectly for creating LAG1_of_EARNINGS for each company. However, I am not sure whether I have to manually type all the company names in place of "A" "A" "A"; "B" "B" "B"... I have 2921 companies in my sample.



      • #4
        Actually I want to create a new variable like EARNINGS_VARIABILITY=(EARNINGS-LAG1_of_EARNINGS)/LAG1_of_EARNINGS. My data is already in long panel format so needed a code to do this.
        So you don't need the LAG1_of_EARNINGS variable. All you need is:
        Code:
        egen n_user = group(user)
        xtset n_user date
        gen earnings_variability = (earnings-L1.earnings)/L1.earnings
        I am not sure if I have to manually write all company names in place of "A" "A" "A"; "B" "B" "B"... I have 2921 companies in my sample
        I don't know what you're talking about. Don't you already have a data set with all of this in it? I think you do not understand the use of -dataex- here. The part of the code that says:
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str1 user float date byte value
        "A" 18993 4
        "A" 18994 5
        "A" 18995 6
        "A" 18996 7
        "B" 18993 2
        "B" 18994 3
        "B" 18995 4
        "B" 18996 5
        end
        format %td date
        is not something you need to run. That is just a way for us here on Statalist to put some example data into memory. Instead of that part of the code, you just -use- your actual data set. Then you run the code that comes after that part on your actual data set.
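
A minimal sketch of that workflow follows. The names company, year, and earnings, the file name my_panel.dta, and the typed-in numbers are all hypothetical stand-ins; with real data, the -input- block would be replaced by the commented-out -use- line. It also assumes the time variable is measured in the same unit as the spacing of the observations (here, years), because L1. refers to the observation exactly one time unit earlier.

```stata
* With real data, load the file instead of typing values in:
*   use "my_panel.dta", clear      (hypothetical file name)
* The stand-in data below exist only so the sketch runs on its own.
clear
input str6 company int year double earnings
"Alpha" 2016 100
"Alpha" 2017 110
"Alpha" 2018 121
"Beta"  2016 50
"Beta"  2017 40
"Beta"  2018 44
end

* One numeric id per company, however many companies there are.
egen n_company = group(company)
xtset n_company year

* Relative change from the previous period, computed within company.
gen earnings_variability = (earnings - L1.earnings) / L1.earnings

list company year earnings earnings_variability, noobs sepby(company)
```

The result is missing for the first period of each company, and also wherever the previous period is absent from the data or its earnings are zero.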



        • #5
          Thank you so much, Clyde Schechter. You are one of the most valuable contributors to this forum.
