Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Lead and Lag variables in Panel data

    Hello everyone,

    I'm trying to create lead and lag variables in an (unbalanced) panel data. What I want to do is quite straightforward, however, I can't seem to find the solution in Stata.

    My unbalanced panel data is the following
    Code:
    xtdes
    
          No:  1, 4, ..., 1109                                   n =        720
        Year:  2013, 2014, ..., 2019                             T =          7
               Delta(Year) = 1 unit
               Span(Year)  = 7 periods
               (No*Year uniquely identifies each observation)
    
    Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                             1       1       2         3         6       7       7
    
         Freq.  Percent    Cum. |  Pattern
     ---------------------------+---------
          163     22.64   22.64 |  ......1
          146     20.28   42.92 |  1111111
          110     15.28   58.19 |  .....11
           95     13.19   71.39 |  ....111
           79     10.97   82.36 |  ...1111
           67      9.31   91.67 |  ..11111
           60      8.33  100.00 |  .111111
     ---------------------------+---------
          720    100.00         |  XXXXXXX
    I have the variable Revenue and I would like to have Revenue x years (let's say 2 years) after a certain event (let's say when the company acquires its own facilities).
    For the firm #4 for example, which experienced the event in 2014, I want the Revenue 2 years after that (in 2016) which is 178057.

    Code:
     list No Year EventYear Revenue in f/10
    
         +---------------------------------+
         | No   Year   EventYear   Revenue |
         |---------------------------------|
      1. |  1   2015       2015          . |
      2. |  1   2016       2015          . |
      3. |  1   2017       2015          . |
      4. |  1   2018       2015          . |
      5. |  1   2019       2015     144851 |
         |---------------------------------|
      6. |  4   2014       2014     213553 |
      7. |  4   2015       2014          . |
      8. |  4   2016       2014     178057 |
      9. |  4   2017       2014     177342 |
     10. |  4   2018       2014          . |
         +---------------------------------+
    My concern is that I don't know how to create the lead and the lags since my time variable of reference is the EventYear while my panel data is set up on the Year variable (not the EventYear)

    Code:
    xtset No Year
           panel variable:  No (unbalanced)
            time variable:  Year, 2013 to 2019
                    delta:  1 unit
    I am actually looking for an equivalent to the vlookup command in Excel - but that would also work with missing values in Stata.

    Would you have any advice on how to go about this?
    Thank you so much in advance!

    Jeanne
    I am using Stata 16.1 for Mac

  • #2
    So you are looking for the revenue two years prior to the event year. That would be:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id int(year event_year) long revenue
    1 2015 2015      .
    1 2016 2015      .
    1 2017 2015      .
    1 2018 2015      .
    1 2019 2015 144851
    4 2014 2014 213553
    4 2015 2014      .
    4 2016 2014 178057
    4 2017 2014 177342
    4 2018 2014      .
    end
    
    by id, sort: egen revenue_2_yr_b4_event_year = max(cond(year == event_year-2), revenue, .)
    Note: The example data you show does not actually instantiate any data two years before the event year, so the code has not been tested with an affirmative example, but I'm confident it works.

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Let me build on Clyde's work. You say you want a variable that gives revenue 2 years after the event year. Clyde assumes you want that value in each observation for the company (just like event_year). I'm thinking you may need it just in the event year. The example below includes Clyde's calculation, changing it to calculate 2 years after the event year, and my version.
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte no int(year event_year) long revenue
      1 2015 2015      .
      1 2016 2015      .
      1 2017 2015      .
      1 2018 2015      .
      1 2019 2015 144851
      4 2014 2014 213553
      4 2015 2014      .
      4 2016 2014 178057
      4 2017 2014 177342
      4 2018 2014      .
      end
      
      xtset no year
      by no (year): egen rev2yr_c = max(cond(year == event_year+2), revenue, .)
      generate           rev2yr_w = F2.revenue if year==event_year
      list, sepby(no)
      Code:
      . list, sepby(no)
      
           +------------------------------------------------------+
           | no   year   event_~r   revenue   rev2yr_c   rev2yr_w |
           |------------------------------------------------------|
        1. |  1   2015       2015         .          .          . |
        2. |  1   2016       2015         .          .          . |
        3. |  1   2017       2015         .          .          . |
        4. |  1   2018       2015         .          .          . |
        5. |  1   2019       2015    144851          .          . |
           |------------------------------------------------------|
        6. |  4   2014       2014    213553     178057     178057 |
        7. |  4   2015       2014         .     178057          . |
        8. |  4   2016       2014    178057     178057          . |
        9. |  4   2017       2014    177342     178057          . |
       10. |  4   2018       2014         .     178057          . |
           +------------------------------------------------------+
      Last edited by William Lisowski; 23 Nov 2021, 12:34.

      Comment


      • #4
        Dear Clyde and William,

        Thank you so much for your help! I implemented William's code and it is working perfectly. This is exactly what I needed.
        Thank you so much for your help and your prompt response!

        Best regards,
        Jeanne

        PS: I'll make sure to use the -dataex- command next time!

        Comment

        Working...
        X