Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Panel data problem: trying to solve "repeated time values within panel" by turning from monthly to annual observations.

    Hey everyone, I am a 3rd year Business Economics student working on my thesis regarding firms' CO2 emissions, and I could really use your help. I am using Stata 16.1 on MacOS 10.14.

    I need to merge 3 datasets together: 1 dataset with firm financials, 1 dataset with CEO characteristics, and 1 dataset with sustainability scores of different firms. The sustainability dataset is taken from the Sustainalytics database, and it includes 59 environmental categories (variables starting with the letter "E") alongside other governmental and social categories, that together make up the "ESG" score for each firm. My sample contains data between 2009-2017, and the raw data contains approximately 150,000 firm-year observations.

    Here's the problem: Two other two datasets have annual observations, but the Sustainalytics dataset contains multiple observations per year for each firm. My goal is to take the averages of these observations and turn them into one annual average, so that I can merge the dataset with the other two.

    In the dataex example below, I have removed most of the environmental and governance score variables for simplicity, but my goal is to eventually calculate the overall scores by taking the mean of these variables.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(Date firmid) str26 Ticker double(E_1_1 E_1_2 G_1_1 G_1_3)
    21094 1 "SLB"  50 60 100 0
    21124 1 "SLB"  50 60 100 0
    21159 1 "SLB"  50 60 100 0
    21193 1 "SLB"  50 60 100 0
    21216 1 "SLB"  50 60 100 0
    21244 1 "SLB"  50 60 100 0
    21276 1 "SLB" 100 80 100 0
    21306 1 "SLB" 100 80 100 0
    21339 1 "SLB" 100 80 100 0
    21368 1 "SLB" 100 80 100 0
    end
    format %td Date
    label values firmid firmid
    label def firmid 1 "AN8068571086", modify

    I have tried to use the collapse command as suggested in other posts about panel data, but every time I've tried, I've ended up dropping most of the important data.
    Thank you in advance!

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(Date firmid) str26 Ticker double(E_1_1 E_1_2 G_1_1 G_1_3)
    21094 1 "SLB"  50 60 100 0
    21124 1 "SLB"  50 60 100 0
    21159 1 "SLB"  50 60 100 0
    21193 1 "SLB"  50 60 100 0
    21216 1 "SLB"  50 60 100 0
    21244 1 "SLB"  50 60 100 0
    21276 1 "SLB" 100 80 100 0
    21306 1 "SLB" 100 80 100 0
    21339 1 "SLB" 100 80 100 0
    21368 1 "SLB" 100 80 100 0
    end
    format %td Date
    label values firmid firmid
    label def firmid 1 "AN8068571086", modify
    
    gen year= year(Date)
    collapse E_* G_*, by(firmid Ticker year)
    xtset firmid year
    Res.:

    Code:
    . xtset firmid year
           panel variable:  firmid (strongly balanced)
            time variable:  year, 2017 to 2018
                    delta:  1 unit
    
    . l
    
         +----------------------------------------------------------------------+
         |       firmid   Ticker   year       E_1_1       E_1_2   G_1_1   G_1_3 |
         |----------------------------------------------------------------------|
      1. | AN8068571086      SLB   2017          50          60     100       0 |
      2. | AN8068571086      SLB   2018   78.571429   71.428571     100       0 |
         +----------------------------------------------------------------------+

    Comment


    • #3
      Sonja:
      welcome to this forum.
      Do you mean something along the following lines?
      Code:
      . g year_Date=year( Date )
      
      . collapse (mean) E_1_1 E_1_2 G_1_1 G_1_3 , by(firmid year_Date )
      
      . list
      
           +-----------------------------------------------------------------+
           |       firmid   year_D~e       E_1_1       E_1_2   G_1_1   G_1_3 |
           |-----------------------------------------------------------------|
        1. | AN8068571086       2017          50          60     100       0 |
        2. | AN8068571086       2018   78.571429   71.428571     100       0 |
           +-----------------------------------------------------------------+
      
      .
      PS: Crossed in the cyberspace with Andrew's helpful (and more econometrically spiced ) reply!
      Last edited by Carlo Lazzaro; 13 May 2022, 05:13.
      Kind regards,
      Carlo
      (Stata 19.0)

      Comment


      • #4
        Thank you so much for your help Andrew and Carlo! I tried Carlos' code first, because it was closest to what I had tried earlier, and it worked exactly how I wanted!

        Comment

        Working...
        X