Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to generate each year's data from a time period?

    Dear fellow stata users,

    Hello!

    I've downloaded the data about S&P 500 constituents from wrds, the variables "from" and "thru" describes the beginning and ending time of when the firm is in S&P 500.
    But how can I find out that through the time period of 1996 to 2014, of every specific year which firms are in the S&P 500?
    (to generate the yearly S&P 500 constituents)

    Here's part of my data

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 gvkey str4 iid str6 gvkeyx long(from thru) str58 conm str10 indextype str8(tic spii spmi) str10 indexcat str58 co_conm str8 co_tic str10(co_cusip co_cik) str4 co_sic str6 co_naics
    "001078" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "ABBOTT LABORATORIES"         "ABT" "002824100" "0000001800" "2834" "325412"
    "001300" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "HONEYWELL INTERNATIONAL INC" "HON" "438516106" "0000773840" "9997" "336413"
    "001440" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "AMERICAN ELECTRIC POWER CO"  "AEP" "025537101" "0000004904" "4911" "2211"  
    "002285" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "BOEING CO"                   "BA"  "097023105" "0000012927" "3721" "336411"
    "002403" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "BRISTOL-MYERS SQUIBB CO"     "BMY" "110122108" "0000014272" "2834" "325412"
    "002663" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "CAMPBELL SOUP CO"            "CPB" "134429109" "0000016732" "2030" "311422"
    "002817" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "CATERPILLAR INC"             "CAT" "149123101" "0000018230" "3531" "333120"
    "002991" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "CHEVRON CORP"                "CVX" "166764100" "0000093410" "2911" "324110"
    "003144" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "COCA-COLA CO"                "KO"  "191216100" "0000021344" "2086" "312111"
    "003170" "01" "000003" 1551 . "S&P 500 Comp-Ltd" "LGCAP" "I0003" "" "10" "S&P" "COLGATE-PALMOLIVE CO"        "CL"  "194162103" "0000021665" "2844" "325611"
    end
    format %d from
    format %d thru






    Thanks in advance.

    Best,

    Noah




    Last edited by Noah Liu; 22 Jul 2019, 11:39.

  • #2
    Code:
    * Some slightly simplified data example with companies that are in/out of the SP500 for the time range:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 gvkey str4 iid str6 gvkeyx long(from thru) str58 co_conm
    "001078" "01" "000003" 1551     . "ABBOTT LABORATORIES"        
    "001300" "01" "000003" 1551 15000 "HONEYWELL INTERNATIONAL INC"
    "001440" "01" "000003" 1551 18000 "AMERICAN ELECTRIC POWER CO" 
    "002285" "01" "000003" 1551 20000 "BOEING CO"                  
    end
    format %d from
    format %d thru
    
    * Code
    foreach year_no of numlist 1996(1)2014{
    gen sp500_mem_`year_no'=0
    replace sp500_mem_`year_no'=1 if year(from)<=`year_no' & year(thru)>=`year_no'
    }

    Comment


    • #3
      Originally posted by Jorrit Gosens View Post
      Code:
      * Some slightly simplified data example with companies that are in/out of the SP500 for the time range:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str6 gvkey str4 iid str6 gvkeyx long(from thru) str58 co_conm
      "001078" "01" "000003" 1551 . "ABBOTT LABORATORIES"
      "001300" "01" "000003" 1551 15000 "HONEYWELL INTERNATIONAL INC"
      "001440" "01" "000003" 1551 18000 "AMERICAN ELECTRIC POWER CO"
      "002285" "01" "000003" 1551 20000 "BOEING CO"
      end
      format %d from
      format %d thru
      
      * Code
      foreach year_no of numlist 1996(1)2014{
      gen sp500_mem_`year_no'=0
      replace sp500_mem_`year_no'=1 if year(from)&lt;=`year_no' &amp; year(thru)&gt;=`year_no'
      }
      thank you for your help!
      now I can identify each year's S&amp;P500 firms, but how can I list the firms by each year?
      I tried the reshape command but stata reports there are too many variables.

      Comment


      • #4
        It depends what you mean with 'list'. Do you just want to browse through them, or do you want some list in the output screen, or?

        options include:
        Code:
        list co_conm if list co_conm if sp500_mem_1996==1
        Or:
        Code:
        browse if list co_conm if sp500_mem_1996==1
        Or if you feel like reshaping:
        Code:
        reshape long sp500_mem_, i(co_conm) j(year)
        browse if sp500_mem_==1
        Whether this reshape is a good idea for your further analysis depends on what sort of analysis you plan to do with this data.

        Comment


        • #5
          thank you for your help!
          these codes saved my life
          I've solved my problems now

          Comment

          Working...
          X