Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Transform wide Datastream format to long

    Dear all,

    For the purpose of my research in which I wish to analyze the effect of buy/sell recommendations on approximately 100 firms, I downloaded the daily opening and closing prices, the market capitalization and the price-to-book values of said firms from 01/01/2004 until 01/01/2021 using Refinitiv Datastream. This has provided me with the following dataset:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str35 Name str12 Code str17(C D FNR FNS)
    "ING GROEP"                    "H:INGA(P)"    "14.1981"           "14.3977"           "7.641"             "7.641"            
    "ING GROEP"                    "H:INGA(MV)"   "38120.33"          "38656.38"          "29804.99"          "29804.99"         
    "ING GROEP"                    "H:INGA(PO)"   "NA"                "14.2672"           "7.611"             "NA"               
    "ING GROEP"                    "H:INGA(PTBV)" "1.56"              "1.58"              ".55"               ".55"              
    "ARCELORMITTAL"                "H:MT(P)"      "15.6902"           "15.9112"           "18.88"             "18.88"            
    "ARCELORMITTAL"                "H:MT(MV)"     "389.43"            "394.92"            "20821.07"          "20821.07"         
    "ARCELORMITTAL"                "H:MT(PO)"     "NA"                "15.9112"           "19.02"             "NA"               
    "ARCELORMITTAL"                "H:MT(PTBV)"   "1.06"              "1.07"              ".67"               ".67"              
    "ASML HOLDING"                 "H:ASML(P)"    "16.6397"           "17.296"            "397.55"            "397.55"           
    "ASML HOLDING"                 "H:ASML(MV)"   "7579.97"           "7878.93"           "166771.2"          "166771.2"         
    "ASML HOLDING"                 "H:ASML(PO)"   "NA"                "16.9255"           "397.8"             "NA"               
    "ASML HOLDING"                 "H:ASML(PTBV)" "5.46"              "5.68"              "11.94"             "11.94"            
    "BE SEMICONDUCTOR INDUSTRIES"  "H:BESI(P)"    "3.1301"            "3.1161"            "49.58"             "49.58"            
    "BE SEMICONDUCTOR INDUSTRIES"  "H:BESI(MV)"   "213.66"            "212.7"             "3895.39"           "3895.39"          
    "BE SEMICONDUCTOR INDUSTRIES"  "H:BESI(PO)"   "NA"                "3.1208"            "49.37"             "NA"               
    "BE SEMICONDUCTOR INDUSTRIES"  "H:BESI(PTBV)" "1.17"              "1.16"              "9.73"              "9.73"             
    "DELTA LLOYD GROUP DEAD"       "H:DL(P)"      "NA"                "NA"                "5.273"             "5.273"            
    "DELTA LLOYD GROUP DEAD"       "H:DL(MV)"     "NA"                "NA"                "2432.6"            "2432.6"           
    "DELTA LLOYD GROUP DEAD"       "H:DL(PO)"     "NA"                "NA"                "NA"                "NA"               
    "DELTA LLOYD GROUP DEAD"       "H:DL(PTBV)"   "NA"                "NA"                ".75"               ".75"              
    "AKZO NOBEL"                   "H:AKZA(P)"    "30.8226"           "30.9132"           "87.86"             "87.86"            
    "AKZO NOBEL"                   "H:AKZA(MV)"   "8756.1"            "8781.85"           "16953.69"          "16953.69"         
    "AKZO NOBEL"                   "H:AKZA(PO)"   "NA"                "30.7722"           "88.02"             "NA"               
    "AKZO NOBEL"                   "H:AKZA(PTBV)" "2.88"              "2.89"              "2.9"               "2.9"              
    "ROYAL DUTCH SHELL A"          "H:RDSA(P)"    "20.9"              "21"                "14.614"            "14.614"           
    "ROYAL DUTCH SHELL A"          "H:RDSA(MV)"   "87090.25"          "87506.94"          "59935.46"          "59935.46"         
    "ROYAL DUTCH SHELL A"          "H:RDSA(PO)"   "NA"                "20.95"             "14.682"            "NA"               
    "ROYAL DUTCH SHELL A"          "H:RDSA(PTBV)" "2.29"              "2.3"               ".9"                ".9"               
    "SBM OFFSHORE"                 "H:SBMO(P)"    "10.2505"           "10.3983"           "15.565"            "15.565"           
    "SBM OFFSHORE"                 "H:SBMO(MV)"   "1370.34"           "1390.1"            "2936.67"           "2936.67"          
    "SBM OFFSHORE"                 "H:SBMO(PO)"   "NA"                "10.2505"           "15.35"             "NA"               
    "SBM OFFSHORE"                 "H:SBMO(PTBV)" "2.56"              "2.6"               "1.4"               "1.4"              
    "TOM TOM"                      "H:TOM(P)"     "NA"                "NA"                "8.44"              "8.44"             
    "TOM TOM"                      "H:TOM(MV)"    "NA"                "NA"                "1117.17"           "1117.17"          
    "TOM TOM"                      "H:TOM(PO)"    "NA"                "NA"                "8.49"              "NA"               
    "TOM TOM"                      "H:TOM(PTBV)"  "NA"                "NA"                "2.84"              "2.84"             
    "KONINKLIJKE AHOLD DELHAIZE"   "H:AD(P)"      "6.0519"            "6.0119"            "23.11"             "23.11"            
    "KONINKLIJKE AHOLD DELHAIZE"   "H:AD(MV)"     "9377.719999999999" "9315.620000000001" "25437.75"          "25437.75"         
    "KONINKLIJKE AHOLD DELHAIZE"   "H:AD(PO)"     "NA"                "6.062"             "23.35"             "NA"               
    "KONINKLIJKE AHOLD DELHAIZE"   "H:AD(PTBV)"   "2.08"              "2.07"              "1.95"              "1.95"             
    "GRANDVISION"                  "H:GVNV(P)"    "NA"                "NA"                "25.5"              "25.5"             
    "GRANDVISION"                  "H:GVNV(MV)"   "NA"                "NA"                "6488.31"           "6488.31"          
    "GRANDVISION"                  "H:GVNV(PO)"   "NA"                "NA"                "25.8"              "NA"               
    "GRANDVISION"                  "H:GVNV(PTBV)" "NA"                "NA"                "6.11"              "6.11"             
    "HAL TRUST"                    "H:HAT(P)"     "19.4082"           "20.1375"           "117"               "117"              
    "HAL TRUST"                    "H:HAT(MV)"    "1525.3"            "1582.62"           "9990.049999999999" "9990.049999999999"
    "HAL TRUST"                    "H:HAT(PO)"    "NA"                "19.2056"           "117.2"             "NA"               
    "HAL TRUST"                    "H:HAT(PTBV)"  ".9"                ".93"               "1.25"              "1.25"             
    "POSTNL"                       "H:PNL(P)"     "7.3434"            "7.4027"            "2.79"              "2.79"             
    "POSTNL"                       "H:PNL(MV)"    "8918.42"           "8990.459999999999" "1381.03"           "1381.03"          
    "POSTNL"                       "H:PNL(PO)"    "NA"                "7.3553"            "2.78"              "NA"               
    "POSTNL"                       "H:PNL(PTBV)"  "3.14"              "3.17"              "6.28"              "6.28"             
    "AIR FRANCE-KLM"               "F:UTA(P)"     "11.8219"           "12.0944"           "4.9818"            "4.9818"           
    "AIR FRANCE-KLM"               "F:UTA(MV)"    "2670.34"           "2731.88"           "2194.61"           "2194.61"          
    "AIR FRANCE-KLM"               "F:UTA(PO)"    "NA"                "11.8511"           "5.0421"            "NA"               
    "AIR FRANCE-KLM"               "F:UTA(PTBV)"  ".65"               ".67"               "-.4"               "-.4"              
    "AMG ADVD.METALLURGICAL GROUP" "H:AMG(P)"     "NA"                "NA"                "24.46"             "24.46"            
    "AMG ADVD.METALLURGICAL GROUP" "H:AMG(MV)"    "NA"                "NA"                "767.25"            "767.25"           
    "AMG ADVD.METALLURGICAL GROUP" "H:AMG(PO)"    "NA"                "NA"                "24.81"             "NA"               
    "AMG ADVD.METALLURGICAL GROUP" "H:AMG(PTBV)"  "NA"                "NA"                "7.36"              "7.36"             
    "BOSKALIS WESTMINSTER"         "H:BOSK(P)"    "5.1229"            "5.0837"            "22.54"             "22.54"            
    "BOSKALIS WESTMINSTER"         "H:BOSK(MV)"   "553.42"            "549.1900000000001" "2936.46"           "2936.46"          
    "BOSKALIS WESTMINSTER"         "H:BOSK(PO)"   "NA"                "5.1229"            "22.38"             "NA"               
    "BOSKALIS WESTMINSTER"         "H:BOSK(PTBV)" "1.27"              "1.26"              "1.29"              "1.29"             
    "FUGRO"                        "H:FUR(P)"     "13.1699"           "13.1538"           "7.6"               "7.6"              
    "FUGRO"                        "H:FUR(MV)"    "577.91"            "577.2"             "784.25"            "784.25"           
    "FUGRO"                        "H:FUR(PO)"    "NA"                "13.1732"           "7.595"             "NA"               
    "FUGRO"                        "H:FUR(PTBV)"  "2.76"              "2.75"              "1.15"              "1.15"             
    "HEIJMANS"                     "H:HEIJ(P)"    "51.2268"           "51.9777"           "9.33"              "9.33"             
    "HEIJMANS"                     "H:HEIJ(MV)"   "428.57"            "434.85"            "204.64"            "204.64"           
    "HEIJMANS"                     "H:HEIJ(PO)"   "NA"                "51.2268"           "9.25"              "NA"               
    "HEIJMANS"                     "H:HEIJ(PTBV)" ".9399999999999999" ".95"               ".9"                ".9"               
    "ABN AMRO BANK"                "H:ABN(P)"     "NA"                "NA"                "8.02"              "8.02"             
    "ABN AMRO BANK"                "H:ABN(MV)"    "NA"                "NA"                "7538.8"            "7538.8"           
    "ABN AMRO BANK"                "H:ABN(PO)"    "NA"                "NA"                "7.94"              "NA"               
    "ABN AMRO BANK"                "H:ABN(PTBV)"  "NA"                "NA"                ".4"                ".4"               
    "DSM KONINKLIJKE"              "H:DSM(P)"     "19.515"            "19.645"            "140.8"             "140.8"            
    "DSM KONINKLIJKE"              "H:DSM(MV)"    "3940.47"           "3966.72"           "25544.63"          "25544.63"         
    "DSM KONINKLIJKE"              "H:DSM(PO)"    "NA"                "19.515"            "142.95"            "NA"               
    "DSM KONINKLIJKE"              "H:DSM(PTBV)"  ".79"               ".79"               "3.28"              "3.28"             
    "AEGON"                        "H:AGN(P)"     "10.6679"           "10.8134"           "3.235"             "3.235"            
    "AEGON"                        "H:AGN(MV)"    "17609.29"          "17849.49"          "6810.12"           "6810.12"          
    "AEGON"                        "H:AGN(PO)"    "NA"                "10.7315"           "3.2"               "NA"               
    "AEGON"                        "H:AGN(PTBV)"  "1.25"              "1.27"              ".29"               ".29"              
    "UNILEVER DUTCH CERT. "        "H:UNIL(P)"    "17.2831"           "17.2498"           "53.56"             "53.56"            
    "UNILEVER DUTCH CERT. "        "H:UNIL(MV)"   "29636.21"          "29579.05"          "91840.63"          "91840.63"         
    "UNILEVER DUTCH CERT. "        "H:UNIL(PO)"   "NA"                "17.2332"           "NA"                "NA"               
    "BRUNEL INTL."                 "H:BRU(P)"     "2.45"              "2.5"               "7.3"               "7.3"              
    "BRUNEL INTL."                 "H:BRU(MV)"    "110.99"            "113.25"            "369.19"            "369.19"           
    "BRUNEL INTL."                 "H:BRU(PO)"    "NA"                "2.47"              "7.2"               "NA"               
    "BRUNEL INTL."                 "H:BRU(PTBV)"  "1.4"               "1.43"              "1.35"              "1.35"             
    "ARCADIS"                      "H:HDJ(P)"     "3.1133"            "3.1167"            "27.04"             "27.04"            
    "ARCADIS"                      "H:HDJ(MV)"    "189.45"            "189.66"            "2445.55"           "2445.55"          
    "ARCADIS"                      "H:HDJ(PO)"    "NA"                "3.1133"            "27.02"             "NA"               
    "ARCADIS"                      "H:HDJ(PTBV)"  "1.3"               "1.3"               "2.67"              "2.67"             
    "TEN CATE"                     "H:NTC(P)"     "7.7932"            "7.817"             "26"                "26"               
    "TEN CATE"                     "H:NTC(MV)"    "173.16"            "173.69"            "713.8099999999999" "713.8099999999999"
    "TEN CATE"                     "H:NTC(PO)"    "NA"                "7.7954"            "NA"                "NA"               
    "TEN CATE"                     "H:NTC(PTBV)"  "1.04"              "1.04"              "1.4"               "1.4"                          
    end

    In my dataset, the dates from 01/01/2004 until 01/01/2021 are horizontally formatted, ranging from column C to column FNS in Stata. For which I seek your help is to reshape my data from a wide to a long format, in which for each firm identified by Name (e.g. "ING GROEP") the price, opening price, market value and price-to-book value are denoted at date x.
    Furthermore, as can be seen in my dataex example, currently "Code" specifies the price, opening price, market value and price-to-book of the firms each with their individual datastream company code (e.g. H:INGA(P), H:INGA (MV), H:INGA(PO) and H:INGA(PTBV) respectively). I want to create general variables for the closing price (P), market value (MV), opening price (Price O) and the price-to-book value (PTBV).

    In some cases firms are either not listed yet or not listed anymore at within my investigation period. Missong values of these firms for the four variables are noted as "NA" and can be treated as missing values.

    Thank you all in advance!

    Grateful,

    Robert

  • #2
    Code:
    * Convert "NA" made Stata imported the number as string, replace them with numeric format:
    destring C-FNS, replace force
    
    * Add a leading "t" to denote the set when reshaping
    foreach x of varlist C-FNS{
        rename `x' t`x'
    }
    
    * First convert to long:
    reshape long t, i(Name Code) j(date, string)
    
    * Extract the variable name from Code:
    rename Code Code_old
    gen Code = regexs(1) if regexm(Code_old, "\(([A-Z]+)\)")
    drop Code_old
    
    * Then covert back to wide:
    reshape wide t, i(Name date) j(Code, string)
    
    * Remove the leading "t"
    foreach x of varlist t* {
           local newx = substr("`x'", 2, .)
           rename `x' `newx'
    }
    
    * Results:
    list in 1/12, sep(0)
    Results:
    Code:
         +-------------------------------------------------------------+
         |           Name   date         MV         P        PO   PTBV |
         |-------------------------------------------------------------|
      1. |  ABN AMRO BANK      C          .         .         .      . |
      2. |  ABN AMRO BANK      D          .         .         .      . |
      3. |  ABN AMRO BANK    FNR     7538.8      8.02      7.94     .4 |
      4. |  ABN AMRO BANK    FNS     7538.8      8.02         .     .4 |
      5. |          AEGON      C   17609.29   10.6679         .   1.25 |
      6. |          AEGON      D   17849.49   10.8134   10.7315   1.27 |
      7. |          AEGON    FNR    6810.12     3.235       3.2    .29 |
      8. |          AEGON    FNS    6810.12     3.235         .    .29 |
      9. | AIR FRANCE-KLM      C    2670.34   11.8219         .    .65 |
     10. | AIR FRANCE-KLM      D    2731.88   12.0944   11.8511    .67 |
     11. | AIR FRANCE-KLM    FNR    2194.61    4.9818    5.0421    -.4 |
     12. | AIR FRANCE-KLM    FNS    2194.61    4.9818         .    -.4 |
         +-------------------------------------------------------------+
    The following are all untested. The date is a bit of a guess work. Column C to Column FNS spans 4437 columns, which are all the weekdays in this 17-year span. I don't have any brilliant idea here, other than creating a calendar and then merge that back. For example:

    Code:
    clear
    set obs 6211
    gen d1 = _n + 16070
    format d1 %td
    
    gen dow = dow(d1)
    drop if inlist(dow, 0, 6)
    
    gen seq = _n
    save someRefDateFile, replace
    Back to the full data, each company should all have 4437 rows, with that, create a seq for them, and then use a m:1 merge:

    Code:
    bysort Name (date): gen seq = _n
    merge m:1 seq using someRefDateFile
    Last edited by Ken Chui; 18 May 2021, 07:19.

    Comment


    • #3
      Thank you so much for your help Ken, this worked perfectly!

      Comment

      Working...
      X