Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Convert to Panel Data format

    I have a pooled cross-sectional data that I want to present in a panel data format. Here is a sample of my data
    but first let me explain the variables, Date is the month, all the other columns are codes for firm name (the values are the returns for that firm/stock). on the average I have over 200 firms per country; NB: this is for just one country, Germany, I have 7 more.
    it will take forever if am to manually copy and paste, in order to have the returns for each stock in each month (in the panel format), so I need a way to convert it more efficiently. thanks
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int Date str18 DMTX double DRHM str18 DSFQ double DDAI
    16102 "-"                     .14869987152670872 "-"                     .03498227770072424
    16131 "-"                    .048443450290235376 "-"                    -.05985705777248357
    16162 "-"                     .14620599670278336 "-"                    -.04577130186886285
    16192 "-"                     .07086983609331364 "-"                     .14257261410788388
    16223 "-"                    -.04539349407801651 "-"                   -.042417199302730996
    16253 "-"                     .06760366271152019 "-"                      .0526395631067961
    16284 "-"                     -.0574380514757703 "-"                   -.027669693039342868
    16315 "-"                     .04531859849561087 "-"                    -.06625166740773676
    16345 "-"                     .04633663083052579 "-"                   -.005714285714285706
    16376 "-"                                      0 "-"                    -.04054916985951467
    16406 "-"                     .11000061525420259 "-"                    .028785357737104774
    16437 "-"                     .01029118320892767 "-"                     .04997573993207187
    16468 "-"                    .020384899081338323 "-"                  -.0056993222427603905
    16496 "-"                     .08888981820571502 "-"                  -.0048024786986832265
    16527 "-"                    -.03921192103848223 "-"                   -.028642590286425847
    16557 "-"                    -.05608757473489457 "-"                    -.06378205128205124
    16588 "-"                    .048306986817667534 "-"                     .08661417322834637
    16618 "-"                    .014987130804122777 "-"                     .02079395085066163
    16649 "-.0105203066383951"     .1123272249671649 "-"                      .1828530465949821
    16680 "-.0698096101541251"     .0598461420502466 "-"                    .018561786085150474
    16710 ".1118199539252171"     .12731078454541522 "-"                     .09264687141582766
    16741 "-.0637551801083838"  -.055557457003508226 "-"                    -.08432470258922312
    16771 ".0761831801157644"    -.04050209735723921 "-"                     .05464272067252569
    16802 "-.0139207466582299"    .07839462167100211 "-"                    .010386473429951684
    16833 ".0648913130664955"      .1278620774194678 "-"                     .09562514941429597
    16861 ".0122024706236819"     .07734527839021219 "-"                   -.013746454287584453
    16892 "-.0166691471945231"  -.003527292379751383 "-"                    .016482300884955694
    16922 ".0328439533827759"    -.01185590680150846 "-"                    -.06355424964631629
    16953 "-.1038980070339976"    -.1129926964565374 "-"                     -.0634514816966879
    16983 ".053965658217498"     -.01602643433493388 "-"                   -.049633949621541135
    17014 ".0367726920093094"    -.07692210944672931 "-"                    .017756887322104706
    17045 "-.0219245734809936"    .02999831954909423 "-"                     .04605516356638875
    17075 ".1005278861602019"     .08870996190207414 "-"                   -.049546234976698626
    17106 ".094751477233229"    -.024305189913406845 "-"                      .1587096774193548
    17136 "-.0291465582931166"  -.040680611537519555 "-"                    -.04677060133630293
    17167 ".1409510105304468"     .07285559127827222 "-"                     .08084112149532713
    17198 ".142857142857143"       .1048245932450855 "-"                     .03091223519239083
    17226 "-.0490068218298556"   -.07407420928127109 "-"                     .05724470538897052
    17257 ".1404609947782056"      .1653275962676472 "-"                      .1828530465949821
    17287 "-.0086023494588844"    .03222521968020319 "-"                    -.01853357177694685
    17318 ".0854637059152827"    .018675048447922088 "-"                     .13914195792612222
    17348 ".0261303077187553"    -.06203160270880362 "-"                   .0026901265086520616
    17379 ".0632015412966996"    -.05043617967681544 "-"                   -.036183017910231374
    17410 "-.0849320464841442"    -.0563312335260681 "-.1305678699598901"   .022344267228408057
    17440 "-.1082698351198933"   -.05514847523930885 "-.0894743231759136"     .0749871219368609
    17471 ".0120208554600754"     .03862665992307601 "-.0316000000000001"    .03388554216867462
    17501 "-.1520297667318609"   -.07024676398310369 "-.0557620817843866"   -.07124412368403622
    17532 ".0966471647164717"   -.025424279334145306 ".0352872557596967"    -.04690953161759472
    17563 "-.0720221606648199"   -.09923428790441524 "-.1290140845070422"   -.18370857954970457
    17592 "-.1081813156440022"   -.07350914354956564 "-.0663001293661061"   .032804911573352864
    17623 "-.1265108783239325"  -.008306543399692548 "-.0483200554208521"  -.017744654422855118
    17653 ".0370422934998581"     .07559649652672923 ".0223839854413102"    -.05473760274591264
    17684 "-.0784863829204872"    .06022145471223047 ".1901032395870416"   -.030291447682752045
    17714 "-.2426672607113685"   -.12910843713859438 "-.132814836972779"    -.20683878596767843
    17745 "-.0392195313266007"   -.11778182131507327 "-.1586754053121766"  -.042986706423158084
    17776 ".211438544080693"      .13531044071263193 "-.1213612136121362"    .10658185122679467
    17806 "-.178450907922224"    -.20431554923561948 "-.2186187587494166"   -.18864382918817454
    17837 "-.1771149144254279"   -.28445139789870005 "-.7823230815168708"   -.19144013880855978
    17867 "-.0572854765866414"    -.3102035374633365 ".037037037037037"     -.14234620886981403
    17898 ".211438544080693"       .1653275962676472 "-.123015873015873"      .1551292743953294
    17929 ".1268193384223918"     .08875351360880691 "-.2232277526395173"   -.19314079422382666
    17957 "-.0866227079757925"    .04317905815679395 "-.5300970873786408"    -.1807606263982104
    17988 "-.0943433544303798"   .007347044366834804 "1.042285155021619"     .11305297651556528
    18018 ".211438544080693"       .1653275962676472 ".6377749029754204"      .1828530465949821
    18049 "-.0472001223990208" -.0016751466622182013 "-.0821484992101107"    .03703703703703714
    18079 ".1287033319951828"    .034422497162789764 ".1273666092943202"    -.05714285714285722
    18110 "-.0299473609332765"    .11568879525614008 ".4389312977099238"      .1828530465949821
    18141 ".1292073036591626"   -.044884440663855274 "-.6132625994694961"   -.09396991858700154
    18171 ".130333138515488"       .1653275962676472 ".5706447187928668"     .09503502893694805
    18202 "-.0367114788004137"    -.0890596378608829 ".211353711790393"    -.028094575799721976
    18232 ".1295401681875112"     .15083785627428312 ".2804614275414566"      .0827132226674299
    18263 ".0688526321347485"     .07863865494242679 "-.1762387387387388"   .056304520222046066
    18294 "-.0148693375487824"    .04073187398158358 "-.1196172248803828"   -.09547047047047041
    18322 ".0196068598936918"    .029187997887922205 "-.0939440993788821"   -.06902752801217331
    18353 ".1479368514237938"      .1266818944287805 ".1456726649528705"     .12451708766716212
    18383 "-.0304185767533524"  .0013027226507436067 "1.042285155021619"     .10795454545454528
    18414 ".0593875657284256"    -.16116926410863777 ".0192098586444364"    .048896839594514116
    18444 ".0224400417101147"     .02567402567402569 "-.0430298719772404" -.0019329164297896726
    18475 ".005344103128952"      .02775237053413436 ".234113712374582"     .013670539986329493
    18506 "-.0024346696964778"   -.03610831600366728 "-.0593194820837098"  -.027421892560125845
    18536 "-.0779775463716239"    .08107882392788045 "0"                     .12537554887913097
    18567 ".0519257069748975"     .06829437868552746 "-.0345710627400768"    .04938905431769178
    18597 ".055569535312867"     -.01864461933400016 ".0175729442970823"     .09207436399217217
    end
    format %tdnn/dd/CCYY Date
    Simply put I want the data to appear in this format. I have other variables that I will be adding to it once I have it in the format below, I can figure those out. (am actually running a fama macbeth 2 step regression: I hinted this incase there is an alternate approach to run it without converting to panel data format)

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int Date str5 firm str18 returns
    16102 "D:MTX" "-.5833317982536934"
    16131 "D:MTX" ".6017778260547225"
    16162 "D:MTX" "-.56519469614012"  
    16192 "D:MTX" ".3299840801732912"
    16223 "D:MTX" ".0998704891976217"
    16253 "D:MTX" "-.109365661323271"
    16284 "D:MTX" "-.1848639336545106"
    16315 "D:MTX" ".0995035879288312"
    16345 "D:MTX" "0"                
    16376 "D:MTX" "0"                
    16406 "D:MTX" "0"                
    16437 "D:MTX" "-.2416575400639235"
    16468 "D:MTX" "0"                
    16496 "D:MTX" "0"                
    16527 "D:MTX" "0"                
    16557 "D:MTX" "0"                
    16588 "D:MTX" "0"                
    16618 "D:MTX" "0"                
    16649 "D:MTX" "0"                
    16680 "D:MTX" "-.0666686315541278"
    16710 "D:MTX" ".0714308270439259"
    16741 "D:MTX" "0"                
    16771 "D:MTX" ".6017778260547225"
    16802 "D:MTX" "-.2708337445148388"
    16833 "D:MTX" "0"                
    16861 "D:MTX" "0"                
    16892 "D:MTX" "0"                
    16922 "D:MTX" "0"                
    16953 "D:MTX" ".6017778260547225"
    16983 "D:MTX" "-.2710025752401162"
    17014 "D:MTX" ".3717469582685819"
    17045 "D:MTX" "0"                
    17075 "D:MTX" "-.4682515998541065"
    17106 "D:MTX" "0"                
    17136 "D:MTX" ".6017778260547225"
    17167 "D:MTX" "0"                
    17198 "D:MTX" ".4782577355752367"
    17226 "D:MTX" ".3088248201520701"
    17257 "D:MTX" "-.1091023341993234"
    17287 "D:MTX" "-.3113871116129302"
    17318 "D:MTX" ".1172136258006008"
    17348 "D:MTX" ".0734433999869544"
    17379 "D:MTX" "0"                
    17410 "D:MTX" ".3013155718196887"
    17440 "D:MTX" "0"                
    17471 "D:MTX" ".0046928869945548"
    17501 "D:MTX" "-.1472968297330655"
    17532 "D:MTX" "0"                
    17563 "D:MTX" ".1672632025193798"
    17592 "D:MTX" "0"                
    17623 "D:MTX" "0"                
    17653 "D:MTX" "0"                
    17684 "D:MTX" "0"                
    17714 "D:MTX" "0"                
    17745 "D:MTX" "0"                
    17776 "D:MTX" "0"                
    17806 "D:MTX" "-.1374271357305364"
    17837 "D:MTX" "-.104354259523251"
    17867 "D:MTX" "0"                
    17898 "D:MTX" ".0633470336194276"
    17929 "D:MTX" "0"                
    17957 "D:MTX" "0"                
    17988 "D:MTX" "0"                
    18018 "D:MTX" "0"                
    18049 "D:MTX" "0"                
    18079 "D:MTX" "0"                
    18110 "D:MTX" "0"                
    18141 "D:MTX" ".1000003157871469"
    18171 "D:MTX" ".1024671581461577"
    18202 "D:MTX" "0"                
    18232 "D:MTX" "0"                
    18263 "D:MTX" "-.1084944105783678"
    18294 "D:MTX" "0"                
    18322 "D:MTX" "0"                
    18353 "D:MTX" "0"                
    18383 "D:MTX" "0"                
    18414 "D:MTX" "0"                
    18444 "D:MTX" "0"                
    18475 "D:MTX" "0"                
    18506 "D:MTX" "0"                
    18536 "D:MTX" "0"                
    18567 "D:MTX" ".0306574289352853"
    18597 "D:MTX" "0"                
    16102 "D:RHM" "-.0627988007454825"
    16131 "D:RHM" ".0621649662804772"
    16162 "D:RHM" "-.0406186406186406"
    16192 "D:RHM" ".2005695974536714"
    16223 "D:RHM" "-.0792183231771381"
    16253 "D:RHM" ".1446067244789289"
    16284 "D:RHM" ".0023260450588157"
    16315 "D:RHM" "-.0364673120275825"
    16345 "D:RHM" ".011010184420589"  
    16376 "D:RHM" ".0281786005989655"
    16406 "D:RHM" ".0265457434132132"
    16437 "D:RHM" "-.0651963629328689"
    16468 "D:RHM" ".1025110375275937"
    16496 "D:RHM" ".015454886747591"  
    16527 "D:RHM" "-.0222441308768253"
    16557 "D:RHM" "-.0525586085202924"
    16588 "D:RHM" ".0832779034189172"
    end
    format %tdnn/dd/CCYY Date

  • #2
    The key is that you want to use the reshape long command. Below I demonstate its use on your example data, and add some other changes to the data that you will want to make for your time series analysis. The results do not match the results you show in your post, but then, the value "-.5833317982536934" is found nowhere in your example data, so I'm unsure about that.
    Code:
    // fix the variable that were left as sting because "-" isn't a number
    destring DMTX DSFQ, replace force
    // make a Stata Internal Format monthly date
    generate Month = mofd(Date)
    format Month %tm
    drop Date
    // give variables a common prefix
    rename (DMTX-DDAI) (value=)
    reshape long value, i(Month) j(Firm) string
    replace Firm = substr(Firm,1,1)+":"+substr(Firm,2,3)
    sort Firm Month value
    list in 1/12, clean
    Code:
    . list in 1/12, clean
    
             Month    Firm        value  
      1.    2004m2   D:DAI    .03498228  
      2.    2004m3   D:DAI   -.05985706  
      3.    2004m4   D:DAI    -.0457713  
      4.    2004m5   D:DAI    .14257261  
      5.    2004m6   D:DAI    -.0424172  
      6.    2004m7   D:DAI    .05263956  
      7.    2004m8   D:DAI   -.02766969  
      8.    2004m9   D:DAI   -.06625167  
      9.   2004m10   D:DAI   -.00571429  
     10.   2004m11   D:DAI   -.04054917  
     11.   2004m12   D:DAI    .02878536  
     12.    2005m1   D:DAI    .04997574

    Comment


    • #3
      Dear William Lisowski

      This was insanely helpful, every single line of code ran perfectly; however I noticed that the reshape and/or sort command arranged the Firms in alphabetical order. e.g there are firms that starts with a C and those automatically came first, same for alphanumeric, D4A coming before DAA, and that is not exactly what I want, I would like to retain the order of the firms from my data. can you help with that? I don't know if my question is clear enough or maybe I should share my entire dataset.

      Apart from that, every other thing was perfect, thanks once again

      Comment


      • #4
        In this case, I rely on the rename command to add a sequence number to the variable names. For further information on this, see
        Code:
        help rename
        help rename group
        Code:
        // fix the variable that were left as sting because "-" isn't a number
        destring DMTX DSFQ, replace force
        // make a Stata Internal Format monthly date
        generate Month = mofd(Date)
        format Month %tm
        drop Date
        // give variables a common prefix and a sequential suffix
        rename (DMTX-DDAI) (value=_#), addnumber
        reshape long value, i(Month) j(Firm) string
        // use the suffix to generate the sequence
        generate sequence = real(substr(Firm,strpos(Firm,"_")+1,.))
        // extract the firm code and add the colon after the first character
        replace Firm = substr(Firm,1,1)+":"+substr(Firm,2,3)
        order Month sequence Firm value
        sort sequence Month value
        list in 12/23, clean
        Code:
        . list in 12/23, clean
        
                 Month   sequence    Firm        value  
         12.    2005m1          1   D:MTX            .  
         13.    2005m2          1   D:MTX            .  
         14.    2005m3          1   D:MTX            .  
         15.    2005m4          1   D:MTX            .  
         16.    2005m5          1   D:MTX            .  
         17.    2005m6          1   D:MTX            .  
         18.    2005m7          1   D:MTX            .  
         19.    2005m8          1   D:MTX   -.01052031  
         20.    2005m9          1   D:MTX   -.06980961  
         21.   2005m10          1   D:MTX    .11181995  
         22.   2005m11          1   D:MTX   -.06375518  
         23.   2005m12          1   D:MTX    .07618318

        Comment


        • #5
          Hello Sir, this worked perfectly, but it appended the reference number to Firm names. There are other variables that I would be adding conditionally based on the name of the Firm (ie. Sin and Non-Sin Stock, just so you have an idea). When using the substr,I used dot "." to get the full name (some firm names are longer than the others so using 3 was cutting off sum alphabets from those longer names), and that was how I saw that the sequence number was appended.
          so here are the lines of code I used as I included more firms to my dataset as will be shared below
          Code:
           destring DMTX-DVOW, replace force
          
          generate Month = mofd(Date)
          format Month %tm
           
          rename (DMTX-DVOW) (value=_#), addnumber
          reshape long value, i(Month) j(Firm) string
          
          generate sequence = real(substr(Firm,strpos(Firm,"_")+1,.))
          
          replace Firm = substr(Firm,1,1)+":"+substr(Firm,2,.)
          order Month sequence Firm value
          sort sequence Month value
          this is a sample of how I tried to create a new variable dependent on the name of the firm
          Code:
           gen DSIN = 1 if Firm == "D:DEX" | Firm == "D:ZIL2"    | Firm == "D:GMM" | Firm == "D:MTX"
          replace DSIN=2 if DSIN !=1
          the intent is to indicate SIN and Non-Sin stock

          I think this is necessary at this point that I show a bigger portion of my dataset
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str18 DMTX double DVOW
          "-"                    -.08455828026598182
          "-"                   -.052952389736967594
          "-"                    -.06372099440787604
          "-"                     .06564780942804142
          "-"                    -.05488444737291654
          "-"                    -.03115747435102925
          "-"                   -.004388614477097521
          "-"                    -.07195502810743294
          "-"                   -.006326558083187439
          "-"                     .12579827379339706
          "-"                    -.03394958272095162
          "-"                  -.0002935969181219826
          "-"                     .09402310307388342
          "-"                    .010176440442199276
          "-"                    -.03392601182135888
          "-"                    -.07807016812676615
          "-"                     .12288093451353045
          "-"                    .049253587129812905
          "-.0105203066383951"    .14810722041884417
          "-.0698096101541251"  -.027939122123373253
          ".1118199539252171"     .20024731846703334
          "-.0637551801083838"   -.11829102110458387
          ".0761831801157644"   .0072859981557723205
          "-.0139207466582299"  -.011829856365558765
          ".0648913130664955"      .1075149561915697
          ".0122024706236819"     .15292087110702704
          "-.0166691471945231"    .11041737984171353
          ".0328439533827759"   -.038931220436935254
          "-.1038980070339976"   -.08781496713282365
          ".053965658217498"    .0038156920993654415
          ".0367726920093094"     .05216568369096916
          "-.0219245734809936"    .08779375152186772
          ".1005278861602019"     .06029693924166276
          ".094751477233229"      .15776026609334826
          "-.0291465582931166"   .048731927878979515
          ".1409510105304468"    .056665838685639285
          ".142857142857143"    -.003139912485433798
          "-.0490068218298556"    .09090970341324067
          ".1404609947782056"      .1974739832628231
          "-.0086023494588844"   .005338105823435011
          ".0854637059152827"    .018594426545730586
          ".0261303077187553"     .03069023437598379
          ".0632015412966996"     .14743063496251987
          "-.0849320464841442"    .14325046579717857
          "-.1082698351198933"    .05341973071285296
          ".0120208554600754"     .18916510484812052
          "-.1520297667318609"   -.15907387803166373
          ".0966471647164717"   -.031392552777679585
          "-.0720221606648199"   -.03170720576826538
          "-.1081813156440022"   .010628611286537049
          "-.1265108783239325"     .1857215072757897
          ".0370422934998581"    .055723448337704144
          "-.0784863829204872"    -.0875799832508292
          "-.2426672607113685"    .03580731824825863
          "-.0392195313266007"    .09021490626428899
          ".211438544080693"     .050478200064502377
          "-.178450907922224"     .20024731846703334
          "-.1771149144254279"    .20024731846703334
          "-.0572854765866414"   -.31882228621282793
          ".211438544080693"     -.07980360453719718
          ".1268193384223918"     .01659277651706597
          "-.0866227079757925"    -.2661940437951008
          "-.0943433544303798"    .20024731846703334
          ".211438544080693"      .03458224072096673
          "-.0472001223990208"    -.0511054136188269
          ".1287033319951828"        .08375447345769
          "-.0299473609332765"    .03817434940430805
          ".1292073036591626"     -.5137358361520368
          ".130333138515488"     -.12150425901674082
          "-.0367114788004137"   .004849361885555082
          ".1295401681875112"    -.25671293459165756
          ".0688526321347485"     -.0620965664130266
          "-.0148693375487824"   -.13410788260157713
          ".0196068598936918"    -.02955521245320545
          ".1479368514237938"     .12024775736864604
          "-.0304185767533524"   .005653586506099003
          ".0593875657284256"   -.015607951775530317
          ".0224400417101147"  -.0066267333270444905
          ".005344103128952"     .062099882091335615
          "-.0024346696964778"  .0015013057261961831
          "-.0779775463716239"    .07243077564016716
          ".0519257069748975"     .19270016787602154
          ".055569535312867"      .17308536780422545
          end

          You've been so helpful to me, I appreciate sir

          Comment


          • #6
            Change this
            Code:
            replace Firm = substr(Firm,1,1)+":"+substr(Firm,2,3)
            to this
            Code:
            replace Firm = substr(Firm,1,1)+":"+substr(Firm,2,strpos(Firm,"_")-2)
            in the code example from post #4 in order to handle variable names longer than 4 characters.
            Last edited by William Lisowski; 21 Oct 2020, 07:28.

            Comment


            • #7
              Wow this is incredible! You have been a blessing to me. Thank you so very much.
              I only started using Stata a few days ago, a couple weeks maybe. I found the tool interesting.
              How can I improve on my skills, so I can bless others going forward at the very least?

              Comment


              • #8
                Allow me to give you the advice that - according to a recent search of Statalist - I have now given over 100 times to members who identify themselves as new users of Stata.

                I'm sympathetic to you as a new user of Stata - there is quite a lot to absorb.

                When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

                All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata - I'm still far from that goal - as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

                Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.

                Comment


                • #9
                  Thank you for the help, as well as the advice, I hope you don't if I reach out to you some other time via this thread.?

                  Comment


                  • #10
                    mind* if

                    Comment


                    • #11
                      Unless your question is a continuation of the particular reformatting problem we've been dealing with here, it is usually better to post new questions to a new thread, using a new, more suitable title. That way you'll draw a broader range of members, perhaps people with experience dealing with financial data like yours, who may not be drawn to an older thread on "converting to panel data". And while I'll certainly keep an eye open for your posts, you'll get better advice by casting your net as broadly as you can.

                      Comment


                      • #12
                        Noted: Thank you

                        Comment

                        Working...
                        X