  • Storing estimated coefficients in a loop

    Hello everyone,


    I am working on a dataset containing daily stock returns for more than 3,000 firms and the four Carhart factors between 1995 and 2021. Here is an example of my dataset:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float date double(MktRF SMB HML WML) long firm float(return firm_number)
    22077 -5.91  -.83 -3.06  3.51 1  -.000952381 1
    22078  1.29   .36  1.86 -1.24 1   .020019066 1
    22081  1.09  1.42  -.41    .8 1    .05560748 1
    22082  1.86   .63   .48 -1.05 1   .013722886 1
    22083   -.4  -.91 -1.98  1.94 1  -.036681224 1
    22084   .19    .1  -.56   .37 1  -.033998188 1
    22085  -.45   .09  -.64   .97 1     .0950258 1
    22088   .71   .83 -1.44  1.12 1   -.07285194 1
    22089   .42   .14  -.56   .03 1   -.02796395 1
    22090 -2.61   -.5  -1.3  1.72 1   .010936757 1
    22091  1.12   .26   .49   .36 1     -.065381 1
    22092 -2.44   .18 -1.42   .66 1    .01912431 1
    22095  1.51  1.22  1.77  -.92 1  -.011358025 1
    22096  1.58   .07     0   .67 1    .04145854 1
    22097   .41 -1.24 -2.51  1.27 1    .01918465 1
    22098    .5  -.01  -.09  -.28 1   .031058824 1
    22102  1.65  -.61   .37  -.16 1  .0027384756 1
    22103 -1.03  -.51 -1.53  1.43 1    .03186163 1
    22104   .91   .31  -.46  1.06 1  -.033965595 1
    22105  -.53   -.9 -2.59  1.91 1   .011872146 1
    22106  1.11  -.13  3.03 -1.18 1    .03790614 1
    22109  -1.2  -.78  1.99 -1.47 1    .04478261 1
    22110  1.35   .36  -.34   .15 1    .06325427 1
    22111  1.14   2.4  1.28  -1.7 1    .04931507 1
    22112  -.37  -.37   .79  -.25 1   .004848937 1
    22113    .3   .18 -1.41  1.12 1  -.034149963 1
    22116  1.01  -.43 -2.38  2.21 1  -.001537279 1
    22117   .16   .83  3.23 -1.72 1   .027713627 1
    22118   .49  -.58  -.34   .41 1   -.05243446 1
    22119  -1.2   .72  1.92 -1.12 1  -.005533597 1
    22120  -.75  -.79   .46  -.26 1   .007551669 1
    22123   .88   .78 -1.94  1.76 1            . 1
    21804   .22   -.3  -.17   .44 2  -.009099526 2
    21805  -.01     0   .75  -.97 2     .0560551 2
    21808  -.23   .93    .5  -.82 2    .02862319 2
    21809   .17   -.6 -1.22  1.57 2    .09193378 2
    21810  -.05  -.74   .06   .71 2   -.01419355 2
    21811  -.05  -.43  -.45   .42 2   -.01014398 2
    21812  -.48    .3   .16   .13 2  -.002809917 2
    21815  -.02  -.18   .51   .45 2   -.09995028 2
    21816 -1.01  -.75   .02  1.08 2   -.02836096 2
    21817   .69   .36   .53   -.5 2    -.0525019 2
    21818  -.41  -.97   .12   .56 2    .02080416 2
    21819  -.62  -.36   .88  -.65 2   -.01234568 2
    21822    .5  -.15  -.48  -.04 2   -.06686508 2
    21823 -1.31  -.47  -.61   .48 2  -.020412503 2
    21824 -1.73   .87  -.35   .53 2     .0444975 2
    21825    .8   -.2  -.91   .39 2      .069202 2
    21826  1.39  -.49  -.09   .55 2   -.02118562 2
    21829  -.41   .16  -.02  -.01 2  -.016084194 2
    21830 -1.61  -.14  -.12    .4 2   .009283552 2
    21831   .92  -.53  -.12   .34 2   -.03139372 2
    21832   .59   -.4   .27  -.43 2   .017753921 2
    21833  1.23   .49   .19 -1.01 2   .019472616 2
    21836  -.18  -.33  -.06   .05 2   -.03183446 2
    21837  1.03   .26  -.23  -.18 2    .05076038 2
    21838  -.24   .39   .23  -.26 2   .014864072 2
    21839   .37   .81  -.28  -.07 2   .011755637 2
    21840  -.49  -.38    .9  -.03 2   .016380953 2
    21843   .71   .18   .33  -.61 2     .0238006 2
    21844  -.34   .21   .88   -.8 2    .07321984 2
    21845   .25  -.09   .25  -.53 2   -.03087157 2
    21846   .26  -.26  -.91   .77 2   -.07602957 2
    21847    .5   .28   .08  -.25 2    .06761905 2
    21850   .62   .27  -.25  -.11 2  -.023015166 2
    21851  -.13   .28   .46   .03 2    .05368882 2
    21852   .27  -.39 -1.17   .79 2   .005199307 2
    21853  -.38  -.27  -.31   .36 2   .004310345 2
    21854  1.08   .45   .79  -.61 2    -.0072103 2
    21857    .4   .05  1.39 -1.45 2   -.04323016 2
    21858  -.03   .24   .49  -.85 2   -.02295319 2
    21859  -.05   -.7   .18   .18 2    .02256752 2
    21860   .38  -.12   .53  -.65 2  .0079594795 2
    21861   .31    .1  -.38     0 2     .0821967 2
    21864  -.19  -.01  -.21   .57 2  -.010779436 2
    21865   .16  -.11  -.14   .28 2     .0217938 2
    21866   .01  -.22  -.82  1.03 2    .04183757 2
    21867   .07  -.12  -.26   .42 2    .06850394 2
    21868   .74  -.21  -.35   -.2 2   -.09594695 2
    21871   .02  -.36  -.51   .71 2    .05004891 2
    21872   .02   .59  -.96   .12 2  -.013351964 2
    21873  -.33   .02  -.45   .11 2  -.027380016 2
    21874  -.14  -.32   .12  -.46 2   .033166155 2
    21875   .24   .05   .24  -.31 2   .033510804 2
    21878   .92  1.31   -.4  -.12 2   -.02969697 2
    21879   .19   .05  -.89   .58 2  -.023110555 2
    21880   .44   .23  -.05  -.12 2   .037244245 2
    21882  -.42  -.04  -.33   .04 2 -.0021574972 2
    21885  -.87  -.25   .52  -.22 2   .003243243 2
    21886  -.66    .6  -.83   .39 2  -.021705665 2
    21887    .6   .11   .25  -.09 2  -.014791503 2
    21888   .13  -.24   .48   .18 2   .012777512 2
    21889   .91   .24    .4  -.17 2 -.0033117805 2
    21892  -.33   .36   .12  -.16 2    .04351266 2
    21893  -.08   .29  -.11   .09 2  -.011978772 2
    21894   .28  -.08    .1   .15 2   -.05555556 2
    21895    .9  -.11  1.16  -.68 2  -.009262268 2
    21896  -.03   -.3  -.54   .36 2    .01672954 2
    21899   .74   .07  -.04   .05 2   .000483949 2
    21900    .1   .42   .68  -.15 2    .12399226 2
    end
    format %td date
    label values firm firm
    label def firm 1 "1-800-Flowers.Com Inc", modify
    label def firm 2 "10X Genomics Inc", modify
    Please note that this example shows only some of the observations, for only 2 firms.

    I want to store the estimated coefficients each time I regress a firm's returns on the four Carhart factors, and I tried the following code.
    Code:
    gen b_MktRF = .
    gen b_SMB = .
    gen b_HML = .
    gen b_WML = .
    
    * Loop through each firm to regress the firm's stock returns against the 4 factors
    foreach y of var firm {
        quietly regress return MktRF SMB HML WML if firm == `y'
        replace b_MktRF =_b[MktRF]   if firm== `y'
        replace b_SMB=_b[SMB] if firm == `y'
        replace b_HML=_b[HML] if firm == `y'
        replace b_WML=_b[WML] if firm == `y'
    }
    I expected to obtain a separate set of estimated coefficients for each firm as the loop runs through the firms' return series. That is, the values of b_MktRF through b_WML should differ from firm to firm. However, when I run my code, the variables b_MktRF through b_WML are filled with the same values for every firm.

    Could someone identify the problem in my code and help me find a solution, please? Thank you in advance for your time and generous help!


  • #2
    Your problem is that you believe that
    Code:
    foreach y of var firm {
    will loop over the values that the variable firm takes. This is not the case; see the output of help foreach to better understand how foreach works. In your case, the loop ran exactly once, substituting "firm" for the local macro `y', so you were running commands like
    Code:
    quietly regress return MktRF SMB HML WML if firm == firm
    Perhaps the following will accomplish what you want.
    Code:
    gen b_MktRF = .
    gen b_SMB = .
    gen b_HML = .
    gen b_WML = .
    
    * Loop through each firm to regress the firm's stock returns against the 4 factors
    levelsof firm, local(firmlist)
    foreach y of local firmlist {
        quietly regress return MktRF SMB HML WML if firm == `y'
        replace b_MktRF =_b[MktRF]   if firm== `y'
        replace b_SMB=_b[SMB] if firm == `y'
        replace b_HML=_b[HML] if firm == `y'
        replace b_WML=_b[WML] if firm == `y'
    }
    Crossed with #3, Andrew and I think alike and apparently express ourselves similarly!
    Last edited by William Lisowski; 21 Jun 2021, 07:01.


    • #3
      foreach y of var firm {
      You cannot refer to levels of a variable in this way. Stata reads this command as

      Code:
      foreach y of varlist firm {
      and therefore the only element in the list is the variable name "firm". You get the same values because the condition evaluated is

      Code:
      if firm== firm
      which is true for every observation in the dataset. What you need is the levelsof command.

      Code:
      levelsof firm, local(firms)
      foreach y of local firms {
          ...
      Crossed with #2
      Last edited by Andrew Musau; 21 Jun 2021, 06:50.
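
      A minimal sketch of the difference, using a made-up five-observation dataset (the data and the counters n_var and n_val are illustrative and not from the original post). The varlist loop runs once, with `y' holding the name "firm"; the levelsof loop runs once per distinct value of firm.

      ```stata
      clear
      input firm
      1
      1
      2
      2
      3
      end

      * foreach ... of varlist iterates over variable NAMES, not values
      local n_var = 0
      foreach y of varlist firm {
          display "varlist element: `y'"
          local ++n_var
      }

      * levelsof collects the distinct VALUES of firm into a local macro
      levelsof firm, local(firms)
      local n_val = 0
      foreach y of local firms {
          display "firm value: `y'"
          local ++n_val
      }

      display "iterations over varlist: `n_var'; iterations over levels: `n_val'"
      ```

      With three distinct firms in the toy data, the first loop iterates once and the second three times.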


      • #4
        Dear both,

        Thank you so much for your help. The code worked perfectly! I wonder whether it is possible to store the estimated coefficients only once per firm, since the current code, which copies the coefficients from each regression to every observation of that firm, takes a long time to run.

        Thank you again and I look forward to your advice.


        • #5
          Little time would be saved by reducing the number of copies of the estimated coefficients that are saved. The overwhelming bulk of the time in the loop is taken by running the regression command, not by saving the coefficients. Remember, in your incorrect code only a single regression was run across all firms; now each firm has its own regression, which is why it takes a long time to run.
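
          If the goal is simply one row of coefficients per firm rather than a copy on every observation, Stata's built-in statsby prefix is one option. This is an alternative that was not used in the thread. The sketch below runs on simulated data for three hypothetical firms; the seed, the sample size and the true slope of 0.5 are assumptions for illustration. It should not be expected to run faster, since each firm's regression still has to be estimated.

          ```stata
          clear
          set seed 12345
          set obs 300
          * three hypothetical firms with 100 observations each
          gen firm = ceil(_n/100)
          foreach f in MktRF SMB HML WML {
              gen `f' = rnormal()
          }
          * simulated returns that load only on the market factor
          gen return = 0.5*MktRF + 0.01*rnormal()

          * one regression per firm; with the clear option, the data in memory
          * are replaced by a dataset holding one row of coefficients per firm
          statsby _b, by(firm) clear: regress return MktRF SMB HML WML
          list firm _b_MktRF _b_SMB _b_HML _b_WML
          ```

          Because the clear option replaces the data in memory, the original dataset should be saved first, or the saving() option used instead, if the results need to be merged back on firm.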


          • #6
            I agree with William. You could try runby from SSC by Robert Picard and Clyde Schechter. Its code is optimized for this kind of task.

            Code:
            ssc install runby, replace
            Code:
            cap program drop mycoef
            program mycoef
            regress return MktRF SMB HML WML
            gen b_MktRF =_b[MktRF]
            gen b_SMB=_b[SMB]
            gen b_HML=_b[HML]
            gen b_WML=_b[WML]
            end
            
            runby mycoef, by(firm)
            Last edited by Andrew Musau; 22 Jun 2021, 05:26.


            • #7
              Thank you for all your comments. This is the first time I have worked with such a massive dataset, which is why I was surprised by how long the code takes. I'll try the code provided by Andrew on my sub-samples.


              • #8
                Dear William and Andrew, I am coming back to you because I encountered a problem while applying the following code to my sub-samples:
                Code:
                gen b1_MktRF = .
                gen b1_SMB = .
                gen b1_HML = .
                gen b1_WML = .
                gen residual1 = .
                
                * Loop through each firm to regress the firm's stock returns against the 4 factors
                levelsof firm, local(firmlist)
                foreach y of local firmlist {
                    quietly regress return MktRF SMB HML WML if firm == `y' & year(date)<=2004
                    replace b1_MktRF =_b[MktRF]   if firm== `y' & year(date)<=2004
                    replace b1_SMB=_b[SMB] if firm == `y' & year(date)<=2004
                    replace b1_HML=_b[HML] if firm == `y' & year(date)<=2004
                    replace b1_WML=_b[WML] if firm == `y' & year(date)<=2004
                    predict temp, residuals
                    replace residual1 = temp  if firm == `y' & year(date)<=2004
                    drop temp
                }
                I want to run the regressions for each firm using only observations through 2004, and save the corresponding estimated coefficients and residuals as I did for the full sample. But when I run my code, the program stops at the second firm with the message "no observations". Do you have any idea why this happens and how I can fix it?

                Thank you in advance.


                • #9
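                  Some of your regressions are failing because some firms have no (or too few) observations in the sub-sample. You can skip these firms with the capture command, which suppresses the error so that the loop continues; once a command inside the capture block fails, the rest of the block is skipped for that firm. See -help capture-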

                  Code:
                  gen b1_MktRF = .
                  gen b1_SMB = .
                  gen b1_HML = .
                  gen b1_WML = .
                  gen residual1 = .
                  
                  * Loop through each firm to regress the firm's stock returns against the 4 factors
                  levelsof firm, local(firmlist)
                  foreach y of local firmlist {
                      capture {
                          quietly regress return MktRF SMB HML WML if firm == `y' & year(date)<=2004
                          replace b1_MktRF =_b[MktRF]   if firm== `y' & year(date)<=2004
                          replace b1_SMB=_b[SMB] if firm == `y' & year(date)<=2004
                          replace b1_HML=_b[HML] if firm == `y' & year(date)<=2004
                          replace b1_WML=_b[WML] if firm == `y' & year(date)<=2004
                          predict temp, residuals
                          replace residual1 = temp  if firm == `y' & year(date)<=2004
                          drop temp
                      }
                  }
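
                  A variation on the loop above, sketched on simulated data: putting capture on the regress command alone and checking the return code _rc makes it possible to record which firms were skipped instead of passing over them silently. The two hypothetical firms, their dates, the seed and the macro name skipped are assumptions for illustration.

                  ```stata
                  clear
                  set seed 2021
                  set obs 200
                  * firm 1 has dates in 2003, firm 2 only in 2010
                  gen firm = cond(_n <= 100, 1, 2)
                  gen date = cond(firm == 1, mdy(1,1,2003), mdy(1,1,2010)) + mod(_n - 1, 100)
                  format %td date
                  foreach f in MktRF SMB HML WML {
                      gen `f' = rnormal()
                  }
                  gen return = 0.8*MktRF + 0.01*rnormal()

                  gen b1_MktRF = .
                  local skipped
                  levelsof firm, local(firmlist)
                  foreach y of local firmlist {
                      capture regress return MktRF SMB HML WML if firm == `y' & year(date) <= 2004
                      if _rc != 0 {
                          * regress failed (for example, no observations): record the firm and move on
                          local skipped `skipped' `y'
                          continue
                      }
                      replace b1_MktRF = _b[MktRF] if firm == `y' & year(date) <= 2004
                  }
                  display "firms skipped: `skipped'"
                  ```

                  In this toy data, firm 2 has no observations in or before 2004, so it is the firm that gets skipped, and its coefficient variable stays missing.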


                  • #10
                    Thank you so much !! I'll look into that!
