  • cross section regression

    Dear Statalist members,

    I would like to estimate an equation using annual cross-sectional regressions for the period 2010-2015, to obtain coefficients β0, β1, β2, β3 for each year.
    To do so, I run the following commands in Stata 14.2:

    forvalue y = 2010(1)2015{
    display `y'
    reg RN D R RTAILLE RMTB REND DR DRTAILLE DRMTB DREND TAILLE MTB END DTAILLE DMTB DEND if `y'== year,
    gen beta_0 = _b(DR) if `y'== year,
    gen beta_1= _b(DRTAILLE) if `y'== year,
    gen beta_2= _b(DRMTB) if `y'== year,
    gen beta_3= _b(DREND) if `y'== year,

    But it seems that I have an error somewhere. Could you please help me correct these commands so that they estimate the cross-sectional regressions and obtain the coefficients for each year?


    Best regards

  • #2
    Never say "I have an error somewhere". Always show the exact error message.

    Second time around the loop, beta_0 already exists and you can't generate it because it is not a new variable.

    Your loop can be improved like this:

    Code:
    quietly foreach v in  DR DRTAILLE DRMTB DREND  { 
           gen beta_`v' = . 
    } 
    
    forval y = 2010/2015 { 
            regress RN D R RTAILLE RMTB REND DR DRTAILLE DRMTB DREND TAILLE MTB END DTAILLE DMTB DEND if year == `y'
            quietly foreach v in  DR DRTAILLE DRMTB DREND  { 
                  replace beta_`v' = _b[`v'] if year == `y' 
            }
    }
    but

    0. Without a data example, no one but you can test this.

    1. That's not bullet-proof. A regression could fail if there aren't enough observations with non-missing values on all predictors.

    2. You need never do this. See the help for statsby, or search the forum for mentions of regressby, rangestat, asreg, etc. from SSC.
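
    For what it's worth, here is a minimal sketch of the statsby route, assuming the variable names from the original post. It is untested without a data example, and note that the clear option replaces the data in memory with one row of results per year, so save your dataset first or use the saving() option instead.

    Code:
    * sketch only: one cross-sectional regression per year, keeping four coefficients
    * (clear replaces the data in memory with the results dataset)
    statsby beta_DR=_b[DR] beta_DRTAILLE=_b[DRTAILLE] beta_DRMTB=_b[DRMTB] beta_DREND=_b[DREND], ///
        by(year) clear: ///
        regress RN D R RTAILLE RMTB REND DR DRTAILLE DRMTB DREND TAILLE MTB END DTAILLE DMTB DEND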




    • #3
      Thank you, dear Professor; your help has been very beneficial for me. Sorry for adding a post to an old, related thread. Actually, after cleaning my data I ran this cross-sectional regression; however, it seems that in one year Stata has omitted the variables DR, DRMTB and DREND, whose coefficients I need in order to calculate another variable. Here is the output I get:

      Code:
      note: DR omitted because of collinearity
      note: DRMTB omitted because of collinearity
      note: DREND omitted because of collinearity

            Source |       SS           df       MS      Number of obs   =       235
      -------------+----------------------------------   F(12, 222)      =      2.98
             Model |  .189757838        12  .015813153   Prob > F        =    0.0007
          Residual |   1.1779301       222  .005305991   R-squared       =    0.1387
      -------------+----------------------------------   Adj R-squared   =    0.0922
             Total |  1.36768794       234   .00584482   Root MSE        =    .07284

      ------------------------------------------------------------------------------
              RN_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |  -.1796934   .4747194    -0.38   0.705    -1.115226    .7558397
               R_w |   .0936229   .3297436     0.28   0.777    -.5562053    .7434511
           RTAILLE |  -.0051216   .0241262    -0.21   0.832    -.0526672     .042424
              RMTB |  -.0164328   .0117224    -1.40   0.162    -.0395343    .0066686
              REND |   .2506437   .1888433     1.33   0.186    -.1215111    .6227985
                DR |          0  (omitted)
          DRTAILLE |   .0048271   .0167258     0.29   0.773    -.0281346    .0377889
             DRMTB |          0  (omitted)
             DREND |          0  (omitted)
          TAILLE_w |  -.0071607   .0300116    -0.24   0.812    -.0663047    .0519833
             MTB_w |   .0018271   .0197976     0.09   0.927    -.0371882    .0408424
             END_w |     .17317   .5799909     0.30   0.766    -.9698223    1.316162
           DTAILLE |   .0137247   .0319462     0.43   0.668    -.0492319    .0766812
              DMTB |  -.0074506   .0207255    -0.36   0.720    -.0482946    .0333934
              DEND |  -.1092015   .5912092    -0.18   0.854    -1.274302    1.055899
             _cons |   .1558126   .4424876     0.35   0.725     -.716201    1.027826
      ------------------------------------------------------------------------------


      Could you please help me to find a solution to this issue?



      • #4
        Joelle:
        please use CODE delimiters (# toggle of the Advanced editor) to share what you typed (as you did not) and what Stata gave you back (as you did) (see also the FAQ on this and other posting-related topics). Thanks.
        As extreme multicollinearity is an issue involving more than one variable, there is no easy fix other than revising your regression model specification (that is, the right-hand side of your regression equation).
        As far as I can see from the output table, there is no statistically significant coefficient. This is not an evil per se, but it does look really strange. Hence, I recommend that you check your data and regression specification.
        Kind regards,
        Carlo
        (Stata 18.0 SE)



        • #5
          Thank you for your reply, and excuse me for the poor presentation of my post; I will be more diligent in posting next time. Actually, this extreme multicollinearity is due to the presence of the variable D, which is binary: it equals 1 if R < 0 and 0 otherwise. So it is quite possible that D*R and R, D*R*MTB and R*MTB, and D*R*END and R*END are highly correlated. Could I, in this case, use the coefficients of the variables that are highly correlated with the ones I need (D*R, D*R*MTB, D*R*END)? Note that I cannot modify the specification of the model, because it is a valid measure used in the research literature.
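
          A minimal sketch of how I could check this correlation hypothesis for the problem year, using 2012 purely as a placeholder for the year in which the variables were omitted and the variable names from the output in #3:

          Code:
          * sketch: pairwise correlations among the suspect pairs in the problem year
          * (2012 is a placeholder for the year with the omitted variables)
          correlate R_w DR RMTB DRMTB REND DREND if year == 2012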



          • #6
            Joelle:
            your intuition makes sense but cannot fix the issue.
            My advice is to reduce the number of interactions, since most of them lack statistical significance (as a general rule, too many interactions make the results of your regression harder to interpret), and go for a more parsimonious model (the literature in your research field can support you in this respect).
            Kind regards,
            Carlo
            (Stata 18.0 SE)

