  • cross section regression

    Dear Statalist members,

    I would like to estimate an equation using annual cross-sectional regressions for the period 2010-2015, to obtain coefficients β0, β1, β2, β3 for each year.
    To do so, I run the following commands in Stata 14.2:

    forvalue y = 2010(1)2015{
    display `y'
    reg RN D R RTAILLE RMTB REND DR DRTAILLE DRMTB DREND TAILLE MTB END DTAILLE DMTB DEND if `y'== year,
    gen beta_0 = _b(DR) if `y'== year,
    gen beta_1= _b(DRTAILLE) if `y'== year,
    gen beta_2= _b(DRMTB) if `y'== year,
    gen beta_3= _b(DREND) if `y'== year,

    But it seems that I have an error somewhere. Could you please help me correct these commands so that they estimate the cross-sectional regressions and obtain the coefficients for each year?


    Best regards

  • #2
    Never say "I have an error somewhere". Always show the exact error message.

    Second time around the loop, beta_0 already exists and you can't generate it because it is not a new variable.

    Your loop can be improved like this:

    Code:
    quietly foreach v in  DR DRTAILLE DRMTB DREND  { 
           gen beta_`v' = . 
    } 
    
    forval y = 2010/2015 { 
            regress RN D R RTAILLE RMTB REND DR DRTAILLE DRMTB DREND TAILLE MTB END DTAILLE DMTB DEND if year == `y'
            quietly foreach v in  DR DRTAILLE DRMTB DREND  { 
                  replace beta_`v' = _b[`v'] if year == `y' 
            }
    }
    but

    0. Without a data example, no one but you can test this.

    1. That's not bullet-proof. A regression could fail if there aren't enough observations with non-missing values on all predictors.

    2. You need never do this. See the help for statsby, or search the forum for mentions of regressby, rangestat, asreg, etc. from SSC.
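
    For what it's worth, here is a minimal sketch of the statsby route, assuming the variable names from the original post. It is untested without a data example, and note that the clear option replaces the data in memory with one row of results per year, so save your dataset first or use the saving() option instead.

    Code:
    * sketch only: one cross-sectional regression per year, keeping four coefficients
    * (clear replaces the data in memory with the results dataset)
    statsby beta_DR=_b[DR] beta_DRTAILLE=_b[DRTAILLE] beta_DRMTB=_b[DRMTB] beta_DREND=_b[DREND], ///
        by(year) clear: ///
        regress RN D R RTAILLE RMTB REND DR DRTAILLE DRMTB DREND TAILLE MTB END DTAILLE DMTB DEND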




    • #3
      Thank you, dear Professor; your help has been very beneficial for me. Sorry for adding a post to an old, related thread. Actually, after cleaning my data I ran this cross-sectional regression; however, it seems that in one year Stata has omitted the variables DR, DRMTB and DREND, whose coefficients I need in order to calculate another variable. Here is the output I get:

      Code:
      note: DR omitted because of collinearity
      note: DRMTB omitted because of collinearity
      note: DREND omitted because of collinearity

            Source |       SS           df       MS      Number of obs   =       235
      -------------+----------------------------------   F(12, 222)      =      2.98
             Model |  .189757838        12  .015813153   Prob > F        =    0.0007
          Residual |   1.1779301       222  .005305991   R-squared       =    0.1387
      -------------+----------------------------------   Adj R-squared   =    0.0922
             Total |  1.36768794       234   .00584482   Root MSE        =    .07284

      ------------------------------------------------------------------------------
              RN_w |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |  -.1796934   .4747194    -0.38   0.705    -1.115226    .7558397
               R_w |   .0936229   .3297436     0.28   0.777    -.5562053    .7434511
           RTAILLE |  -.0051216   .0241262    -0.21   0.832    -.0526672     .042424
              RMTB |  -.0164328   .0117224    -1.40   0.162    -.0395343    .0066686
              REND |   .2506437   .1888433     1.33   0.186    -.1215111    .6227985
                DR |          0  (omitted)
          DRTAILLE |   .0048271   .0167258     0.29   0.773    -.0281346    .0377889
             DRMTB |          0  (omitted)
             DREND |          0  (omitted)
          TAILLE_w |  -.0071607   .0300116    -0.24   0.812    -.0663047    .0519833
             MTB_w |   .0018271   .0197976     0.09   0.927    -.0371882    .0408424
             END_w |     .17317   .5799909     0.30   0.766    -.9698223    1.316162
           DTAILLE |   .0137247   .0319462     0.43   0.668    -.0492319    .0766812
              DMTB |  -.0074506   .0207255    -0.36   0.720    -.0482946    .0333934
              DEND |  -.1092015   .5912092    -0.18   0.854    -1.274302    1.055899
             _cons |   .1558126   .4424876     0.35   0.725     -.716201    1.027826
      ------------------------------------------------------------------------------


      Could you please help me to find a solution to this issue?



      • #4
        Joelle:
        please use CODE delimiters (# toggle of the Advanced editor) to share what you typed (as you did not) and what Stata gave you back (as you did) (see also the FAQ on this and other posting-related topics). Thanks.
        As extreme multicollinearity is an issue involving more than one variable, there is no easy fix other than revising your regression model specification (that is, the right-hand side of your regression equation).
        As far as I can see from the output table, there is no statistically significant coefficient. This is not an evil per se, but it does look really strange. Hence, I recommend that you check your data and regression specification.
        Kind regards,
        Carlo
        (Stata 18.0 SE)



        • #5
          Thank you for your reply, and excuse me for the poor presentation of my post; I will be more diligent in posting next time. Actually, this extreme multicollinearity is due to the presence of the variable D, which is binary: it equals 1 if R < 0 and 0 otherwise. So it is quite possible that D*R and R, D*R*MTB and R*MTB, and D*R*END and R*END are highly correlated. Could I, in this case, use the coefficients of the variables that are highly correlated with the ones I need (D*R, D*R*MTB, D*R*END)? Note that I cannot modify the specification of the model, because it is a valid measure used in the research literature.
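
          A minimal sketch of how I could check this correlation hypothesis for the problem year, using 2012 purely as a placeholder for the year in which the variables were omitted and the variable names from the output in #3:

          Code:
          * sketch: pairwise correlations among the suspect pairs in the problem year
          * (2012 is a placeholder for the year with the omitted variables)
          correlate R_w DR RMTB DRMTB REND DREND if year == 2012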



          • #6
            Joelle:
            your intuition makes sense but cannot fix the issue.
            My advice is to reduce the number of interactions, since most of them lack statistical significance (as a general rule, too many interactions make the results of your regression harder to interpret), and go for a more parsimonious model (the literature in your research field can support you in this respect).
            Kind regards,
            Carlo
            (Stata 18.0 SE)

