Hello everyone,
I have a project in which I am trying to understand how the probability of someone becoming an entrepreneur in an industry is related to a series of variables. To do so, I am using a fixed-effects regression and have built a few models to help interpret the results.
One of the variables I am interested in analyzing is the median age of the industries. Some of my models include the median age and a collection of other variables; others include the median age, its squared term, and the same collection of other variables.
The code itself is as follows:
Model1:
xtreg change_to_employer age_median log_nemp_median gender numb_firms higher_education i.year high_tech low_tech KIS Other, fe cluster(caem2)
and
Model2:
xtreg change_to_employer c.age_median##c.age_median log_nemp_median gender numb_firms higher_education i.year high_tech low_tech KIS Other , fe cluster(caem2)
Model 1:
Code:
xtreg change_to_empregador age_median log_nemp_median gender numb_firms_div1000 higher_education vn_per_employee_median i.year high_tech low_tech KIS Other, fe cluster(caem2)
Model 2:
Code:
xtreg change_to_empregador c.age_median##c.age_median log_nemp_median gender numb_firms_div1000 higher_education vn_per_employee_median i.year high_tech low_tech KIS Other, fe cluster(caem2)
What I am having trouble interpreting is that the coefficient on age_median is not significant in Model1, yet the coefficients on both age_median and c.age_median#c.age_median are significant in Model2, as shown in:
Model1:
Code:
Fixed-effects (within) regression               Number of obs     =        889
Group variable: caem2                           Number of groups  =         77

R-sq:                                           Obs per group:
     within  = 0.1832                                         min =          3
     between = 0.0859                                         avg =       11.5
     overall = 0.0853                                         max =         12

                                                F(24,76)          =       5.26
corr(u_i, Xb)  = -0.2660                        Prob > F          =     0.0000

                                             (Std. Err. adjusted for 77 clusters in caem2)
------------------------------------------------------------------------------------------
                         |               Robust
Change_to_empregador_f~e |      Coef.   Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
              age_median |   -.001125   .0028954    -0.39   0.699    -.0068918    .0046417
         log_nemp_median |  -.0015375   .0168405    -0.09   0.927    -.0350782    .0320033
                  gender |   .0010117   .0016409     0.62   0.539    -.0022565    .0042798
      numb_firms_div1000 |  -.0086784   .0050413    -1.72   0.089     -.018719    .0013621
        higher_education |   .0009512   .0012174     0.78   0.437    -.0014735    .0033758
  vn_per_employee_median |  -.0005548    .000214    -2.59   0.011     -.000981   -.0001286
Model2:
Code:
Fixed-effects (within) regression               Number of obs     =        889
Group variable: caem2                           Number of groups  =         77

R-sq:                                           Obs per group:
     within  = 0.3109                                         min =          3
     between = 0.1165                                         avg =       11.5
     overall = 0.1408                                         max =         12

                                                F(27,76)          =       7.08
corr(u_i, Xb)  = -0.1859                        Prob > F          =     0.0000

                                                                          (Std. Err. adjusted for 77 clusters in caem2)
-----------------------------------------------------------------------------------------------------------------------
                                                      |               Robust
                                 change_to_empregador |      Coef.   Std. Err.        t   P>|t|     [95% Conf. Interval]
------------------------------------------------------+----------------------------------------------------------------
                                           age_median |    .136472   .0677635     2.01   0.048     .0015094    .2714347
                                                      |
                            c.age_median#c.age_median |   -.001687   .0008374    -2.01   0.047    -.0033548   -.0000191
                                                      |
                                      log_nemp_median |  -.0891614   .0408428    -2.18   0.032    -.1705069   -.0078159
                                               gender |  -.0019755   .0036136    -0.55   0.586    -.0091726    .0052216
                                   numb_firms_div1000 |  -.0235788   .0107627    -2.19   0.032    -.0450145    -.002143
                                     higher_education |   .0012232   .0029379     0.42   0.678    -.0046282    .0070745
                               vn_per_employee_median |  -.0013087   .0004205    -3.11   0.003    -.0021462   -.0004713
                                                      |
                                                 year |
How is it possible that a variable is not significant on its own, but becomes significant when it is entered together with its quadratic term? Can I then say that the median age of the industries has a significant impact on the probability of transition into entrepreneurship?
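In case it helps frame the question, here is a minimal sketch of the checks I could run right after estimating Model2 to describe the shape of the age effect. It assumes the quadratic specification is appropriate, and the ages passed to margins (30 to 50 in steps of 5) are illustrative values I picked, not values taken from my data.

```stata
* Run immediately after the Model2 xtreg command, so its estimates are in memory.

* Joint test that the linear and squared age terms are both zero
test age_median c.age_median#c.age_median

* Age at which the fitted quadratic turns: -b1 / (2*b2), with a delta-method CI
nlcom -_b[age_median] / (2 * _b[c.age_median#c.age_median])

* Marginal effect of age_median at a few illustrative ages (assumed range)
margins, dydx(age_median) at(age_median = (30(5)50))
```

With the Model2 coefficients reported above, the implied turning point is roughly .136472 / (2 * .001687), or about 40.4, so the fitted relationship rises with median age up to around 40 and declines after that. If that value lies inside the observed range of age_median, a single linear term would be averaging a rising and a falling segment, which might be why the Model1 coefficient is close to zero.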
Thank you very much,
Rui
