I know one method is:
Code:
regress y x1 x2 i.id, vce(robust)
But the problem is that my data set has a lot of ids, and this kills my Stata.
Code:
. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. areg ln_wage age, absorb(idcode) vce(cluster idcode)

Linear regression, absorbing indicators         Number of obs     =     28,510
Absorbed variable: idcode                       No. of categories =      4,710
                                                F(   1,   4709)   =     738.02
                                                Prob > F          =     0.0000
                                                R-squared         =     0.6636
                                                Adj R-squared     =     0.5970
                                                Root MSE          =     0.3035

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006675    27.17   0.000     .0168262    .0194436
       _cons |   1.148214   .0193889    59.22   0.000     1.110202    1.186225
------------------------------------------------------------------------------

. xtset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit

. xtreg ln_wage age, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1026                                         min =          1
     between = 0.0877                                         avg =        6.1
     overall = 0.0774                                         max =         15

                                                F(1,4709)         =     884.05
corr(u_i, Xb)  = 0.0314                         Prob > F          =     0.0000

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006099    29.73   0.000     .0169392    .0193306
       _cons |   1.148214   .0177153    64.81   0.000     1.113483    1.182944
-------------+----------------------------------------------------------------
     sigma_u |  .40635023
     sigma_e |  .30349389
         rho |  .64192015   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. reghdfe ln_wage age, absorb(idcode) vce(cluster idcode)
(dropped 551 singleton observations)
(converged in 1 iterations)

HDFE Linear regression                            Number of obs   =     27,959
Absorbing 1 HDFE group                            F(   1,   4158) =     884.06
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.6540
                                                  Adj R-squared   =     0.5936
                                                  Within R-sq.    =     0.1026
Number of clusters (idcode)  =      4,159         Root MSE        =     0.3035

                             (Std. Err. adjusted for 4,159 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006099    29.73   0.000     .0169391    .0193307
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
-------------+-------------------------------------------------|
      idcode |            0            4159           4159 *   |
---------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation
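The three commands above give the same coefficient on age but not the same standard error: areg reports .0006675, while xtreg and reghdfe report .0006099. The reghdfe footnote points at the likely reason, namely whether the absorbed idcode effects are counted as estimated parameters in the finite-sample correction. The sketch below is a rough arithmetic check of that reading, not the commands' documented formulas. The two correction factors and the scalar names are assumptions introduced here for illustration. All numbers are copied from the output above.

```stata
* Sample sizes copied from the areg / xtreg output above
scalar n_obs    = 28510
scalar n_ids    = 4710
scalar n_slopes = 1

* ASSUMPTION (not stated in the thread): areg counts the absorbed
* indicators as estimated parameters, xtreg counts only slope + constant
scalar q_areg  = (n_obs - 1) / (n_obs - n_slopes - n_ids)
scalar q_xtreg = (n_obs - 1) / (n_obs - n_slopes - 1)

* Ratio of standard errors implied by the two assumed corrections
scalar ratio_implied = sqrt(q_areg / q_xtreg)

* Ratio of the standard errors actually reported for age
scalar ratio_reported = .0006675 / .0006099

display "implied ratio:  " %6.4f ratio_implied
display "reported ratio: " %6.4f ratio_reported

* Singleton check: reghdfe dropped 551 one-observation groups,
* which accounts for its smaller sample and cluster counts
display 28510 - 551
display 4710 - 551
```

Both ratios come out at about 1.094, so the gap between the areg and xtreg standard errors is consistent with a degrees-of-freedom adjustment alone. The last two lines reproduce the 27,959 observations and 4,159 clusters that reghdfe reports after dropping singletons.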
Code:
. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. 
. regress ln_wage age, absorb(idcode) vce(cluster idcode)

Linear regression, absorbing indicators         Number of obs     =     28,510
                                                F(0, 4709)        =          .
                                                Prob > F          =          .
                                                R-squared         =     0.6636
                                                Adj R-squared     =     0.5970
                                                Root MSE          =     .30349

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006675    27.17   0.000     .0168262    .0194436
       _cons |   1.148214   .0193889    59.22   0.000     1.110202    1.186225
------------------------------------------------------------------------------