Differences between xtreg, reghdfe, and reg using complex survey data?

Max Lubell

Join Date: Jul 2023

Posts: 2
#1


26 Jul 2023, 08:24

Apologies if this is more of a stats question, but hoping to find some support for differences in standard errors when specifying a fixed effects model with complex survey data.

I am running an individual-level fixed effects regression model using data from one of the NCES longitudinal cohort surveys (HSLS:09). The standard errors need to account for the random sampling of students clustered within 944 schools. The data documentation provides instructions on survey setting the data using Taylor series linearization with the code below. I'll note the PSU variable is three levels, the STRAT_ID variable is 450 levels, and the data is mi set.

Code:

mi svyset, clear
mi svyset PSU [pweight = Weight_Variable], strata(STRAT_ID) vce(linear) singleunit(centered)

As has been discussed on the forum, the svy command does not support xtreg or reghdfe. I have been using the advice posted here as a workaround. I tried to specify the model using the three strategies listed below to see differences in the output.

Code:

* Strategy 1: reghdfe
mi estimate, post cmdok: reghdfe DV IV1 IV2 IV3 [pweight = Weight_Variable], absorb(STU_ID) vce(cluster STRAT_ID)

* Strategy 2: xtreg
mi xtset STU_ID Year
mi estimate, post: xtreg DV IV1 IV2 IV3 [pweight = Weight_Variable], fe vce(cluster STRAT_ID)

* Strategy 3: reg+absorb
mi estimate, post: svy: reg DV IV1 IV2 IV3, absorb(STU_ID)

All three strategies produce practically the same coefficients (give or take 0.0001). The issue is that I am getting drastically different standard errors. xtreg and reghdfe give practically the same standard errors (give or take about 0.001), but reg+absorb gives standard errors that are way off from strategies 1 and 2 (about 0.05 higher). The F-test also goes from 0 in the first two models to 0.6 with reg+absorb. I accept that the outputs are not going to be exactly the same, but this feels like a pretty drastic difference. It's troubling to me that the strategy that most correctly specifies the survey design is giving me such distinct output.

Any thoughts on what is accounting for the differences in the standard errors using these three approaches? Is it something inherent about how xtreg/reghdfe function compared to reg+absorb()? Does it come down to specifying the Taylor series linearization vs. the cluster robust specifications?

Each output is also giving me a different number of observations. I figured this was because the three commands treat missing values in the dependent variables differently, but mentioning it in case I am missing a blatantly obvious clue as to the differences between these functions.

This is my first time posting to the forum! Sorry if I messed up any norms (still learning!).
Andrew Musau

Join Date: Oct 2014

Posts: 10281
#2

26 Jul 2023, 09:36

As was stated in the linked thread, clustering on the PSU variable and using -pweights()- is equivalent to -svy- only in the absence of stratification. This is not the case for you as you have stratification. In this case, the only viable estimator is svy: regress.
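
To illustrate the point, here is a minimal sketch on simulated data. Every variable name in it (psu, stratum, w, x, y) is made up for the illustration and has nothing to do with HSLS:09. It fits the same weighted regression three ways: with -vce(cluster psu)-, with -svy- and no strata, and with -svy- and strata. The first two should agree up to a small finite-sample degrees-of-freedom factor, while the stratified -svy- standard error is computed differently and generally will not match.

```stata
clear
set seed 12345

* Simulated design (an assumption for illustration): 10 strata, 4 PSUs each
set obs 40
gen psu = _n
gen stratum = ceil(_n/4)
gen double e_str = rnormal()
gen double x_str = rnormal()
bysort stratum (psu): replace e_str = e_str[1]   // stratum-level error component
bysort stratum (psu): replace x_str = x_str[1]   // stratum-level shift in x
gen double e_psu = rnormal()                     // PSU-level error component

* 25 observations per PSU
expand 25
gen double w = 1 + 2*runiform()                  // hypothetical sampling weight
gen double x = x_str + rnormal()
gen double y = 1 + 0.5*x + e_str + e_psu + rnormal()

* (a) pweights with cluster-robust standard errors
regress y x [pweight = w], vce(cluster psu)
scalar b_cluster  = _b[x]
scalar se_cluster = _se[x]

* (b) svy linearization, PSUs only, no stratification
svyset psu [pweight = w]
svy: regress y x
scalar b_nostrata  = _b[x]
scalar se_nostrata = _se[x]

* (c) svy linearization with stratification
svyset psu [pweight = w], strata(stratum)
svy: regress y x
scalar se_strata = _se[x]

display "cluster-robust SE:      " %9.6f se_cluster
display "svy SE, no strata:      " %9.6f se_nostrata
display "svy SE, with strata:    " %9.6f se_strata
```

The point estimates are the same in all three fits; only the variance calculation changes once strata() enters the design.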
Max Lubell

Join Date: Jul 2023

Posts: 2
#3

26 Jul 2023, 15:27

Thanks, Andrew. Aside from the issues with specifying standard error / stratification, is it safe to assume that reg+absorb() is doing practically the same thing as xtreg/reghdfe?
Andrew Musau

Join Date: Oct 2014

Posts: 10281
#4

26 Jul 2023, 15:58

For regress with -absorb()-, refer to

Code:

help areg

There are some differences in terms of focus and in what kinds of models each of these commands can handle, so do read the documentation. But if you have panel data, -xtreg, fe- and absorbing the panel identifier in each of the other two estimators yield equivalent results. So in that sense, they do the same thing.
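
A quick way to check this for yourself is a small simulated panel. The sketch below uses made-up names (id, t, x, y) and compares -xtreg, fe- with -areg, absorb()-. -reghdfe- is community-contributed, so its line is commented out in case it is not installed. Note that once you add -vce(cluster id)-, -areg- applies a different degrees-of-freedom adjustment than -xtreg, fe-, so clustered standard errors can differ slightly even though the coefficients agree.

```stata
clear
set seed 2023

* Simulated panel (an assumption for illustration): 200 units, 5 periods each
set obs 200
gen id = _n
gen double a_i = rnormal()            // unit fixed effect
expand 5
bysort id: gen t = _n
gen double x = a_i + rnormal()        // regressor correlated with the fixed effect
gen double y = 2*x + a_i + rnormal()

* Within estimator via xtreg
xtset id t
xtreg y x, fe
scalar b_xt  = _b[x]
scalar se_xt = _se[x]

* Same model with the panel identifier absorbed
areg y x, absorb(id)
scalar b_areg  = _b[x]
scalar se_areg = _se[x]

* Community-contributed alternative (requires: ssc install reghdfe)
* reghdfe y x, absorb(id)

display "xtreg, fe:  b = " %9.6f b_xt   "   se = " %9.6f se_xt
display "areg:       b = " %9.6f b_areg "   se = " %9.6f se_areg
```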