Testing for autocorrelation in panel without time variable (FE/FD models)

Ahmad Ghaemi

Join Date: Jul 2020

Posts: 7
#1


07 Jul 2020, 06:59

Hi everyone,

I have cross-sectional data with 3 observations per student for 3 different subjects, and was planning to estimate some teacher effects with an FD/FE model.

It would be ideal if every student had three teachers, one for each subject. Unfortunately, only about 27% do and the other 83% of the 2359 students have two teachers - with one of them teaching 2 subjects.

I am guessing there will be serial correlation if I use an xtreg, fe model, since the unobserved teacher effects will be almost the same for each pair of observations that share a teacher across 2 different subjects. So perhaps a customized FD model is a better choice. I also think I would have to weight the results with iweights if I use FE, giving each observation in a same-teacher pair a weight of one half, since their demeaned teacher values are essentially the same teacher effect measured twice. Since I'm unsure how to go about this, or whether it is even a good solution, FD seems like the better choice (it would eliminate the teacher effects completely between observations with the same teacher, and only measure the subject effects in those cases).

Still, since xtreg is so convenient to use, I thought I would convince myself first by testing for serial correlation and heteroskedasticity. Since my data are actually cross-sectional, or a panel without a time variable, I don't know which tests to use. What would be appropriate to look at in this case? And am I even on the right track here, or am I missing something important?
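The differencing argument in the paragraph above can be written out as follows. This is only a sketch under an assumed additive model; the symbols $\alpha_i$ (student effect), $\theta_{j}$ (teacher effect), $\delta_s$ (subject effect) and $u_{is}$ (error) are notation introduced here for illustration, not taken from the data.

```latex
\begin{align*}
% Assumed model: student i, subject s, taught by teacher j(i,s)
y_{is} &= \alpha_i + \theta_{j(i,s)} + \delta_s + u_{is} \\
% Difference of two subjects s, s' taught by the SAME teacher:
y_{is} - y_{is'} &= (\delta_s - \delta_{s'}) + (u_{is} - u_{is'}) \\
% Difference of two subjects taught by DIFFERENT teachers j and j':
y_{is} - y_{is'} &= (\theta_{j} - \theta_{j'}) + (\delta_s - \delta_{s'}) + (u_{is} - u_{is'})
\end{align*}
```

In both cases the student effect $\alpha_i$ drops out; in the same-teacher case the teacher effect drops out as well, leaving only the subject contrast.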

Code:

IDSUBJ  IDSTUD  IDTEACH  NTEACH  IDCLASS
     1   20110      202       3      201
     4   20111      201       3      201
     2   20111      203       3      201
     1   20111      202       3      201
     1   20112      202       3      201
     4   20112      201       3      201
     2   20112      203       3      201
     2   30102      302       2      301
     4   30102      301       2      301
     1   30102      301       2      301
     2   30103      302       2      301
     1   30103      301       2      301
     4   30103      301       2      301
     4   30104      301       2      301
     1   30104      301       2      301
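The shares of students with two versus three teachers can be checked directly from data laid out like this extract. A minimal sketch, assuming NTEACH holds the number of distinct teachers per student (as the extract suggests); `onerow` is a hypothetical helper variable name:

```stata
* Flag exactly one row per student so each student is counted once
egen byte onerow = tag(IDSTUD)

* Distribution of the number of distinct teachers per student
tabulate NTEACH if onerow
```

The percentages in the tabulate output are then per student rather than per observation.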

Since I didn't know about the crossposting policy last time, I should mention that I posted a similar but not identical question on Stack Exchange.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

08 Jul 2020, 12:53

I don't know what answer you got, but I suspect you are making this harder than it needs to be.

You don't say at what level you are doing your fixed effects. I would have thought the fixed effects would be at the student level to take out differences in ability. Estimating a model that attempts to look at how teacher ability influences student performance when you take out stable effects of teacher ability seems odd.

You can use robust standard errors with xtreg. Many people do this automatically. While you can use xtgls to model much more complicated error structures, it does not include fixed effects or random effects, so it estimates a very different model than xtreg. I don't think conventional tests for serial correlation would be correct, since they usually assume a particular direction of time, which you don't have.

I'm not sure that you need to weight for a teacher teaching more than one class to a student. Each class is getting a legitimate error term.
Ahmad Ghaemi

Join Date: Jul 2020

Posts: 7
#3

08 Jul 2020, 14:12

Hi,

Thanks a lot for the answer. The fixed effects model is on student level, yes.

I could use robust standard errors, but I guess my question would also be how to know whether that is better than first differences or not, so that I could argue for it. Would FD give more efficient estimators?

The reason for weighting would be that otherwise there will be two observations for each teacher with two subjects as opposed to one observation for the single-subject teachers, so their traits would weigh more heavily than the latters' in that case. Wouldn't that affect the inference? Unless their characteristics have roughly the same distribution as the single subject ones maybe, but I'm unsure if I can make that assumption with this sample size. What is your take on that?

Thanks again for the reply.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2169
#4

08 Jul 2020, 15:04

I don't think I've ever seen FD used when there is no time ordering to the data, and I'm not sure why one would. The idea of "serial correlation" is not well defined when you have multiple test outcomes for each student. You have nice variation in teacher assignment by subject, so the most convincing method is using student fixed effects and including the teacher dummies. As Phil mentioned, cluster your standard errors.

Code:

xtset studentid
xtreg score i.teacher course_controls, fe vce(cluster studentid)

JW
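For readers wondering how the "robust" and "cluster" suggestions in this thread relate: with xtreg, fe, specifying vce(robust) gives standard errors clustered on the panel variable, so the two should agree. A minimal sketch using the variable names from the posted extract; `score` is an assumed outcome variable, and `fe_cluster` and `fe_robust` are hypothetical names for the stored results:

```stata
* Declare the panel unit only; no time variable is required for xtreg, fe
xtset IDSTUD

* Student fixed effects with teacher and subject dummies, SEs clustered by student
xtreg score i.IDTEACH i.IDSUBJ, fe vce(cluster IDSTUD)
estimates store fe_cluster

* Same model with vce(robust); for xtreg, fe this clusters on the panel variable
xtreg score i.IDTEACH i.IDSUBJ, fe vce(robust)
estimates store fe_robust

* Put coefficients and standard errors side by side
estimates table fe_cluster fe_robust, se
```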
Ahmad Ghaemi

Join Date: Jul 2020

Posts: 7
#5

08 Jul 2020, 16:49

Thank you both for taking the time. I will follow your advice. But bear with me while I'm trying to convince myself on this as an undergrad.

I'm using dummies for Likert-style survey answers for the different characteristics of the teachers, except for a few continuous ones like 'years of teaching experience'. My worries were just centered on the double-subject teachers. For a majority of the students in the dataset, two out of three observations will be identical in every independent variable except one, the test subject dummies. The fixed effects model will treat these as two completely different observations for all those independent variables, and the mean subtracted in the demeaning will lean toward the double-subject teachers. Won't that affect the inference?

If yes, my thought was that this would be avoided with FD since it essentially eliminates one of the observations.
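The arithmetic behind this concern can be made concrete. A small worked sketch, assuming a single teacher characteristic $x$ that equals $a$ in the two subjects taught by the same teacher and $b$ in the third subject; $a$, $b$ and $\bar{x}_i$ are notation introduced here for illustration only:

```latex
\begin{align*}
% Student-level mean of the teacher characteristic over the three subjects
\bar{x}_i &= \frac{2a + b}{3} \\
% Demeaned value for each of the two same-teacher observations
a - \bar{x}_i &= \frac{a - b}{3} \\
% Demeaned value for the single-teacher observation
b - \bar{x}_i &= -\frac{2(a - b)}{3} \\
% First difference between the two same-teacher observations
a - a &= 0
\end{align*}
```

The student mean does sit closer to $a$, and the two same-teacher rows carry the same demeaned value, but all three demeaned values are multiples of the single contrast $a - b$. The first difference within the same-teacher pair is zero, so that pair contributes no variation in $x$ at all.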

Kind regards,
Ahmad