
  • Fixed effects regression drops a variable of interest

    Right now I have a standard OLS model with one year of data, and I'm interested in the coefficient for charter schools. My dependent variable is test scores. My charter variable equals 1 for charter schools and 0 for public schools. I control for other variables like demographics, funding, etc. I thought I could make things more interesting with panel data, but I am getting pretty stuck.

    First off, I'm focused on my independent variable for charter schools, and since it's time-invariant, it gets dropped from the fixed-effects model. This is a problem because it's my most important independent variable. I've tried a few things:

    encode SchoolName, gen(SchoolName_2)

    xtset SchoolName_2 Year, yearly


    Then I tried these different regressions:

    xtreg Y_dep X_ind i.Year, re
    (I don't think random effects is appropriate; I ran the Hausman test.)


    xtreg Y_dep X_ind i.Year, fe


    xtreg Y_dep X_ind i.Year, fe vce(cluster SchoolName_2)

    xtreg Y_dep X_ind i.Year, fe vce(robust)


    reg Y_dep X_ind i.Year i.SchoolName_2, vce(cluster SchoolName_2)




    Not sure if running a regular regression and just clustering errors by School ID (SchoolName) would be the best route, or if it even leads to anything that makes sense. Is it possible to run a regression using panel data when my binary independent variable Charter School is my variable of interest? I have data for 4 years, about 1200 schools.

    Does it make sense to try and find a coefficient for each individual school? I noticed a lot of independent variables also become insignificant with my panel data, compared to my single year OLS where most variables are significant. I'm aware there are a few limitations with my model, but I feel like there might be a serious issue with it that I'm not grasping.

    Thanks in advance for your advice.

  • #2
    Well, as you yourself observed, your variable is being dropped because it is time-invariant within schools. There's no getting around linear algebra: you can't estimate this in a fixed effects model. So whether Mr. Hausman likes it or not, you will have to abandon fixed effects if you want to estimate this particular parameter.

    Probably your best bet is to use a hybrid model. -search xthybrid- will get you the link to install it and read the documentation.
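
    To make the idea concrete, here is a rough by-hand sketch of what a hybrid (within-between) model does, using only built-in commands. This is not xthybrid's own code or syntax; check its help file for that. The variable name charter is an assumption (the original post does not name the indicator), and X_ind stands in for a single time-varying control, so the egen/gen lines would be repeated for each control.

    ```stata
    * Hybrid (within-between) model built by hand: a sketch, not xthybrid itself.
    * Assumed names: charter = 0/1 charter indicator (hypothetical),
    * X_ind = one time-varying control, SchoolName_2 = encoded school ID.
    xtset SchoolName_2 Year

    * Between component: each school's mean of the control across its years
    bysort SchoolName_2: egen X_ind_mean = mean(X_ind)

    * Within component: each year's deviation from the school's own mean
    gen X_ind_dev = X_ind - X_ind_mean

    * Random-intercept model. The time-invariant charter indicator can be
    * estimated here because the model is not demeaned by school.
    xtreg Y_dep i.charter X_ind_dev X_ind_mean i.Year, re vce(cluster SchoolName_2)
    ```

    In a balanced panel with no missing values, the coefficient on X_ind_dev should be close to the one from -xtreg, fe-, while the coefficient on X_ind_mean picks up the between-school association. The charter coefficient is still only as credible as the assumption that the school-level random intercept is unrelated to charter status.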

    As for
    "I noticed a lot of independent variables also become insignificant with my panel data, compared to my single year OLS where most variables are significant."
    1. Start now banishing the term "statistically significant" from your vocabulary. See the American Statistical Association's new position on this at https://www.tandfonline.com/doi/full...5.2019.1583913, or, for a shorter pep talk on the same topic, see https://www.nature.com/articles/d41586-019-00857-9.
    2. Even if you insist on holding on to the concept of statistical significance, bear in mind that the difference between statistically significant and not statistically significant is, itself, not statistically significant. You need to look at the actual change in the coefficients to make any reasonable judgment about this; the p-values are not helpful for this purpose.
    3. Putting 1 and 2 aside, when you estimate a fixed-effects model, you are estimating the within-school effects of your predictor variables. With OLS or RE you are estimating a mixture of within- and between-school effects. So it is not meaningful to compare the FE coefficients with the coefficients from OLS or RE: they are estimating different things, and there is no reason to expect them to be similar, or even to have the same sign.
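
    One way to see point 3 in your own data is to put the within, between, and pooled estimates of the same coefficient side by side. The following is a minimal sketch, assuming the variable names from the original post (Y_dep, X_ind, Year, SchoolName_2); the stored-estimate names are arbitrary.

    ```stata
    * Compare the coefficient on X_ind across three sources of variation.
    xtset SchoolName_2 Year

    * Within (FE) estimator: only year-to-year variation inside each school
    xtreg Y_dep X_ind i.Year, fe vce(cluster SchoolName_2)
    estimates store within

    * Between estimator: regression on school-level means
    xtreg Y_dep X_ind, be
    estimates store between

    * Pooled OLS: mixes within- and between-school variation
    regress Y_dep X_ind i.Year, vce(cluster SchoolName_2)
    estimates store pooled

    * Coefficients and standard errors for X_ind, side by side
    estimates table within between pooled, keep(X_ind) b se
    ```

    If the within and between columns differ a lot, that is the reason the FE results look unlike the single-year OLS, and it says nothing by itself about which model is "wrong".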
