Hi everyone,
I am a master's student writing my thesis on variation in Danish municipalities' utilization of solar panel potential on agricultural land (2018–2025). I have read the 2019 thread on OLS vs. Tobit for solar energy installation data and found it very helpful, but my setup differs in some important ways and I would appreciate further input.
My panel covers 73 municipalities over 8 years (584 observations). The dependent variable is a utilization rate (installed MW / potential based on 1% of agricultural area). Of the 584 observations, 313 (about 54%) are zero, and 25 municipalities have zero utilization in all 8 years. The variable is bounded below at zero by construction, since utilization cannot be negative, which is why I have been treating it as left-censored at zero.
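To make the zero structure explicit, here is a minimal sketch of the arithmetic behind those counts. It uses only the numbers stated above; the variable names are my own and purely illustrative.

```python
# Counts taken from the description above; names are illustrative only.
n_municipalities = 73
n_years = 8
n_obs = n_municipalities * n_years            # balanced panel

n_zero_obs = 313                              # municipality-years with zero utilization
n_all_zero_munis = 25                         # municipalities that are zero in every year

zero_share = n_zero_obs / n_obs                           # share of zero observations
all_zero_obs = n_all_zero_munis * n_years                 # zeros from always-zero municipalities
other_zero_obs = n_zero_obs - all_zero_obs                # zeros from municipalities that are positive at some point
all_zero_muni_share = n_all_zero_munis / n_municipalities # share of municipalities lost under Poisson FE

print(f"observations: {n_obs}")
print(f"zero share: {zero_share:.3f}")
print(f"zeros from always-zero municipalities: {all_zero_obs}")
print(f"zeros from other municipalities: {other_zero_obs}")
print(f"always-zero municipality share: {all_zero_muni_share:.3f}")
```

So roughly two thirds of the zeros come from municipalities that never install anything, and the remainder come from municipalities that do at some point.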
I have tested several estimators:
- Pooled OLS with clustered standard errors: several significant results across covariates
- Pooled Tobit (ll=0) with clustered standard errors: similar results, consistent direction and magnitude
- Poisson FE (xtpoisson, fe): drops 25 municipalities with all-zero outcomes
- PPML with absorbed fixed effects (ppmlhdfe): drops 194 observations, leaving only 47 of 73 municipalities. Results almost entirely insignificant.
An additional complication is that many of my key independent variables are time-stable or nearly time-stable: local political party affiliation, DK2020 climate plan membership, and the socioeconomic index. The estimated rho is around 0.70-0.76 (depending on specification), meaning that most of the variation is between municipalities rather than within them. Fixed effects, whether linear or Poisson, therefore absorb most of what I am trying to explain. The F-test confirms significant municipal heterogeneity (F = 3.51, p < 0.0001), and the Hausman test favors FE over RE (chi2 = 44.42, p = 0.0008), but FE is substantively problematic given the time-stable covariates.
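To illustrate what I mean by between versus within variation, here is a rough sketch that splits the total sum of squares of a balanced panel variable into the two parts. The toy data and the function name are hypothetical, not my actual data, and this simple decomposition is not the same calculation as the rho reported after a panel regression (which is based on estimated variance components); it only shows the idea.

```python
from statistics import mean

def between_within_shares(panel):
    """Split total sum of squares into between-unit and within-unit shares.

    panel: dict mapping a unit id to its list of yearly values (balanced panel).
    """
    all_values = [v for series in panel.values() for v in series]
    grand_mean = mean(all_values)
    # Between: deviation of each unit's mean from the grand mean, weighted by years.
    between_ss = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in panel.values())
    # Within: deviation of each yearly value from its own unit's mean.
    within_ss = sum((v - mean(s)) ** 2 for s in panel.values() for v in s)
    total_ss = between_ss + within_ss
    return between_ss / total_ss, within_ss / total_ss

# Hypothetical toy panel: three units, three years each (not real data).
toy = {
    "A": [0.0, 0.0, 0.0],   # always zero, contributes no within variation
    "B": [0.2, 0.3, 0.4],
    "C": [0.9, 1.0, 1.1],
}
between_share, within_share = between_within_shares(toy)
print(f"between share: {between_share:.3f}, within share: {within_share:.3f}")
```

A unit fixed effect removes the between part entirely, which is why covariates that barely change over time have almost nothing left to explain in an FE model.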
My research question concerns what explains variation in the utilization rate across municipalities, so between-variation is central to the analysis. Losing a third of the municipalities (25 of 73) in Poisson FE removes precisely the variation I need.
I am not attached to any particular model. I simply want to use whatever is methodologically defensible. My questions are:
1. Is pooled OLS with clustered standard errors defensible as the primary estimator here, given the high share of zeros and the time-stable covariates?
2. Is pooled Tobit a meaningful robustness check, or is the left-censoring argument too weak given that zero is a genuine outcome rather than a censored value?
3. Is there an alternative estimator I may have overlooked?
Thank you in advance for any input.
Best regards,
Elias
Master's student, Politics and Administration
Aalborg University, Denmark
