Interpreting panel data coefficient estimates where the variable doesn't change across observations

Darcy Hill

Join Date: Feb 2019

Posts: 88
#1

02 Mar 2019, 12:01

Hi,

I am looking at the effect of different variables on the recycling rate in England.
I have 311 local authorities in England observed over 20 quarters, and I am running a regression of the recycling rate on income, population density and household size. I have data on income and population density by quarter for each local authority.
However, I was only able to obtain household size by year, and it is not broken down by local authority: it is an average for the whole UK, so I have only 5 values for household size.

How can I interpret the coefficient on household size?
Clyde Schechter

Join Date: Apr 2014

Posts: 30170
#2

02 Mar 2019, 12:45

All you can do is call it what it is: it is the effect of the average UK household size in a given year on local recycling rates. My guess is that this coefficient will prove to be pretty small, and if it doesn't, it likely is serving as a proxy for some other driving factor operating on a UK-wide scale over the time of your study.

You certainly can't interpret it as the effect of household size on that household's use of recycling.
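To make the point concrete, here is a minimal sketch in Python of the data structure described in #1. The household size figures are hypothetical placeholders, not real UK data, and the names (`household_size_by_year`, `rows`) are made up for illustration. It only shows that a yearly UK-wide variable has 5 distinct values across all 311 x 20 observations and no variation within a year, so its coefficient can only pick up differences between years.

```python
from statistics import pvariance

N_AUTHORITIES = 311
N_QUARTERS = 20

# Hypothetical UK-wide average household sizes, one per year (illustrative only).
household_size_by_year = [2.40, 2.39, 2.38, 2.37, 2.36]

# Build the panel: every local authority gets the same value within a year.
rows = []
for la in range(N_AUTHORITIES):
    for quarter in range(N_QUARTERS):
        year = quarter // 4
        rows.append({
            "la": la,
            "quarter": quarter,
            "year": year,
            "hh_size": household_size_by_year[year],
        })

# Number of distinct values the regressor takes over the whole panel.
distinct_values = sorted({r["hh_size"] for r in rows})

# Variance of the regressor within each year (across authorities and quarters).
within_year_variance = [
    pvariance([r["hh_size"] for r in rows if r["year"] == y])
    for y in range(len(household_size_by_year))
]

print("observations:", len(rows))
print("distinct household size values:", len(distinct_values))
print("largest within-year variance:", max(within_year_variance))
```

Because the variable is constant within each year, it is also perfectly collinear with a full set of year indicators, so it could not be estimated separately if the model included those.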
Darcy Hill

Join Date: Feb 2019

Posts: 88
#3

03 Mar 2019, 08:32

Hi Clyde, thank you for your response. I am still a little confused: how could household size be serving as a proxy for another driving factor of recycling rates? What do you mean by this?
Clyde Schechter

Join Date: Apr 2014

Posts: 30170
#4

03 Mar 2019, 08:42

Well, as I don't know much about the factors driving household size, I can't give you a specific explanation for this variable. But the more general principle is that things change over time, and sometimes different things change in parallel over time even though there is no causal connection between them.

Here's an example. Suppose our outcome variable were the number of new cases of HIV infection and we were studying the years 1980 through 1995 in the US. The epidemic grew rapidly over that era. Now, as it happens, due to the baby boom, the number of people over the age of 45 also grew rapidly over that same period. If you included in the model predicting new cases of HIV infection a variable giving the number of people over age 45 in the US, you would find an association. Both were growing over that interval, even though the epidemic itself was primarily spreading in younger people; those over the age of 45 were much less likely to be affected. So in this situation the number of people over age 45 is serving as a proxy for the passage of time. And the passage of time is itself a proxy for the trends in having large numbers of sexual partners and use of intravenous drugs during that same era--the factors which actually drove the epidemic.
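The same mechanism can be sketched with simulated numbers. The snippet below is only an illustration under stated assumptions: both series are made up (they are not HIV or population data), each is a linear time trend plus independent noise, and the series length of 200 periods is chosen just to make the pattern stable. The helper names `pearson` and `detrend` are hypothetical. The two series are strongly correlated in levels because both trend upward, and the correlation largely disappears once the linear trend is removed from each.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def detrend(series):
    """Residuals from an OLS fit of the series on a linear time index."""
    n = len(series)
    t = list(range(n))
    mt, ms = sum(t) / n, sum(series) / n
    slope = (
        sum((a - mt) * (b - ms) for a, b in zip(t, series))
        / sum((a - mt) ** 2 for a in t)
    )
    return [b - (ms + slope * (a - mt)) for a, b in zip(t, series)]


rng = random.Random(42)  # fixed seed so the run is reproducible
T = 200  # assumed number of time periods, for illustration only

# Two causally unrelated series that both happen to grow over time.
series_a = [100 + 1.0 * t + rng.gauss(0, 5) for t in range(T)]
series_b = [50 + 0.5 * t + rng.gauss(0, 5) for t in range(T)]

corr_levels = pearson(series_a, series_b)
corr_detrended = pearson(detrend(series_a), detrend(series_b))

print("correlation in levels:", round(corr_levels, 3))
print("correlation after detrending:", round(corr_detrended, 3))
```

In the recycling setting, UK-wide household size would play the role of one series and anything else that drifted over the same five years would play the role of the other.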
Darcy Hill

Join Date: Feb 2019

Posts: 88
#5

03 Mar 2019, 10:07

This makes sense - thank you!