Hi, I have a panel dataset in which the main outcome variable is house prices, and I want to include population as one of the control variables. The population level itself is not informative, since its magnitude varies widely across regions. I tried including population as the percentage change since the previous year (my time period is one year). However, this is less than ideal because house prices can be slow to respond to other variables. Furthermore, most areas show a secular trend of increasing population, even if growth is negative in some years, so it seems that the year-over-year population change would give a worse fit than the population change relative to a base year. The approach I think is best is to include the population change relative to my base year, which is the year before the main sample period, and to include several lags of it. My population variable is therefore:
(Population_{i,t} - Population_{i,base_year}) / Population_{i,base_year}
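To make the construction concrete, here is a minimal sketch of how I would build this variable and its lags in pandas. The column names (region, year, population), the function name add_population_growth and the toy numbers are hypothetical and only for illustration. It assumes one row per region and year with no gaps, so that shifting by k rows corresponds to a lag of k years.

```python
import pandas as pd


def add_population_growth(df, base_year, n_lags=2):
    """Add cumulative population growth since base_year, plus its lags.

    Assumes (hypothetical) columns 'region', 'year', 'population' and
    exactly one row per region-year with consecutive years.
    """
    df = df.sort_values(["region", "year"]).copy()

    # Base-year population for each region, mapped back onto every row.
    base = df.loc[df["year"] == base_year].set_index("region")["population"]
    df["pop_base"] = df["region"].map(base)

    # (Population_{i,t} - Population_{i,base_year}) / Population_{i,base_year}
    df["pop_growth"] = (df["population"] - df["pop_base"]) / df["pop_base"]

    # Lags are taken within each region so values never leak across regions.
    for k in range(1, n_lags + 1):
        df[f"pop_growth_lag{k}"] = df.groupby("region")["pop_growth"].shift(k)
    return df


# Toy panel: two regions of very different size, base year 2000.
panel = pd.DataFrame({
    "region": ["A"] * 4 + ["B"] * 4,
    "year": [2000, 2001, 2002, 2003] * 2,
    "population": [100.0, 110.0, 105.0, 120.0,
                   1000.0, 1010.0, 1030.0, 1020.0],
})
result = add_population_growth(panel, base_year=2000, n_lags=1)
print(result)
```

Calling the function again with a different base_year rescales every value of pop_growth for each region, which is the dependence on the sample window I am asking about below.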
Will this approach be problematic? One potential issue I can foresee is that the interpretation of this variable changes along with the window of the sample period.
Thanks