Hi,
I am working on a large panel dataset covering 30 countries over a 25-year period, with several macroeconomic variables (e.g., GDP, Gini, GDP per capita, average educational attainment, life expectancy). My panel is unbalanced because several variables have missing values for some countries and years. Following advice received on this forum and my own research on the subject, I have interpolated the missing values using three alternative methods: ipolate (combined with the epolate option), pchipolate and cipolate. I now have a balanced panel, which was the whole purpose of these manipulations. However, I would like to know whether there are any recommended ways to check which interpolation method best fits my data. I am aware that there is sometimes underlying theory that helps identify and select the best method; in this case, however, some variables have many missing values, particularly middle-class size, and there does not seem to be any theory on how it should be interpolated. To give a bit of context, the next step in my analysis is to run panel regressions. Thank you in advance for the help!
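To make the question concrete, one common way to compare interpolation methods is a hold-out check: hide values that were actually observed, re-interpolate them, and compare the error across methods. The sketch below is only an illustration in Python rather than Stata. The function names and the Gini series are hypothetical, and the simple linear interpolator merely stands in for what ipolate does; a PCHIP or cubic interpolator with the same signature could be passed in its place.

```python
from statistics import mean


def linear_interpolate(xs, ys, x):
    """Linearly interpolate at x from known points (xs sorted ascending)."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the observed range")


def leave_one_out_rmse(years, values, interpolate):
    """Hide each interior observed point, re-interpolate it, return the RMSE."""
    observed = [(t, v) for t, v in zip(years, values) if v is not None]
    squared_errors = []
    # Endpoints are skipped: recovering them would be extrapolation.
    for i in range(1, len(observed) - 1):
        kept = observed[:i] + observed[i + 1:]
        xs = [t for t, _ in kept]
        ys = [v for _, v in kept]
        t_held, v_held = observed[i]
        squared_errors.append((interpolate(xs, ys, t_held) - v_held) ** 2)
    return mean(squared_errors) ** 0.5


# Hypothetical series for one country; None marks a missing year.
years = [2000, 2001, 2002, 2003, 2004, 2005]
gini = [30.0, None, 32.0, 35.0, None, 36.0]

rmse_linear = leave_one_out_rmse(years, gini, linear_interpolate)
print(f"Leave-one-out RMSE (linear): {rmse_linear:.4f}")
```

Running the same check per country and per variable, once for each method, would give comparable error figures. It only measures how well a method recovers points that were observed, so it says little about long gaps or extrapolated endpoints.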