Hello,
I have a panel dataset with roughly 5,000 observations. However, 3 of the 8 independent variables have some missing values. Each of these variables is missing around 100 to 200 observations, and the gaps are sparsely distributed across the entire dataset, with many non-missing values before and after each missing data point. I'm reading up on the preferable way to handle these. Linear interpolation seems an option, but I'm having difficulty seeing what other variables the independent variables are a function of; multiple imputation seems like overkill for this small amount of missing data; and mean replacement seems to be commonly advised against. Do you have any suggestions for tackling this problem?
To be clear, this is an economic dataset: I am studying the volatilities of stocks, with the independent variables entering the conditional variance equation of a GARCH model.
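To make the interpolation option concrete, here is a minimal sketch in Python with pandas of what I have in mind. The column names `ticker`, `date` and `x1` and the helper `interpolate_within_unit` are made up for illustration and are not from my actual dataset. The sketch assumes observations within each stock are equally spaced in time, and it only fills gaps that have observed values on both sides.

```python
import numpy as np
import pandas as pd


def interpolate_within_unit(df, unit_col, time_col, value_cols):
    """Linearly interpolate interior gaps separately for each panel unit.

    Hypothetical helper: values are never interpolated across different
    units, and leading/trailing gaps within a unit are left as NaN.
    """
    out = df.sort_values([unit_col, time_col]).copy()
    for col in value_cols:
        out[col] = out.groupby(unit_col)[col].transform(
            # method="linear" treats rows as equally spaced;
            # limit_area="inside" avoids extrapolating at the edges.
            lambda s: s.interpolate(method="linear", limit_area="inside")
        )
    return out


# Toy panel with two stocks and one missing value each (illustrative data only).
panel = pd.DataFrame(
    {
        "ticker": ["A"] * 4 + ["B"] * 4,
        "date": list(pd.date_range("2020-01-01", periods=4)) * 2,
        "x1": [1.0, np.nan, 3.0, 4.0, 10.0, 20.0, np.nan, 40.0],
    }
)

filled = interpolate_within_unit(panel, "ticker", "date", ["x1"])
print(filled)
```

This would be applied to the regressors before estimating the GARCH model. Whether filling gaps this way is appropriate for variables in the conditional variance is exactly what I am unsure about.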
Thank you for your time and any help you can offer.