Dear Statalist,
I’m running a panel regression on a financial data set, using industry measures as control variables (e.g. industry growth rate measured by sales, industry ROA). Some observations have missing values in variables other than the ones I use to calculate the industry measures, so I’m wondering whether I should drop those observations BEFORE or AFTER calculating the industry variables.
If I calculated the industry variables after deleting all cases with missing data, I would lose some information, because some of the dropped cases would otherwise have influenced the level of the industry ROA or growth rate, right?
On the other hand, I would use cases for calculating the industry measures even though they won’t be used in the actual regression analysis. This seems a bit strange to me, too.
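To make the two options concrete, here is a minimal sketch of what I mean. All variable names (firm_id, year, industry, roa, depvar, x1, x2) are hypothetical placeholders, not my actual variables, and the fixed-effects model is only an example specification.

```stata
* Hypothetical variable names throughout; adjust to the actual data set.
xtset firm_id year

* Flag observations that are complete on all regression variables.
* missing() returns 1 if ANY of its arguments is missing.
generate byte insample = !missing(depvar, x1, x2, roa)

* Option A: industry-year mean of ROA from all firms with non-missing roa,
* computed BEFORE any observations are dropped.
bysort industry year: egen ind_roa_all = mean(roa)

* Option B: industry-year mean of ROA from the estimation sample only,
* i.e. as if incomplete observations had been dropped first.
bysort industry year: egen ind_roa_sample = mean(roa) if insample

* Compare the two versions on the observations that enter the regression.
summarize ind_roa_all ind_roa_sample if insample

* Example regression using the Option A control (assumed specification).
xtreg depvar x1 x2 ind_roa_all if insample, fe
```

Flagging the estimation sample instead of dropping observations keeps both versions of the industry measure available for comparison.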
Can you help me with what would be the appropriate procedure? Is there a guideline?
Thank you!
Best, Kathie