Dear members,
I have a binary dependent variable and want to estimate either a logistic regression or a linear probability model (LPM). My key explanatory variable (a measure of exposure to specific media content) has many zeros, some medium values, and a few extremely high values. See the summary statistics of the explanatory variable below as an example.
As you can see, the maximum value is more than 20 times the standard deviation of the variable. Because there are many zeros, I cannot log-transform the variable. Do you have any ideas on how far this could be a problem? Should I apply a transformation to the variable? Does it affect the choice between the LPM and the logistic model? For example, is logistic regression more robust to skewed distributions?
Code:
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs              21,169
25%            0              0       Sum of Wgt.      21,169

50%            0                      Mean           .3175061
                        Largest       Std. Dev.      1.190602
75%            0             21
90%     .8571429             23       Variance       1.417533
95%            2       25.28572       Skewness       8.337951
99%     5.428571       31.57143       Kurtosis       110.2057