Dear fellow Stata users,
I am working with panel data, and my dependent variable is binary, indicating whether an individual is employed or unemployed. Because xtprobit/xtlogit take hours to converge, I would like to argue that a linear probability model (LPM, estimated by OLS) gives similar results, since my purpose is not to predict but to analyse effects. I am aware of the limitations of using LPM/OLS with binary dependent variables, namely the violation of the linearity and homoscedasticity assumptions. I was therefore wondering whether specifying a saturated model would help with the linearity issue. To that end, I have transformed my specification so that every explanatory variable is a dummy indicator, including the continuous variables, which I classify into three quantile groups (low, medium and high).
I have two concerns at this stage. First, my key explanatory variable (the unemployment rate) is continuous and cannot be transformed into dummy indicators. Can I still argue that the LPM can be used in a "quasi-saturated" model? Second, by transforming the other continuous variables into three dummy variables each, I am losing efficiency. Is there a better way to tackle this issue?
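A minimal sketch of the specification described above may help make the question concrete. All variable names (id, year, employed, unemp_rate, income, age) are hypothetical placeholders, and the choice of a fixed-effects estimator with standard errors clustered on the individual is an assumption made for illustration only.

```stata
* Hypothetical variable names: id, year, employed (0/1), unemp_rate, income, age
xtset id year

* Classify the continuous controls into three quantile groups (low/medium/high)
xtile income3 = income, nq(3)
xtile age3    = age, nq(3)

* LPM: key regressor kept continuous, controls entered as dummy indicators.
* Cluster-robust standard errors allow for the heteroskedasticity of an LPM.
xtreg employed c.unemp_rate i.income3 i.age3 i.year, fe vce(cluster id)

* Count fitted probabilities (including the individual effect) outside [0,1]
predict double p_hat if e(sample), xbu
count if (p_hat < 0 | p_hat > 1) & e(sample)
```

The final count is one rough way to gauge how much the linearity restriction matters in a given sample; the number of quantile groups in nq() can be raised if three categories prove too coarse.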
Any suggestion is welcome and please let me know if further information is needed.
Thank you in advance,
Ruth.