Dear Statalist users,
(I am running Stata 17/MP.)
I have looked across the internet (and Statalist) and I haven't been able to find a straight answer to my question.
I have multiple variables from a survey of approximately 3000 people in which they were asked questions on a scale of 0 to 10. 0 represents "Completely Disagree" and 10 represents "Completely Agree".
I know that I can use an ordinal probit or ordinal logit regression with each of these ordinal variables as the dependent variable. However, I want to keep the interpretation simple, and results from oprobit and ologit tend to be difficult to convey. Additionally, when running the ordinal probit/logit models, I find that some of my variables fail the parallel lines assumption (see the Brant test output for B5 below).
I have come across fractional probit/logit regressions, and I am wondering whether it is acceptable to convert these Likert-type variables into fractions between 0 and 1 (by simply dividing by 10).
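To make the proposed rescaling concrete, here is a minimal sketch. The three B5 values are made up purely for illustration, and B5_1 follows the naming used in this post; whether the transformation is statistically appropriate is exactly the open question.

```stata
* Minimal sketch: rescale a 0-10 item to the unit interval for -fracreg-
* (illustrative values only, not the survey data)
clear
input double B5
0
5
10
end

* Dividing by the scale maximum maps 0 -> 0, 5 -> .5, 10 -> 1
generate double B5_1 = B5/10

* -fracreg- requires the dependent variable to lie in [0,1]
assert inrange(B5_1, 0, 1) if !missing(B5_1)
summarize B5_1
```

Note that the endpoints 0 and 1 are kept as-is; fracreg accepts values on the boundary, so no adjustment of the extreme responses is needed for the command to run.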
Code:
. ologit B5 female lowage loweduc lowincome

Iteration 0:  log likelihood = -6433.0606
Iteration 1:  log likelihood = -6417.1843
Iteration 2:  log likelihood = -6417.1752
Iteration 3:  log likelihood = -6417.1752

Ordered logistic regression                     Number of obs =  3,002
                                                LR chi2(4)    =  31.77
                                                Prob > chi2   = 0.0000
Log likelihood = -6417.1752                     Pseudo R2     = 0.0025

------------------------------------------------------------------------------
          B5 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      female |  -.1533167   .0678278    -2.26   0.024    -.2862567   -.0203768
      lowage |  -.1846502   .1240223    -1.49   0.137    -.4277295    .0584291
     loweduc |  -.2329156   .0690683    -3.37   0.001    -.3682871   -.0975442
   lowincome |  -.1648473   .0744144    -2.22   0.027    -.3106968   -.0189978
-------------+----------------------------------------------------------------
       /cut1 |  -3.408178    .107404                     -3.618686    -3.19767
       /cut2 |  -3.209815   .1007814                     -3.407343   -3.012287
       /cut3 |   -2.78373   .0892951                     -2.958745   -2.608714
       /cut4 |  -2.357594   .0808882                     -2.516132   -2.199056
       /cut5 |  -1.955083   .0751846                     -2.102442   -1.807724
       /cut6 |  -.8541042    .066344                      -.984136   -.7240723
       /cut7 |   -.366501   .0645431                     -.4930031   -.2399989
       /cut8 |   .2682416   .0641922                      .1424272     .394056
       /cut9 |   1.140942   .0682417                      1.007191    1.274693
      /cut10 |   1.699653   .0746818                       1.55328    1.846027
------------------------------------------------------------------------------

. brant, details

Estimated coefficients from binary logits
---------------------------------------------------------------------------------------------
   Variable |  y_gt_0  y_gt_1  y_gt_2  y_gt_3  y_gt_4  y_gt_5  y_gt_6  y_gt_7  y_gt_8  y_gt_9
------------+--------------------------------------------------------------------------------
     female |   0.212   0.184   0.118   0.033   0.021  -0.208  -0.203  -0.164  -0.143  -0.229
            |    1.12    1.06    0.81    0.27    0.20   -2.53   -2.60   -2.05   -1.50   -2.01
     lowage |   0.326   0.390   0.203   0.082  -0.183  -0.324  -0.159  -0.173  -0.132  -0.056
            |    0.82    1.05    0.71    0.36   -1.00   -2.26   -1.12   -1.15   -0.73   -0.26
    loweduc |  -0.426  -0.329  -0.439  -0.199  -0.090  -0.331  -0.287  -0.186  -0.206  -0.086
            |   -2.19   -1.85   -2.98   -1.60   -0.84   -4.04   -3.63   -2.26   -2.06   -0.72
  lowincome |  -0.403  -0.389  -0.301  -0.330  -0.304  -0.186  -0.187  -0.234   0.043   0.289
            |   -2.04   -2.14   -1.98   -2.57   -2.72   -2.14   -2.22   -2.64    0.41    2.34
      _cons |   3.317   3.078   2.724   2.256   1.823   0.952   0.429  -0.256  -1.214  -1.848
            |   18.33   18.77   19.63   19.49   18.41   12.41    5.99   -3.55  -14.20  -17.95
---------------------------------------------------------------------------------------------
Legend: b/t

Brant test of parallel regression assumption

             |       chi2    p>chi2      df
-------------+------------------------------
         All |      79.23     0.000      36
-------------+------------------------------
      female |      10.54     0.308       9
      lowage |       9.44     0.398       9
     loweduc |      24.82     0.003       9
   lowincome |      27.60     0.001       9

A significant test statistic provides evidence that the parallel
regression assumption has been violated.
For example, B5 is the original variable (coded from 0 to 10) and B5_1 is the same variable expressed as a fraction (B5 divided by 10). The other variables are all binary: female indicates gender, lowage indicates age (1 for 18 to 24-year-olds and 0 for older), loweduc indicates education (1 for primary education and 0 for more), and lowincome indicates low income.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double B5 float B5_1 byte(female lowage loweduc lowincome)
 3 .3 1 1 0 0
 8 .8 1 0 1 1
 7 .7 1 0 1 0
 5 .5 0 0 0 1
 5 .5 1 0 1 1
 5 .5 1 0 1 1
 1 .1 0 0 0 0
 4 .4 0 0 0 0
 4 .4 0 0 0 0
 2 .2 0 0 1 0
 6 .6 0 0 0 0
 8 .8 1 0 1 1
 5 .5 1 0 0 0
 5 .5 1 0 1 1
 3 .3 0 0 0 0
 6 .6 1 0 1 1
 5 .5 1 0 0 0
 5 .5 0 0 1 0
 3 .3 1 0 0 0
 0  0 0 0 1 0
 8 .8 0 0 0 1
 9 .9 1 0 1 1
 4 .4 0 0 1 1
 7 .7 0 0 1 0
 5 .5 1 0 0 0
 7 .7 0 0 1 1
 6 .6 1 0 1 1
10  1 1 0 1 1
 5 .5 1 0 1 1
 6 .6 1 0 1 1
end
label values female genderl
label def genderl 0 "Male", modify
label def genderl 1 "Female", modify
Code:
eststo: oprobit B5 female lowage loweduc lowincome, vce(robust)
eststo: fracreg probit B5_1 female lowage loweduc lowincome, vce(robust)
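Since the stated goal is a simple interpretation, one possible way to read the fracreg fit is through average marginal effects on the conditional mean. The sketch below is an assumption rather than a settled recommendation: it presumes the dataex example above is already in memory, and the i. factor-variable prefixes are an added choice so that margins reports discrete 0-to-1 changes for the binary regressors.

```stata
* Sketch only: assumes the -dataex- example data above are loaded
* Fit the fractional probit with the regressors declared as factor variables
fracreg probit B5_1 i.female i.lowage i.loweduc i.lowincome, vce(robust)

* Average marginal effects on E(B5_1 | x); multiplying an effect by 10
* expresses it in points on the original 0-10 scale
margins, dydx(*)
```

With only the 30 example observations these estimates will be very noisy; the block is meant to show the commands, not results.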
After all that, is this an appropriate transformation of the data? If so, do you have any recommendations for which postestimation tests I should conduct?
Your assistance is much appreciated.