
  • Is Converting an Ordinal 'Likert' Type Variable (0 to 10, from completely disagree to completely agree) to a Fraction (0 to 1) Acceptable?

    Dear Statalist users,

    (I am running Stata 17/MP.)
    I have looked across the internet (and Statalist) and I haven't been able to find a straight answer to my question.

    I have multiple variables from a survey of approximately 3000 people in which they were asked questions on a scale of 0 to 10. 0 represents "Completely Disagree" and 10 represents "Completely Agree".
    I know that I can use an ordinal probit or ordinal logit regression with each of these ordinal variables as the dependent variable. However, I want to keep the interpretation simple, and results from oprobit and ologit tend to be difficult to convey. Additionally, when running the ordinal probit/logit models I find that some of my variables fail the parallel lines assumption.

    Code:
            
    . ologit B5 female lowage loweduc lowincome
    
    Iteration 0:   log likelihood = -6433.0606  
    Iteration 1:   log likelihood = -6417.1843  
    Iteration 2:   log likelihood = -6417.1752  
    Iteration 3:   log likelihood = -6417.1752  
    
    Ordered logistic regression                             Number of obs =  3,002
                                                            LR chi2(4)    =  31.77
                                                            Prob > chi2   = 0.0000
    Log likelihood = -6417.1752                             Pseudo R2     = 0.0025

    ------------------------------------------------------------------------------
              B5 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          female |  -.1533167   .0678278    -2.26   0.024    -.2862567   -.0203768
          lowage |  -.1846502   .1240223    -1.49   0.137    -.4277295    .0584291
         loweduc |  -.2329156   .0690683    -3.37   0.001    -.3682871   -.0975442
       lowincome |  -.1648473   .0744144    -2.22   0.027    -.3106968   -.0189978
    -------------+----------------------------------------------------------------
           /cut1 |  -3.408178    .107404                     -3.618686    -3.19767
           /cut2 |  -3.209815   .1007814                     -3.407343   -3.012287
           /cut3 |   -2.78373   .0892951                     -2.958745   -2.608714
           /cut4 |  -2.357594   .0808882                     -2.516132   -2.199056
           /cut5 |  -1.955083   .0751846                     -2.102442   -1.807724
           /cut6 |  -.8541042    .066344                      -.984136   -.7240723
           /cut7 |   -.366501   .0645431                     -.4930031   -.2399989
           /cut8 |   .2682416   .0641922                      .1424272     .394056
           /cut9 |   1.140942   .0682417                      1.007191    1.274693
          /cut10 |   1.699653   .0746818                       1.55328    1.846027
    ------------------------------------------------------------------------------
    
                  
    . brant, details
    
    Estimated coefficients from binary logits
    
    --------------------------------------------------------------------------------------------------------
        Variable |   y_gt_0   y_gt_1   y_gt_2   y_gt_3   y_gt_4   y_gt_5   y_gt_6   y_gt_7   y_gt_8   y_gt_9
    -------------+------------------------------------------------------------------------------------------
          female |    0.212    0.184    0.118    0.033    0.021   -0.208   -0.203   -0.164   -0.143   -0.229
                 |     1.12     1.06     0.81     0.27     0.20    -2.53    -2.60    -2.05    -1.50    -2.01
          lowage |    0.326    0.390    0.203    0.082   -0.183   -0.324   -0.159   -0.173   -0.132   -0.056
                 |     0.82     1.05     0.71     0.36    -1.00    -2.26    -1.12    -1.15    -0.73    -0.26
         loweduc |   -0.426   -0.329   -0.439   -0.199   -0.090   -0.331   -0.287   -0.186   -0.206   -0.086
                 |    -2.19    -1.85    -2.98    -1.60    -0.84    -4.04    -3.63    -2.26    -2.06    -0.72
       lowincome |   -0.403   -0.389   -0.301   -0.330   -0.304   -0.186   -0.187   -0.234    0.043    0.289
                 |    -2.04    -2.14    -1.98    -2.57    -2.72    -2.14    -2.22    -2.64     0.41     2.34
           _cons |    3.317    3.078    2.724    2.256    1.823    0.952    0.429   -0.256   -1.214   -1.848
                 |    18.33    18.77    19.63    19.49    18.41    12.41     5.99    -3.55   -14.20   -17.95
    --------------------------------------------------------------------------------------------------------
                                                                                                Legend: b/t
    
    Brant test of parallel regression assumption
    
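                 |       chi2   p>chi2      df
    -------------+------------------------------
             All |      79.23    0.000      36
    -------------+------------------------------
          female |      10.54    0.308       9
          lowage |       9.44    0.398       9
         loweduc |      24.82    0.003       9
       lowincome |      27.60    0.001       9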
    
    A significant test statistic provides evidence that the parallel
    regression assumption has been violated.

    I have come across the use of fractional probit/logit regressions and I am wondering if it is acceptable to convert these 'Likert' type variables into fractions between 0 and 1 (by simply dividing by 10).

    For example, B5 is the original variable (coded from 0 to 10) and B5_1 is the same variable expressed as a fraction. The other variables are all binary: female is an indicator for gender, lowage is an indicator for age (1 for 18 to 24-year-olds and 0 for older), loweduc is an indicator for education (1 for primary education and 0 for more), and lowincome is an indicator for low income.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double B5 float B5_1 byte(female lowage loweduc lowincome)
     3 .3 1 1 0 0
     8 .8 1 0 1 1
     7 .7 1 0 1 0
     5 .5 0 0 0 1
     5 .5 1 0 1 1
     5 .5 1 0 1 1
     1 .1 0 0 0 0
     4 .4 0 0 0 0
     4 .4 0 0 0 0
     2 .2 0 0 1 0
     6 .6 0 0 0 0
     8 .8 1 0 1 1
     5 .5 1 0 0 0
     5 .5 1 0 1 1
     3 .3 0 0 0 0
     6 .6 1 0 1 1
     5 .5 1 0 0 0
     5 .5 0 0 1 0
     3 .3 1 0 0 0
     0  0 0 0 1 0
     8 .8 0 0 0 1
     9 .9 1 0 1 1
     4 .4 0 0 1 1
     7 .7 0 0 1 0
     5 .5 1 0 0 0
     7 .7 0 0 1 1
     6 .6 1 0 1 1
    10  1 1 0 1 1
     5 .5 1 0 1 1
     6 .6 1 0 1 1
    end
    label values female genderl
    label def genderl 0 "Male", modify
    label def genderl 1 "Female", modify
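
    For anyone reproducing the example, the rescaling described above can be sketched as follows. This is only an illustration: B5_frac is a hypothetical name, chosen so that the posted B5_1 is not overwritten, and the check simply confirms that the result lies in the [0, 1] range that fracreg requires of its outcome.

```stata
* Sketch only: rebuild the rescaled outcome from the 0-10 item.
* B5_frac is a hypothetical name; the posted variable is B5_1.
generate double B5_frac = B5 / 10

* fracreg needs an outcome between 0 and 1 inclusive; check before fitting.
assert inrange(B5_frac, 0, 1) if !missing(B5_frac)

* The rescaled variable still takes at most 11 distinct values.
tabulate B5_frac
```

    Note that dividing by 10 changes only the units: the variable remains as discrete as the original item.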


    Code:
    eststo: oprobit B5 female lowage loweduc lowincome, vce(robust)
    
    
    eststo: fracreg probit B5_1 female lowage loweduc lowincome, vce(robust)
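
    Since my aim is simple interpretation, one possible follow-up to the fractional probit would be average marginal effects from margins rather than raw coefficients. The sketch below is an assumption about how that could look, not a settled choice: it refits the model without eststo and enters the binary predictors in factor-variable notation (i.) so that margins reports discrete 0-to-1 changes.

```stata
* Sketch only: refit the fractional probit with factor-variable notation
* so that margins treats each predictor as a 0/1 indicator.
fracreg probit B5_1 i.female i.lowage i.loweduc i.lowincome, vce(robust)

* Average marginal effects on the conditional mean of B5_1 (a 0-1 scale).
* Multiplying an effect by 10 expresses it on the original 0-10 scale.
margins, dydx(*)
```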

    After all that, is this an appropriate transformation of these data? If so, do you have any recommendations for which postestimation tests I should conduct?

    Your assistance is much appreciated.
    Last edited by Max Baard; 29 Nov 2022, 08:18. Reason: oprobit

  • #2
    There isn't a straight answer to this for good reasons. What is "acceptable" is not a constant, even within sub-sub-disciplines. The problem is not waffle or incompetence, although those can be found, but a range of judgments.

    I've treated ordinal scores as if numeric whenever it seemed reasonable but not otherwise. That is a totally vacuous answer for anyone else: the point is that looking at the data to see how they behave, thinking about goals, and thinking about who is the readership are all key to a decision.
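
    As a rough sketch of what looking at the data and treating the scores as numeric could mean with the variables posted in #1 (the particular commands are one illustrative choice, not a recommendation):

```stata
* Sketch only: inspect how the 0-10 responses are distributed.
tabulate B5
histogram B5, discrete percent

* Treat the score as if numeric: linear regression with robust standard errors.
regress B5 female lowage loweduc lowincome, vce(robust)
```

    Heavy piling up at 0, 5 or 10 in the tabulation or histogram would be one reason to hesitate over treating the scores as numeric.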

    In almost any University some academics in some departments are firm that only certain analyses are defensible with ordinal data, while most of the rest of the academics engage in decisions based on some kind of weighted average of roughly ordinal grades originally based on personal judgment. (Often they have no alternative; it's a University ruling to work with a grade point average or some local equivalent.)

    There are feedback loops that aren't always benign. In various Earth and environmental sciences I know something about, ordinal logit and probit models have been suggested from time to time but haven't really taken off. There is a high chance that a paper including them would be reviewed by people who are unfamiliar with or even hostile to them. (Unfamiliarity and hostility often go hand in hand.)

    Conversely, in various pockets of social science they have become routine and there is repeated ritual display of lengthy tables of coefficients and P-values and stars of various kinds.
