OLS for Likert scale

John Galvin

Join Date: Feb 2019

Posts: 38
#1

17 Aug 2019, 06:03

Can I use OLS regression for a 10-point Likert-scale dependent variable?

According to Fielding (1999), Greene (2000) and Daykin and Moffatt (2002), linear regression analysis of ordered outcomes (Hellevik, 2009) creates statistical complexity and can mislead the interpretation of results. Much of the available literature on survey analysis presumes logistic or probit regressions. Indeed, the majority of available survey research papers on public attitudes apply probit-type analysis, which is in line with Daykin and Moffatt's (2002) argument that probit regression suits ordinal data.

My supervisor, who is a professor in Public Opinion, advised me to use OLS if I struggle with probit regression analysis. However, I am still concerned about using OLS. What do you think?

Last edited by John Galvin; 17 Aug 2019, 06:35.
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#2

17 Aug 2019, 09:18

See help oprobit and help ologit, and also read the manual entries. Ordered probit and logit models are, in principle, better than OLS for ordered categorical data, but some argue that OLS is easier to apply and gives much the same results (whatever that actually means!). To me, interpretations of oprobit and ologit estimates are fairly straightforward to derive, especially in conjunction with deployment of margins. (See help oprobit_postestimation.) It'd be easy to compare estimates and implications derived using (say) oprobit and regress.
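
A minimal sketch of such a comparison, run on simulated data rather than anything from this thread. The variable names y and x, the seed, and the data-generating process are all assumptions made for illustration only.

```stata
clear
set seed 20190817
set obs 500

* Hypothetical data: a latent score cut into 10 ordered categories (1-10)
generate double x = rnormal()
generate double ystar = 0.5*x + rnormal()
xtile y = ystar, nquantiles(10)

* OLS: the slope is the change in the expected score per unit of x
regress y x
margins, dydx(x)

* Ordered probit: coefficients are on the latent scale, so use margins
* to express effects as changes in outcome probabilities
oprobit y x, nolog
margins, dydx(x) predict(outcome(10))
margins, at(x = (-1 0 1)) predict(outcome(10))
```

The two sets of margins answer different questions (expected score versus probability of a given category), so the comparison is about substantive implications rather than about matching coefficients.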
Clyde Schechter

Join Date: Apr 2014

Posts: 30114
#3

17 Aug 2019, 09:45

Feeling loquacious at the moment, I will indulge in a digression. What I say here is not really off-topic or tangential, but it is abstract and, in the end, provides no concrete guidance for Mr. Galvin. Feel free to skip this response.

Whether it is appropriate to treat responses to a Likert-like item as representing interval data (which might make OLS suitable) is a highly controversial subject. Opinions about this tend to be vehemently held, much like religious beliefs--probably because there is no real way to be sure!

If you believe that the responses that label each of the numbers in the scale represent equally spaced perceptions of agreement (or whatever latent response to a stimulus you are trying to measure), then it is perfectly reasonable to treat the result as an interval level variable. In the case of the classic 5-point Likert response set, strongly agree, agree, neither agree nor disagree, disagree, strongly disagree, many people (but not all) find this belief to be quite reasonable. I think it is obvious, however, that there is really no way to know. I do not know what the labels on your 10-point response set are, but if nothing else, I would guess that it is harder to come up with phrases that actually succeed in dividing some latent perceptual dimension into 10 equally spaced bins than to find 5 such. So I'd be more skeptical with a 10-point Likert response set, but I wouldn't rule it out completely.

I should also point out that in practice, Likert items (and Likert-like items) tend to be used in batches of items that are expected to be highly correlated or measure the same construct, with the responses across a group of such items (a "scale") totaled or averaged to produce a "scale score." This practice is so widespread that it is seldom remarked upon. But it implicitly assumes that the individual item responses are interval-level.
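
For concreteness, here is a small sketch of how such scale scores are typically built in Stata. The five items item1 through item5 are simulated and purely hypothetical; only the egen and alpha steps reflect the practice described above.

```stata
clear
set seed 101
set obs 300

* Hypothetical 5-point items driven by one common latent trait
generate double theta = rnormal()
forvalues j = 1/5 {
    generate double e`j' = theta + rnormal()
    generate byte item`j' = 1 + (e`j' > -1.5) + (e`j' > -0.5) + (e`j' > 0.5) + (e`j' > 1.5)
    drop e`j'
}

* Averaged and totaled scale scores; both treat the item codes as interval-level
egen double score_mean  = rowmean(item1-item5)
egen double score_total = rowtotal(item1-item5)

* Cronbach's alpha as a rough check that the items hang together
alpha item1-item5
summarize score_mean score_total
```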

I do believe that one of the motivations for the development of item response theory (a topic I have only a superficial understanding of) was to overcome the need to rely on "equally spaced" assumptions and the like, replacing them with data-based interpretations.
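
For what it is worth, Stata (version 14 and later) has an irt suite, and the graded response model is one IRT model for ordered items. The following is only a rough sketch on simulated items; the data, the variable names, and the choice of the graded response model are assumptions for illustration, not a recommendation for any particular data set.

```stata
clear
set seed 202
set obs 500

* Hypothetical 4-category items sharing one latent trait
generate double theta = rnormal()
forvalues j = 1/5 {
    generate double e`j' = theta + rnormal()
    generate byte item`j' = 1 + (e`j' > -1) + (e`j' > 0) + (e`j' > 1)
    drop e`j'
}

* Graded response model: category thresholds are estimated from the data
* instead of being assumed equally spaced
irt grm item1-item5

* Predicted latent trait, compared with the simple averaged scale score
predict double theta_hat, latent
egen double score_mean = rowmean(item1-item5)
correlate theta_hat score_mean
```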
John Galvin

Join Date: Feb 2019

Posts: 38
#4

17 Aug 2019, 09:48

Dear Stephen, the only issue I am concerned about is how to measure the probability of a range of outcomes. Specifically, I would like to measure the probability of outcomes 6-10 (on a 10-point scale DV). I know that I can measure the probability of a single outcome, or of several outcomes separately. Do you know how to measure the probability of a range of outcomes?

Regards,
John
ericmelse

Join Date: May 2014

Posts: 434
#5

17 Aug 2019, 10:48

Dear John,

I recommend this paper by Richard Williams, "Fitting heterogeneous choice models with oglm," The Stata Journal (2010) 10(4): 540-567, in which he discusses, in Example 3, issues and modelling alternatives that might be helpful for your project.

http://publicationslist.org/eric.melse
Jackson Monroe

Join Date: Jul 2019

Posts: 60
#6

17 Aug 2019, 11:01

If you are only interested in the probability of being in one of two bins on your ten-bin scale, can you not treat it as a two-bin scale and use probit, logit, etc.?
John Galvin

Join Date: Feb 2019

Posts: 38
#7

17 Aug 2019, 11:52

Exactly. I am struggling with the analysis of the second bin's probability. Do you know how to do this?
Jackson Monroe

Join Date: Jul 2019

Posts: 60
#8

17 Aug 2019, 12:02

If two bins are a sufficient model, then it is possible to use OLS (a linear probability model) to do inference. The downside is that, depending on your data, you may get some illogical predictions (a probability of being in a bin greater than 1 or less than 0), which are not possible in a logit or probit analysis. In Stata, the regress command fits a linear regression, and the probit and logit commands fit probit and logit models. Are you asking about interpretation, or how to set things up?
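
To make the setup concrete, here is a sketch on simulated data. The variables out, x, and six2ten, and the cutoff at 6, are assumptions chosen to match the question above; nothing here comes from the poster's actual data.

```stata
clear
set seed 42
set obs 1000

* Hypothetical 10-point outcome and one predictor
generate double x = rnormal()
generate double ystar = 1.5*x + rnormal()
xtile out = ystar, nquantiles(10)

* Collapse to two bins: 1 if the response is 6-10, 0 if it is 1-5
generate byte six2ten = out >= 6 if !missing(out)

* Linear probability model; robust SEs because the errors are heteroskedastic
regress six2ten x, vce(robust)
predict double p_lpm
count if p_lpm < 0 | p_lpm > 1    // fitted "probabilities" outside [0, 1]

* Probit keeps predictions inside (0, 1)
probit six2ten x, nolog
predict double p_probit, pr
summarize p_lpm p_probit
margins, dydx(x)
```

Replacing probit with logit in the last step gives the logit version; the margins call is the same.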

Joseph Coveney

Join Date: Apr 2014
Posts: 4420
#9

17 Aug 2019, 18:36

Originally posted by John Galvin

I would like to measure the probability of outcomes 6-10 (on a 10-point scale DV). . . . Do you know how to measure the probability of a range of outcomes?

See below. To keep things simple, I didn't put in any predictors (independent variables), but even if you do, you can use -margins- in the same manner as shown.

Code:

version 16.0

clear *

set seed `=strreverse("1512652")'
quietly set obs 200

generate byte out = runiformint(1, 10)

*
* Begin here
*

// Using -oprobit- on original data
oprobit out , nolog

quietly margins, post

local elements
forvalues i = 6/10 {
    local elements `elements' _b[`i'._predict] +
}
lincom `elements' 0

// Binning and using -probit-, as Jackson Monroe suggests in #6
generate byte six2ten = out >= 6 if !mi(out)
probit six2ten , nolog
margins

exit
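
With a predictor in the model, one possible variant is sketched below. The predictor x and the simulated data are assumptions for illustration. The sketch uses margins with expression() as an alternative to the lincom step above, and then checks the result against the ordered-probit formula Pr(out >= 6 | x) = 1 - normal(cut5 - xb).

```stata
version 16.0
clear
set seed 2019
set obs 1000

* Hypothetical data: one predictor, 10 ordered categories
generate double x = rnormal()
generate double ystar = 0.5*x + rnormal()
xtile out = ystar, nquantiles(10)

oprobit out x, nolog

* Build the sum Pr(out==6) + ... + Pr(out==10) as a margins expression
local p6plus 0
forvalues k = 6/10 {
    local p6plus `p6plus' + predict(outcome(`k'))
}

* Average predicted Pr(out >= 6), with a delta-method standard error
margins, expression(`p6plus')

* The same probability at chosen values of x
margins, at(x = (-1 0 1)) expression(`p6plus')

* Check by hand: the fifth cutpoint separates categories 5 and 6
predict double xb, xb
generate double p6plus_hand = 1 - normal(_b[/cut5] - xb)
summarize p6plus_hand
```

The mean of p6plus_hand should agree with the first margins result, since both average the same observation-level probability.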


John Galvin

Join Date: Feb 2019
Posts: 38

#10

19 Aug 2019, 06:57

Thank you very much for the code! This is exactly what I was looking for!

Kind regards,
John
