dealing with nominal variables in OLS regression

Fahim Ahmad

Join Date: Jan 2016

Posts: 48
#1

dealing with nominal variables in OLS regression

18 Jun 2017, 04:43

Dear all, I am using national perception survey data to find out how corruption is linked with sympathy for anti-government groups.
I am using OLS regression, and I also want to control for some demographics, e.g. region and province.
The region and province variables are nominal, with 8 and 34 unique values, respectively.

Can I use the following command

Code:

reg sympathy corruption region province

Or do I have to create dummy variables for each level of the control variables (region and province)?

Any ideas, please?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#2

18 Jun 2017, 04:48

Fahim:
as -fvvarlist- does all the work on your behalf, try:

Code:

reg sympathy corruption i.region i.province

Closing-out remark: as your data reveal some nesting structure (e.g., provinces are probably nested within regions), you should also consider a -mixed- model instead of OLS.
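
For readers who want to see what the factor-variable prefix is doing, here is a minimal sketch. It assumes Stata's shipped auto.dta as a stand-in for the survey data, with rep78 playing the role of a nominal variable such as region; the rep_ stub is a made-up name for illustration. Both regressions should return the same coefficients.

```stata
* Illustration only: auto.dta stands in for the survey data,
* and rep78 plays the role of a nominal variable such as region.
sysuse auto, clear

* Factor-variable notation: Stata builds the indicators on the fly
regress price mpg i.rep78

* Manual equivalent: create one dummy per level, then leave one out
* (rep_ is a hypothetical stub name; rep_1 is the omitted level)
tabulate rep78, generate(rep_)
regress price mpg rep_2 rep_3 rep_4 rep_5
```

The i. prefix is usually preferable because postestimation commands such as -margins- then know the indicators belong to one categorical variable.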

Kind regards,
Carlo
(Stata 19.0)
Fahim Ahmad

Join Date: Jan 2016

Posts: 48
#3

18 Jun 2017, 05:04

Thanks a lot, Carlo Lazzaro.
Region and province were only an example to describe my question; of course I won't use both of them as control variables at the same time : )

When I use

Code:

reg sympathy corruption i.region

it shows coefficients for only 7 regions and omits one region.
Can you please tell me why this happens and what the idea behind it is?

Thanks,
Fahim
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#4

18 Jun 2017, 05:25

Fahim:
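the omission of one out of the 8 regions is correct, in that it avoids the so-called dummy trap (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)). The omitted region serves as the reference category, and each reported coefficient is a contrast against it.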
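
A minimal sketch of how the reference category can be inspected or changed, again assuming auto.dta with rep78 as a stand-in for region (not the survey data). With a constant in the model, a full set of dummies would sum to the constant column, which is why one level has to go.

```stata
* Illustration only: rep78 in auto.dta stands in for region.
sysuse auto, clear

* Default: the lowest level (rep78 == 1) is the omitted base category
regress price i.rep78

* Choose a different base, here level 3, with the ib#. prefix
regress price ib3.rep78

* Keep every level by dropping the constant instead (ibn. = no base)
regress price ibn.rep78, noconstant
```

In the first two models each coefficient is a difference from the base level; in the last one each coefficient is the mean of the outcome in that level.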

Kind regards,
Carlo
(Stata 19.0)
Fahim Ahmad

Join Date: Jan 2016

Posts: 48
#5

18 Jun 2017, 05:45

This is really useful, Carlo!
I apologize for asking too many questions.

The value labels of my dependent variable are as below:

1 no sympathy at all
2 a little sympathy
3 a lot of sympathy

And the value labels of the "region" variable are:

1 Central Kabul
2 North
3 South
4 East
5 West
6 North West
7 South West
8 Central Hazarajat

When I run the command I mentioned above, the coefficients for all regions are positive. So in this case the region with the lowest coefficient has the lowest sympathy for anti-government groups, and vice versa, right?
And how can I find the level of sympathy in the region that is omitted by this command?

Really appreciate your help.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#6

18 Jun 2017, 06:24

Fahim:
if your depvar is categorical and ordered, you should consider -ologit- instead of -regress-;
q1) yes, when adjusted for the other predictors (but see my comment above);
q2)

Code:

tab sympathy if region==<the_number_identifying_the_excluded_region>
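
As a complementary option, -margins- reports adjusted predictions for every level of a factor variable, the base category included. The sketch below is illustrative only and assumes auto.dta, with rep78 standing in for region and price for the outcome.

```stata
* Illustration only: auto.dta stands in for the survey data.
sysuse auto, clear
regress price mpg i.rep78

* Adjusted mean of the outcome at every level, base category included
margins rep78

* Raw summary in the base category only, similar in spirit to -tab ... if-
summarize price if rep78 == 1
```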

Kind regards,
Carlo
(Stata 19.0)

Fahim Ahmad

Join Date: Jan 2016

Posts: 48
#7

18 Jun 2017, 22:35

My mistake, yes you are right Carlo!
I must use ordered logistic regression, here is the result of my model.

Code:

. ologit sympathy gender corruption Army Police    urban_rural i.region    [aw=w]

(sum of wgt is   9.3960e+03)
Iteration 0:   log likelihood = -5727.5333  
Iteration 1:   log likelihood = -5433.1136  
Iteration 2:   log likelihood = -5413.1129  
Iteration 3:   log likelihood = -5412.9249  
Iteration 4:   log likelihood = -5412.9248  

Ordered logistic regression                     Number of obs     =      9,473
                                                LR chi2(12)       =     536.11
                                                Prob > chi2       =     0.0000
Log likelihood = -5412.9248                     Pseudo R2         =     0.0472

---------------------+----------------------------------------------------------------
            sympathy |      Coef.  Std. Err.        z   P>|z|   [95% Conf.   Interval]
---------------------+----------------------------------------------------------------
              gender |   .1598654   .0556266     2.87   0.004     .0508392    .2688916
          corruption |   .4048046   .0385252    10.51   0.000     .3292966    .4803127
                Army |  -.2192776    .046173    -4.75   0.000     -.309775   -.1287803
              Police |  -.2895502   .0448472    -6.46   0.000    -.3774491   -.2016513
         urban_rural |  -.5638942   .0776629    -7.26   0.000    -.7161107   -.4116777
                     |
              region |
                East |   .5382542   .1075157     5.01   0.000     .3275272    .7489811
          South East |   .2869898   .1059743     2.71   0.007      .079284    .4946956
          South West |   .4224143   .1030393     4.10   0.000     .2204609    .6243677
                West |   .1515146   .1003236     1.51   0.131     -.045116    .3481452
          North East |  -.0689522   .1022639    -0.67   0.500    -.2693858    .1314815
 Central / Hazarajat |  -1.326957   .2825237    -4.70   0.000    -1.880694   -.7732211
          North West |   .0585194   .1013457     0.58   0.564    -.1401146    .2571534
---------------------+----------------------------------------------------------------
               /cut1 |   .1259794   .1991583                     -.2643638    .5163226
               /cut2 |   1.411804   .2008946                      1.018058     1.80555
---------------------+----------------------------------------------------------------

What I actually want to find out is how corruption is linked with sympathy for anti-government groups, controlling for demographics (gender, place of residence, and region) and for respondents' level of confidence in the Police and Army.

The labels for the variables mentioned are below.

var1 ) sympathy: measure of level of sympathy for anti government groups

1 no sympathy at all
2 a little sympathy
3 a lot of sympathy

var2) corruption: measures the number of times a person experienced corruption in government institutions (it is a scale constructed from 10 variables); the labels are below

1 in no cases
2 in some cases
3 in most cases
4 in all cases

var3 ) gender : 1 female 2 male

var4 ) urban_rural: shows respondents' place of residence, 1 rural 2 urban

var5 ) Army: measures respondents' level of confidence in the National Army; it is a scale constructed from 3 variables, and the labels are below:

1 a lot of confidence
2 somewhat confidence
3 a little confidence
4 no confidence at all

var6 ) Police: measures respondents' level of confidence in the National Police; it is also a scale constructed from 3 variables, and the labels are the same as for the "Army" variable.

There are a lot of papers on how to interpret the coefficients of an ologit model. Basically, what I found is that the coefficients must be interpreted in terms of odds ratios; since I am new to this model, I don't understand what an odds ratio means exactly.

BTW, looking back at the model, the coefficient for gender is .1598 and for corruption is .4048. Can I say that males and those who have experienced corruption are more likely to have sympathy for anti-government groups?

Many thanks.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#8

18 Jun 2017, 23:43

Fahim:
take a look at -ologit- and -ologit postestimation- for comprehensive answers to all your questions.
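
As a rough starting point: an odds ratio is the exponentiated coefficient. Under the proportional-odds assumption of -ologit-, and holding the other covariates fixed, a one-unit increase in a predictor multiplies the odds of being in a higher rather than a lower outcome category by that factor. For the coefficients reported in #7, that is roughly 1.50 for corruption and 1.17 for gender. The sketch below computes those two values and then shows the -or- option and -margins- on a Stata example dataset; fullauto.dta and its variables are stand-ins chosen for illustration, not the survey data.

```stata
* Odds ratios for two coefficients reported in #7: exp(coefficient)
display exp(.4048046)   // corruption
display exp(.1598654)   // gender

* Runnable illustration on a Stata example dataset (needs internet access);
* fullauto.dta and its variables are stand-ins, not the survey data
webuse fullauto, clear
ologit rep77 foreign length mpg, or

* Average predicted probability of the lowest outcome category
margins, predict(outcome(#1))
```

Predicted probabilities from -margins- are often easier to communicate than odds ratios, because they are on the scale of the outcome categories themselves.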

Kind regards,
Carlo
(Stata 19.0)
Fahim Ahmad

Join Date: Jan 2016

Posts: 48
#9

19 Jun 2017, 21:32

You are a great help to Statalist users, many thanks Carlo!