Logistic Regression w/ mildly significant Dummy Variable

Rich Colton

Join Date: Jan 2018

Posts: 6
#1

02 Jan 2018, 09:36

Hello all,

I am wondering if I can get some guidance from those more informed than I.

My regression results are below. I believe they show relatively strong evidence that the independent variables have non-zero effects, correct?

My main query concerns the inclusion of the less significant dummy "SAC" variables. Specifically, sac2 and sac4.

There are 6 "sac" types in the data set, 1-6, and I have created 5 dummy variables (sac1-sac5), one per type: for example, sac4 equals 1 if the sac type is 4 and 0 otherwise, and so on. Type 6 is the omitted reference category.

What considerations would one make in deciding whether it was reasonable to include sac2 and sac4 in the model? My thought at this point is that there is some evidence of significance and an argument can be made that it would be logical for the variable to be significant. Must I exclude the variables or can it be reasonable to retain mildly insignificant variables when others in the set of dummy variables are significant?

Thanks for any help provided!

. logit imp csmin csmin2 tds2 etlrt2 ltv2 minage2 sac1 sac2 sac3 sac4 sac5 if funded==1

Iteration 0: log likelihood = -411.67704
Iteration 1: log likelihood = -383.59557
Iteration 2: log likelihood = -370.40427
Iteration 3: log likelihood = -368.72347
Iteration 4: log likelihood = -368.68813
Iteration 5: log likelihood = -368.68812

Logistic regression                             Number of obs     =      2,001
                                                LR chi2(11)       =      85.98
                                                Prob > chi2       =     0.0000
Log likelihood = -368.68812                     Pseudo R2         =     0.1044

------------------------------------------------------------------------------
         imp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       csmin |   .0323853   .0081642     3.97   0.000     .0163838    .0483868
      csmin2 |  -.0000339   7.59e-06    -4.47   0.000    -.0000488    -.000019
        tds2 |   .0230267   .0107251     2.15   0.032     .0020059    .0440475
      etlrt2 |   -3.15592   1.072801    -2.94   0.003     -5.25857   -1.053269
        ltv2 |  -3.380574   1.511915    -2.24   0.025    -6.343873   -.4172758
     minage2 |   .0002754   .0000821     3.35   0.001     .0001145    .0004363
        sac1 |  -.8622301   .4260937    -2.02   0.043    -1.697358   -.0271017
        sac2 |    -.73401   .4647683    -1.58   0.114    -1.644939    .1769192
        sac3 |  -.9593358   .4495987    -2.13   0.033    -1.840533   -.0781384
        sac4 |  -.9076598   .4818127    -1.88   0.060    -1.851995    .0366756
        sac5 |  -1.402417   .4309326    -3.25   0.001    -2.247029   -.5578043
       _cons |  -5.753695   2.571938    -2.24   0.025     -10.7946   -.7127887
------------------------------------------------------------------------------
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

02 Jan 2018, 10:18

Rich:
welcome to this forum.
Statistical significance is (too) often oversold.
Hence, I would retain the mildly insignificant predictors, which may well be as informative as the significant ones.
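
Because the five sac dummies together represent one six-level categorical variable, one option is to judge them as a set instead of one coefficient at a time. A minimal sketch, assuming the variable names from post #1 and an otherwise unchanged model (the stored-estimate names full and reduced are arbitrary labels, not anything from the original post):

```stata
* Full model, exactly as specified in post #1
logit imp csmin csmin2 tds2 etlrt2 ltv2 minage2 sac1 sac2 sac3 sac4 sac5 if funded==1

* Joint Wald test that all five sac coefficients are zero
test sac1 sac2 sac3 sac4 sac5

* Likelihood-ratio version of the same comparison
estimates store full
* Refit without the sac dummies on the same estimation sample
logit imp csmin csmin2 tds2 etlrt2 ltv2 minage2 if funded==1 & e(sample)
estimates store reduced
lrtest full reduced
```

Note that dropping only sac2 and sac4 would not simply remove two terms: the reference category would become types 2, 4 and 6 combined, which changes what the remaining sac coefficients mean.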

Kind regards,
Carlo
(Stata 19.0)
Rich Colton

Join Date: Jan 2018

Posts: 6
#3

02 Jan 2018, 10:33

Much appreciated and thank you for the warm welcome!
Bruce Weaver

Join Date: May 2014

Posts: 1133
#4

02 Jan 2018, 11:41

Rich, there is no need to compute a set of dummy variables. If you have a variable called sac with values 1-6, you can direct Stata to treat it as a factor variable (i.e., categorical variable) by adding i. as prefix. E.g., assuming you have a variable called sac with values 1-6:

Code:

logit imp csmin csmin2 tds2 etlrt2 ltv2 minage2 i.sac if funded==1

For more info:

Code:

help fvvarlist
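
One thing to watch for: by default, i.sac uses the lowest level (1) as the base category, whereas the dummies in post #1 left out type 6. A short sketch, again assuming a variable called sac with values 1-6, that keeps type 6 as the reference and tests the whole factor at once:

```stata
* ib6. sets level 6 as the base category, matching the sac1-sac5 dummy coding
logit imp csmin csmin2 tds2 etlrt2 ltv2 minage2 ib6.sac if funded==1

* Joint Wald test of all sac indicator coefficients in the fitted model
testparm i.sac
```

With ib6.sac the coefficients should line up with those reported for sac1-sac5 above; with plain i.sac the fit is the same, but each coefficient is a contrast against type 1 instead.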

HTH.

--
Bruce Weaver
Version: Stata/MP 19.5 (Windows)