margins following gologit2 - why are the p-values for linear predictions different to those for predicted probabilities?

Suzanna Vidmar

Join Date: Nov 2017
Posts: 27

margins following gologit2 - why are the p-values for linear predictions different to those for predicted probabilities?

02 Jul 2018, 02:54

I have an ordinal outcome variable (0=None, 1=A little, 2=A lot). Following an ordinal regression I want to calculate predicted probabilities and their confidence intervals where the outcome is "A lot":

Code:

ologit be_FAB_gp agescan_FAB i.be_ACFBAL_bi disease_ACFBAL
margins, at(disease_ACFBAL=(0(3)9) be_ACFBAL_bi=0 (mean) agescan_FAB) predict(outcome(2))

The upper bound of the last confidence interval is greater than 1:

Code:

------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P&gt;|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at |
1 | .2147675 .0825899 2.60 0.009 .0528942 .3766407
2 | .3697546 .0766909 4.82 0.000 .2194433 .5200659
3 | .5572194 .1179079 4.73 0.000 .3261241 .7883147
4 | .7296877 .1527928 4.78 0.000 .4302195 1.029156
------------------------------------------------------------------------------

To generate confidence intervals that are in the bounds of [0,1], I calculated linear predictions (the logit), which I would then transform (anti-logit). I've constrained all the independent variables to meet the proportional odds assumption so that the results are equivalent to ologit.

Code:

gologit2 be_FAB_gp agescan_FAB be_ACFBAL_bi disease_ACFBAL,pl
margins, at(disease_ACFBAL=(0(3)9) be_ACFBAL_bi=0 (mean) agescan_FAB) predict(xb o(#2))

The p-values here are very different to those above. They both seem to be checking if the value is different from 0.

Code:

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |  -1.296424   .4897339    -2.65   0.008    -2.256285   -.3365633
          2  |  -.5332698   .3290943    -1.62   0.105    -1.178283    .1117433
          3  |   .2298846   .4778902     0.48   0.630    -.7067631    1.166532
          4  |   .9930389   .7746401     1.28   0.200    -.5252278    2.511306
------------------------------------------------------------------------------

The invlogit function on the linear predictions produces the same values as for the predicted probabilities in the first block of output above. I was planning to similarly transform the confidence intervals, but will this give me 95% CIs equivalent to those generated by the margins command for predicted probabilities?

Code:

. di invlogit(-1.296424)
.21476747

. di invlogit(-.5332698)
.36975458

. di invlogit(.2298846)
.55721938

. di invlogit(.9930389)
.72968774

Obviously there's a gap in my understanding. Any insights or pointers will be greatly appreciated.

Kind regards,
Suzanna

Tags: None

Richard Williams

Join Date: Apr 2014

Posts: 4983
#2

02 Jul 2018, 08:08

A couple of things:

Why are you using gologit2 if you are estimating an ologit model?

As far as I know, margins predicted probabilities are the same with either ologit or gologit2 with the pl option.

xb is different. Margins includes constants while ologit does not include the cutpoints. The gologit2 constants are the negative of the ologit cutpoints. So I am guessing that is what accounts for any discrepancies. Maybe you therefore want the margins xb from gologit2.

If you've got Stata 14 or higher, you really don't need to specify a specific outcome unless you want to. Margins will do all the outcomes at once.

You can see these points illustrated with

Code:

webuse nhanes2f, clear ologit health i.female weight height margins female margins female, predict(xb) gologit2 health i.female weight height, pl margins female margins female, predict(xb)

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4983
#3

02 Jul 2018, 15:41

Maybe I misunderstood your original question If you want to know why the p-value of the XB estimate differs from the p-value of the corresponding probability estimate, it is because the hypotheses that are being tested differ. Testing that xb = 0 is the same as testing that prob = .5. In this case, the test statistic is XB/seXB = -1.296424 / .4897339 = -2.65.

When you estimate the probabilities, the test stats are a test that prob = 0, which is a different hypothesis.

Maybe this is what you are doing, but to compute CIs after logit see https://www.stata.com/support/faqs/s...nce-intervals/

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
EMAIL: [email protected]
WWW: https://www3.nd.edu/~rwilliam
Comment
Suzanna Vidmar

Join Date: Nov 2017

Posts: 27
#4

03 Jul 2018, 00:25

Thank-you Richard for your very quick response and your explanations of the commands.

I'm using Stata 15.1. You guessed correctly in your second post that what I want is to compute CIs. Where my outcome is at its highest value (be_FAB_gp=2), I'd like to produce a plot of predicted probabilities and their 95% CIs for varying amounts of disease (disease_ACFBAL = 0,1,2,3,...,9), setting scanage_FAB at its mean and be_ACFBAL_bi at 0. I had a look at the CI logit link above and it makes sense. However, I don't know how to extend that to ologit and compute the CIs I want to graph.

Best wishes,
Suzanna
Comment

Suzanna Vidmar

Join Date: Nov 2017
Posts: 27

06 Jul 2018, 04:11

After reading https://www3.nd.edu/~rwilliam/stats3/Ologit01.pdf (see p.6) and a work colleague suggesting -lincom- would be helpful, I think I've worked out how to generate the CIs for predicted probabilities from linear predictions following -ologit-. I'm also graphing the probabilities and their CIs.

My ordinal outcome has 3 categories:

None (be_FAB_gp=0)
A little (be_FAB_gp=1)
A lot (be_FAB_gp=2)

I'm setting my independent variables as follows:

disease_ACFBAL = 0, 1, 2, 3, 4, 5, 6
be_ACFBAL_bi = 0
agescan_FAB at its mean

NOTE:

I'm using Stata 15.1. "/cut1" and "/cut2" isn't recognized in V14.2.
To produce the graph I'm adding variables to my dataset, but I don't care as I'm not saving the data with these changes.
I'm only graphing the probabilities and CIs for the lowest and highest categories of my outcome, i.e. where be_FAB_gp=0 and where be_FAB_gp=2. Including be_FAB_gp=1 is superfluous (Pr(be_FAB_gp=1) = 1 - Pr(be_FAB_gp=0) - Pr(be_FAB_gp=2)) and clutters the graph.

Code:

*Running -margins- to check against the predicted probabilities I calculate later via -lincom-.
ologit be_FAB_gp agescan_FAB i.be_ACFBAL_bi disease_ACFBAL
margins, at(disease_ACFBAL=(0/6) be_ACFBAL_bi=0 (mean) agescan_FAB)

*Storing the mean scan age into a local macro.
*Doing this for complete cases in the -ologit-.
quietly su agescan_FAB if e(sample)==1
local age=r(mean)

*Generating predicted probabilities and their CIs from the linear predictors and their CIs.
ologit be_FAB_gp agescan_FAB i.be_ACFBAL_bi disease_ACFBAL

***be_FAB_gp=0
gen disease0=_n-1 in 1/7
gen p0=.
gen plb0=.
gen pub0=.
forval i=0/6 {
    local row=`i'+1
    lincom /cut1 - (agescan_FAB*`age' + disease_ACFBAL*`i')
    quietly replace p0=invlogit(r(estimate)) in `row'
    quietly replace plb0=invlogit(r(lb)) in `row'
    quietly replace pub0=invlogit(r(ub)) in `row'
}
*Listing the calculated probabilities and CIs.
l disease0 p0 plb0 pub0 in 1/7,noo

***be_FAB_gp=2
gen disease2=_n-1+0.1 in 1/7
gen p2=.
gen plb2=.
gen pub2=.
forval i=0/6 {
    local row=`i'+1
    lincom (agescan_FAB*`age' + disease_ACFBAL*`i') - /cut2
    quietly replace p2=invlogit(r(estimate)) in `row'
    quietly replace plb2=invlogit(r(lb)) in `row'
    quietly replace pub2=invlogit(r(ub)) in `row'
}
*Listing the calculated probabilities and CIs.
l disease2 p2 plb2 pub2 in 1/7,noo

scatter p0 disease0, ms(i) c(l) lp(solid) lc(gs10) || rcap plb0 pub0 disease0, lc(gs10) || ///
scatter p2 disease2, ms(i) c(l) lp(solid) lc(black) || rcap plb2 pub2 disease2, lc(black) ///
xscal(range(-0.5,6.5)) xlab(0(1)6) ylab(0(0.2)1) ///
legend(order(1 "Pr No BE at follow-up" 3 "Pr >1% BE at follow-up")) ///
ytitle(Predicted Probibility, size(medlarge)) ///
xtitle(%Diseased lung at baseline, size(medlarge))
graph export margins_disease.png,replace

Announcement

margins following gologit2 - why are the p-values for linear predictions different to those for predicted probabilities?

Comment

Comment

Comment

Comment