Correspondence between the statistical significant effect of coefficients in a table and marginal effects in a graph

Luis Ortiz

Join Date: Dec 2014

Posts: 97
#1


12 Dec 2019, 12:34

Dear members of the list,

For a sample of highly educated young adults in 24 countries that participated in the PIAAC survey, I estimate the probability of attaining a master's-level degree instead of a bachelor's-level one. Thus, my dependent variable is dichotomous. My key independent variable is father's education, which has three categories, corresponding to basic, intermediate and higher education. Controlling for gender and age, I want to estimate the effect of father's education on the attainment of a higher-level degree (master's) instead of a lower-level one among the individuals in the sample. This is my model:

Code:

xtmelogit master i.edufath female age || cntryid3:

In principle, my results show that father's education has a statistically significant effect on the probability of attaining a master's-level degree instead of a bachelor's-level one. See the coefficients for ISCED 3/4 and ISCED 5/6 in the following table:

Code:

Fitting comparison model:

Iteration 0:   log likelihood = -12825.158
Iteration 1:   log likelihood =  -12512.74
Iteration 2:   log likelihood = -12511.102
Iteration 3:   log likelihood = -12511.102

Fitting full model:

tau =  0.0     log likelihood = -12511.102
tau =  0.1     log likelihood = -11306.365
tau =  0.2     log likelihood = -11278.014
tau =  0.3     log likelihood = -11276.239
tau =  0.4     log likelihood = -11317.484

Iteration 0:   log likelihood =  -11258.67
Iteration 1:   log likelihood = -11162.029
Iteration 2:   log likelihood = -11153.553
Iteration 3:   log likelihood = -11153.329
Iteration 4:   log likelihood = -11153.329

Random-effects logistic regression              Number of obs     =     19,663
Group variable: cntryid3                        Number of groups  =         24

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =        320
                                                              avg =      819.3
                                                              max =      3,302

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(4)      =     340.00
Log likelihood  = -11153.329                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
      master |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   edufather |
  ISCED 3/4  |   .1350686   .0466318     2.90   0.004     .0436719    .2264653
  ISCED 5/6  |   .6315997   .0445544    14.18   0.000     .5442747    .7189247
             |
      female |  -.0768024   .0331973    -2.31   0.021    -.1418679    -.011737
         age |   .0336828    .003584     9.40   0.000     .0266582    .0407073
       _cons |  -2.447529   .3277209    -7.47   0.000     -3.08985   -1.805208
-------------+----------------------------------------------------------------
    /lnsig2u |   .6325732    .304019                      .0367069    1.228439
-------------+----------------------------------------------------------------
     sigma_u |   1.372023   .2085606                      1.018523    1.848214
         rho |   .3639468   .0703772                      .2397336    .5093969
------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 2715.55                Prob >= chibar2 = 0.000

Next, I proceed to estimate the average marginal effect of the different categories of father's education on the probability of attaining a master's-level degree instead of a bachelor's-level one:

Code:

margins edufather, predict(mu fixedonly) vsquish level(95) post
marginsplot

I do not understand why, if the effect is statistically significant in the results (table above), the confidence intervals in the graph overlap. See next:

Is there anyone who could help me to understand the correspondence between the statistical significance of the coefficients in the table and the overlap of the confidence intervals in the graph? Which one of these results should I credit?

Many thanks for your attention

Kind regards

Luis Ortiz
Attached Files

Forstatalist.png

Last edited by Luis Ortiz; 12 Dec 2019, 12:58.
Tags: confidence intervals, graph, Marginal Effects, statistical significance
Clyde Schechter

Join Date: Apr 2014

Posts: 30191
#2

12 Dec 2019, 15:25

There are several reasons for this. It is entirely possible for the difference between two estimates, each of which is rather imprecisely estimated by the data, to nevertheless be very precisely estimated. In the language of statistical significance this translates to: there is nothing surprising about overlapping confidence intervals for things that have a statistically significant difference. It happens frequently. The regression coefficients you are seeing in the output are, in a different metric, estimates of the differences.
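The point about overlapping intervals can be checked with a little arithmetic. The Python sketch below is only an illustration: the two predicted probabilities, their standard errors, and the correlation between them are hypothetical numbers chosen for the example, not values from the model above. When two estimates share a large common source of uncertainty (here one might think of the country-level variation), they are strongly positively correlated, so the standard error of their difference can be far smaller than either individual standard error.

```python
import math

Z95 = 1.959964  # standard normal critical value for a 95% interval


def ci(est, se, z=Z95):
    """Normal-approximation confidence interval for a single estimate."""
    return (est - z * se, est + z * se)


def diff_ci(est_a, se_a, est_b, se_b, corr, z=Z95):
    """Difference b - a, its standard error, and its confidence interval.

    Var(b - a) = se_a^2 + se_b^2 - 2 * corr * se_a * se_b
    """
    se = math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * corr * se_a * se_b)
    d = est_b - est_a
    return d, se, (d - z * se, d + z * se)


# Hypothetical numbers, NOT taken from the model in this thread.
p_low, se_low = 0.30, 0.05    # predicted probability, category A
p_high, se_high = 0.36, 0.05  # predicted probability, category B
corr = 0.98                   # assumed correlation between the two estimates

ci_low = ci(p_low, se_low)
ci_high = ci(p_high, se_high)
d, se_d, ci_d = diff_ci(p_low, se_low, p_high, se_high, corr)

intervals_overlap = ci_low[1] > ci_high[0]
difference_excludes_zero = ci_d[0] > 0

print("CI A:", ci_low)
print("CI B:", ci_high)
print("difference:", d, "SE:", se_d, "CI:", ci_d)
print("individual intervals overlap:", intervals_overlap)
print("difference interval excludes zero:", difference_excludes_zero)
```

With these assumed inputs the two individual intervals overlap substantially, while the interval for the difference stays well away from zero.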

"In a different metric" is also in play here. The predicted margins are probabilities, the coefficients are log odds ratios. They are related to each other rather distantly. First there is a non-linear transformation from the coefficients to predicted values for individual observations, and then there is a lot of averaging of those individual predicted values so as to take into account the base outcome probabilities. A very large regression coefficient (or odds ratio) can correspond to a very small probability difference if we are starting from a very large or very small probability.
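To make the non-linearity concrete, here is a small Python sketch. It takes the ISCED 5/6 coefficient from the table above (0.6315997, a log odds ratio) and applies it at three baseline probabilities. The baselines are hypothetical values picked for illustration, and the sketch ignores the covariates and the averaging over observations that margins performs.

```python
import math


def invlogit(x):
    """Inverse logit: maps a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Logit: maps a probability to log odds."""
    return math.log(p / (1.0 - p))


B_ISCED56 = 0.6315997  # log odds ratio for ISCED 5/6, from the table above


def probability_shift(p0, b=B_ISCED56):
    """Change in probability when b is added to the log odds at baseline p0."""
    return invlogit(logit(p0) + b) - p0


# Hypothetical baseline probabilities, chosen only for illustration.
baselines = [0.50, 0.05, 0.01]
shifts = {p0: probability_shift(p0) for p0 in baselines}

for p0, s in shifts.items():
    print(f"baseline {p0:.2f}: odds ratio {math.exp(B_ISCED56):.2f}, "
          f"probability change {s:+.4f}")
```

The odds ratio is the same in every row, yet the change in probability shrinks sharply as the baseline moves away from 0.5.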

Finally, by doing the margins to predict only the fixed portion of the model, you are adding yet more distance between the margins output and the regression results.
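As a rough illustration of how much the fixed-only prediction can differ from one that averages over the random intercept, the Python sketch below uses sigma_u = 1.372023 from the table above. The linear predictor value of -1.0 is a hypothetical choice, and the simple grid integration is a stand-in for illustration, not what Stata does internally.

```python
import math


def invlogit(x):
    """Inverse logit: maps a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def averaged_probability(xb, sigma_u, n=4001, width=8.0):
    """Approximate E[invlogit(xb + u)] for u ~ N(0, sigma_u^2).

    Uses a plain weighted sum over an evenly spaced grid of standardized
    values in [-width, width], with normal-density weights.
    """
    total_weight = 0.0
    total = 0.0
    for i in range(n):
        z = -width + 2.0 * width * i / (n - 1)
        w = math.exp(-0.5 * z * z)
        total_weight += w
        total += w * invlogit(xb + sigma_u * z)
    return total / total_weight


SIGMA_U = 1.372023  # random-intercept standard deviation, from the table above
xb = -1.0           # hypothetical fixed-portion linear predictor

fixed_only = invlogit(xb)                   # random intercept set to zero
averaged = averaged_probability(xb, SIGMA_U)  # averaged over the intercept

print(f"fixed portion only:         {fixed_only:.4f}")
print(f"averaged over random part:  {averaged:.4f}")
```

Under these assumptions the averaged probability is pulled toward 0.5 relative to the fixed-only one, so the two kinds of prediction answer different questions.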

In short, there are many reasons why the two things you are looking at are so different, and there is little reason to expect them to come out in a similar way.

All that said, you are also filtering this through the lens of statistical significance. This imposes an arbitrary dichotomous classification on inherently continuous p-values and, in general, leads people into all sorts of paradoxes. This is just one of the many reasons that the American Statistical Association has recommended that the concept of statistical significance be abandoned. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the "executive summary" and https://www.tandfonline.com/toc/utas20/73/sup1 for all 43 supporting articles. Or https://www.nature.com/articles/d41586-019-00857-9 for the tl;dr.
Luis Ortiz

Join Date: Dec 2014

Posts: 97
#3

12 Dec 2019, 16:31

Thanks for such a rich answer, Clyde. It is really informative.

Your answer made me realize the distance between the coefficients in my table and the predictive margins that I plot. Could using 'margins, contrast' be a better way of approximating the difference in the predicted probability of attaining an MA vs a BA between the categories of the key independent variable?

Code:

xtmelogit master i.edufath female age || cntryid3:
margins r.edufather
marginsplot

Kind regards

Luis