xthybrid coefficient<-->odds conversion

ajay pasi

Join Date: Jan 2019
Posts: 170

12 Jan 2023, 05:23

Dear members,

In xthybrid results, is there a way to convert a coefficient (when the dependent variable is binary) to an odds ratio?

Code:

. xthybrid Positive_disc01    stud_SCSTOBC    Teach_SCSTOBC   Teach_nature_1 Teach_nature_2 Teach_gender_1   c
> ourse1_com course1_eco course1_eng course1_hin course1_his course1_mat course1_pol   sem_1 sem_2 sem_3 sem_4
>  sem_5 attendence_percent   , clusterid ( group_teachercasteSCSTOB_paper ) se test  p star

The variable 'Teach_SCSTOBC' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'Teach_SCSTOBC' is within clusters]
The variable 'course1_com' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'course1_com' is within clusters]
The variable 'sem_1' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_1' is within clusters]
The variable 'sem_2' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_2' is within clusters]
The variable 'sem_3' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_3' is within clusters]
The variable 'sem_4' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_4' is within clusters]
The variable 'sem_5' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_5' is within clusters]

Hybrid model. Family: gaussian. Link: identity.

+--------------------------------------+
|             Variable |     model     |
|----------------------+---------------|
| Positive_disc01      |               |
|     R__Teach_SCSTOBC |    -0.0047    |
|       R__course1_com |     0.0093    |
|             R__sem_1 |    -0.0125    |
|             R__sem_2 |    -0.0133    |
|             R__sem_3 |    -0.0122    |
|             R__sem_4 |    -0.0085    |
|             R__sem_5 |  (omitted)    |
|      W__stud_SCSTOBC |    -0.0096**  |
|    W__Teach_nature_1 |     0.0006    |
|    W__Teach_nature_2 |  (omitted)    |
|    W__Teach_gender_1 |     0.0061    |
|       W__course1_eco |     0.0064    |
|       W__course1_eng |     0.0048    |
|       W__course1_hin |    -0.0139    |
|       W__course1_his |     0.0146    |
|       W__course1_mat |     0.0155    |
|       W__course1_pol |  (omitted)    |
| W__attendence_perc~t |     0.0001    |
|      B__stud_SCSTOBC |     0.0975*   |
|    B__Teach_nature_1 |     0.0052    |
|    B__Teach_nature_2 |  (omitted)    |
|    B__Teach_gender_1 |     0.0052    |
|       B__course1_eco |     0.0357*   |
|       B__course1_eng |     0.0314*   |
|       B__course1_hin |     0.0170    |
|       B__course1_his |     0.0454**  |
|       B__course1_mat |     0.0499*** |
|       B__course1_pol |  (omitted)    |
| B__attendence_perc~t |     0.0002    |
|                _cons |    -0.0477    |
|----------------------+---------------|
|   var(_cons[g~SCS~r])|               |
|                _cons |     0.0004*** |
|----------------------+---------------|
| var(e.Positive_di~01)|               |
|                _cons |     0.0277*** |
|----------------------+---------------|
| Statistics           |               |
|                   ll |  3721.2572    |
|                 chi2 |    42.3196    |
|                    p |     0.0119    |
|                  aic | -7388.5143    |
|                  bic | -7193.5905    |
+--------------------------------------+
   Legend: * p<.05; ** p<.01; *** p<.001
Level 1: 10091 units. Level 2: 150 units.

Tests of the random effects assumption:
  _b[B__stud_SCSTOBC] = _b[W__stud_SCSTOBC]; p-value: 0.0061
  _b[B__Teach_nature_1] = _b[W__Teach_nature_1]; p-value: 0.6677
  _b[B__Teach_nature_2] = _b[W__Teach_nature_2]; p-value:      .
  _b[B__Teach_gender_1] = _b[W__Teach_gender_1]; p-value: 0.9516
  _b[B__course1_eco] = _b[W__course1_eco]; p-value: 0.2339
  _b[B__course1_eng] = _b[W__course1_eng]; p-value: 0.2815
  _b[B__course1_hin] = _b[W__course1_hin]; p-value: 0.1669
  _b[B__course1_his] = _b[W__course1_his]; p-value: 0.2736
  _b[B__course1_mat] = _b[W__course1_mat]; p-value: 0.1529
  _b[B__course1_pol] = _b[W__course1_pol]; p-value:      .
  _b[B__attendence_percent] = _b[W__attendence_percent]; p-value: 0.6808

regards,
ajay

Clyde Schechter

Join Date: Apr 2014

Posts: 30103
#2

12 Jan 2023, 10:53

Not with the regression you have done. You have used a linear probability model and there is no systematic relationship between regression coefficient and odds ratio in that model.

If you want odds ratios, then you should be using a logistic regression, whereby exponentiating the coefficient will give you an odds ratio. -xthybrid- can do logistic regressions: you need to specify the options -link(logit)- and -family(bernoulli)-.

ajay pasi

Join Date: Jan 2019
Posts: 170

12 Jan 2023, 11:24

Hello Prof Clyde,
I have done as you suggested. I guess now the xthybrid reports odds ratios, am I right prof. Clyde?

Code:

. xthybrid Positive_disc01    stud_SCST stud_OBC Teach_SCST Teach_OBC        Teach_nature_1 Teach_nature_2 Tea
> ch_gender_1   course1_com course1_eco course1_eng course1_hin course1_his course1_mat course1_pol   sem_1 se
> m_2 sem_3 sem_4 sem_5 attendence_percent   , clusterid ( group_teacherID_paper ) se test  p star link(logit)
>  family(bernoulli)

The variable 'course1_com' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'course1_com' is within clusters]
The variable 'sem_1' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_1' is within clusters]
The variable 'sem_2' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_2' is within clusters]
The variable 'sem_3' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_3' is within clusters]
The variable 'sem_4' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_4' is within clusters]
The variable 'sem_5' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_5' is within clusters]

Hybrid model. Family: bernoulli. Link: logit.

+--------------------------------------+
|             Variable |     model     |
|----------------------+---------------|
| Positive_disc01      |               |
|       R__course1_com |    -2.4203*** |
|             R__sem_1 |    -0.4151    |
|             R__sem_2 |    -0.1849    |
|             R__sem_3 |    -0.0392    |
|             R__sem_4 |     0.7021*   |
|             R__sem_5 |  (omitted)    |
|         W__stud_SCST |    -0.2015**  |
|          W__stud_OBC |    -0.1531*   |
|        W__Teach_SCST |    -0.0832    |
|         W__Teach_OBC |     0.0803    |
|    W__Teach_nature_1 |    -0.0813    |
|    W__Teach_nature_2 |  (omitted)    |
|    W__Teach_gender_1 |    -0.3224*   |
|       W__course1_eco |    -0.1369    |
|       W__course1_eng |     0.2248    |
|       W__course1_hin |    -0.3187    |
|       W__course1_his |    -0.2250    |
|       W__course1_mat |    -0.3137    |
|       W__course1_pol |  (omitted)    |
| W__attendence_perc~t |     0.0116*** |
|         B__stud_SCST |    -1.1326    |
|          B__stud_OBC |     0.5806    |
|        B__Teach_SCST |    -0.2215    |
|         B__Teach_OBC |     0.0007    |
|    B__Teach_nature_1 |    -0.2191    |
|    B__Teach_nature_2 |  (omitted)    |
|    B__Teach_gender_1 |    -0.4800    |
|       B__course1_eco |    -1.8609**  |
|       B__course1_eng |    -2.9818*** |
|       B__course1_hin |    -0.8474*   |
|       B__course1_his |    -2.3690*** |
|       B__course1_mat |    -1.9984**  |
|       B__course1_pol |  (omitted)    |
| B__attendence_perc~t |     0.0205    |
|                _cons |     1.2488    |
|----------------------+---------------|
|   var(_cons[g~ID_~r])|               |
|                _cons |     1.2114*** |
|----------------------+---------------|
| Statistics           |               |
|                   ll | -5298.9768    |
|                 chi2 |   166.0434    |
|                    p |     0.0000    |
|                  aic | 10659.9535    |
|                  bic | 10882.1763    |
+--------------------------------------+
   Legend: * p<.05; ** p<.01; *** p<.001
Level 1: 9590 units. Level 2: 132 units.

Tests of the random effects assumption:
  _b[B__stud_SCST] = _b[W__stud_SCST]; p-value: 0.5615
  _b[B__stud_OBC] = _b[W__stud_OBC]; p-value: 0.7637
  _b[B__Teach_SCST] = _b[W__Teach_SCST]; p-value: 0.7347
  _b[B__Teach_OBC] = _b[W__Teach_OBC]; p-value: 0.8461
  _b[B__Teach_nature_1] = _b[W__Teach_nature_1]; p-value: 0.6621
  _b[B__Teach_nature_2] = _b[W__Teach_nature_2]; p-value:      .
  _b[B__Teach_gender_1] = _b[W__Teach_gender_1]; p-value: 0.6971
  _b[B__course1_eco] = _b[W__course1_eco]; p-value: 0.0135
  _b[B__course1_eng] = _b[W__course1_eng]; p-value: 0.0000
  _b[B__course1_hin] = _b[W__course1_hin]; p-value: 0.4710
  _b[B__course1_his] = _b[W__course1_his]; p-value: 0.0064
  _b[B__course1_mat] = _b[W__course1_mat]; p-value: 0.0503
  _b[B__course1_pol] = _b[W__course1_pol]; p-value:      .
  _b[B__attendence_percent] = _b[W__attendence_percent]; p-value: 0.5210

Also, Prof., I have a question about the interpretation. W__stud_SCST (-0.2015**) is significant, whereas B__stud_SCST (-1.1326) is not significant. How should I interpret this difference? (What sense does it make, if any?)
regards,
ajay

Last edited by ajay pasi; 12 Jan 2023, 11:32.

ajay pasi

Join Date: Jan 2019

Posts: 170
#4

12 Jan 2023, 11:40

I think it does not report odds ratios yet (there are minus signs in front of the estimates)!
Clyde Schechter

Join Date: Apr 2014

Posts: 30103
#5

12 Jan 2023, 11:47

No, these are not odds ratios. They are coefficients. To get odds ratios you have to exponentiate them, i.e., apply the -exp()- function.

The W__ prefix designates a within-group coefficient, and the B__ prefix designates a between-group coefficient. So the within-teacher effect of stud_SCST is -0.2015 and the between-teacher effect of stud_SCST is -1.1326. It is not uncommon for within-group and between-group effects to differ. For a clear toy example of what this looks like:

Code:

clear
set obs 5
gen panel_id = _n
expand 2
set seed 1234
by panel_id, sort: gen y = 4*panel_id - _n + 3 + rnormal(0, 0.5)
by panel_id: gen x = panel_id + _n
xtset panel_id
xtreg y x, fe
regress y x

// GRAPH THE DATA TO SHOW WHAT'S HAPPENING
separate y, by(panel_id)
graph twoway connect y? x || lfit y x

As to the fact that one is "significant" and the other is "not significant," that means nothing at all. Even for people who believe in the usefulness of the concept of statistical significance (and I am not one of those) the difference between significant and not significant is not, itself, significant.

One might have a useful discussion about the precision of these effect estimates if the standard errors were included in the output. And you did request them with the -se- option. It seems, however, that -xthybrid- has a bug whereby when you also specify the -star- output, you get only the coefficients and the dreaded significance stars--all the other output is suppressed. Actually, when I use -xthybrid-, I usually use it with the -full- option. That way you get output in the same layout as you normally get from StataCorp's regression commands, which I find more helpful.
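As a rough sketch of the two points above (the -full- layout and exponentiating by hand), something like the following could be run from a do-file. The variable list is shortened for illustration, and the sketch assumes that -xthybrid- leaves its estimates in memory under the W__/B__ names shown in the output, which is worth checking with -ereturn list- before relying on it.

```stata
* Sketch only: shortened variable list; names are taken from this thread.
* The /// line continuation works in a do-file, not in the Command window.
xthybrid Positive_disc01 stud_SCST stud_OBC attendence_percent, ///
    clusterid(group_teacherID_paper) family(bernoulli) link(logit) full

* Exponentiate the log-odds coefficients to get odds ratios.
* Assumes the coefficients can be referenced as _b[W__...] and _b[B__...].
display "within  OR: " exp(_b[W__stud_SCST])
display "between OR: " exp(_b[B__stud_SCST])
```

This gives point estimates only; confidence limits still need the standard errors.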

ajay pasi

Join Date: Jan 2019
Posts: 170

12 Jan 2023, 12:26

The regression you asked for--->

Code:

. xthybrid Positive_disc01    stud_SCST stud_OBC Teach_SCST Teach_OBC        Teach_nature_1 Teach_nature_2 Tea
> ch_gender_1   course1_com course1_eco course1_eng course1_hin course1_his course1_mat course1_pol   sem_1 se
> m_2 sem_3 sem_4 sem_5 attendence_percent   , clusterid ( group_teacherID_paper ) se test    link(logit) fami
> ly(bernoulli)

The variable 'course1_com' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'course1_com' is within clusters]
The variable 'sem_1' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_1' is within clusters]
The variable 'sem_2' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_2' is within clusters]
The variable 'sem_3' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_3' is within clusters]
The variable 'sem_4' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_4' is within clusters]
The variable 'sem_5' does not vary sufficiently within clusters
and will not be used to create additional regressors.
[~0% of the total variance in 'sem_5' is within clusters]

Hybrid model. Family: bernoulli. Link: logit.

+-----------------------------------+
|             Variable |   model    |
|----------------------+------------|
| Positive_disc01      |            |
|       R__course1_com |    -2.4203 |
|                      |     0.6339 |
|             R__sem_1 |    -0.4151 |
|                      |     0.4984 |
|             R__sem_2 |    -0.1849 |
|                      |     0.4647 |
|             R__sem_3 |    -0.0392 |
|                      |     0.3422 |
|             R__sem_4 |     0.7021 |
|                      |     0.3005 |
|             R__sem_5 |  (omitted) |
|                      |            |
|         W__stud_SCST |    -0.2015 |
|                      |     0.0638 |
|          W__stud_OBC |    -0.1531 |
|                      |     0.0642 |
|        W__Teach_SCST |    -0.0832 |
|                      |     0.1206 |
|         W__Teach_OBC |     0.0803 |
|                      |     0.1262 |
|    W__Teach_nature_1 |    -0.0813 |
|                      |     0.1063 |
|    W__Teach_nature_2 |  (omitted) |
|                      |            |
|    W__Teach_gender_1 |    -0.3224 |
|                      |     0.1461 |
|       W__course1_eco |    -0.1369 |
|                      |     0.3388 |
|       W__course1_eng |     0.2248 |
|                      |     0.3542 |
|       W__course1_hin |    -0.3187 |
|                      |     0.5999 |
|       W__course1_his |    -0.2250 |
|                      |     0.3871 |
|       W__course1_mat |    -0.3137 |
|                      |     0.5212 |
|       W__course1_pol |  (omitted) |
|                      |            |
| W__attendence_perc~t |     0.0116 |
|                      |     0.0015 |
|         B__stud_SCST |    -1.1326 |
|                      |     1.6022 |
|          B__stud_OBC |     0.5806 |
|                      |     2.4399 |
|        B__Teach_SCST |    -0.2215 |
|                      |     0.3900 |
|         B__Teach_OBC |     0.0007 |
|                      |     0.3902 |
|    B__Teach_nature_1 |    -0.2191 |
|                      |     0.2969 |
|    B__Teach_nature_2 |  (omitted) |
|                      |            |
|    B__Teach_gender_1 |    -0.4800 |
|                      |     0.3778 |
|       B__course1_eco |    -1.8609 |
|                      |     0.6084 |
|       B__course1_eng |    -2.9818 |
|                      |     0.5823 |
|       B__course1_hin |    -0.8474 |
|                      |     0.4211 |
|       B__course1_his |    -2.3690 |
|                      |     0.6841 |
|       B__course1_mat |    -1.9984 |
|                      |     0.6834 |
|       B__course1_pol |  (omitted) |
|                      |            |
| B__attendence_perc~t |     0.0205 |
|                      |     0.0138 |
|                _cons |     1.2488 |
|                      |     1.1457 |
|----------------------+------------|
|   var(_cons[g~ID_~r])|            |
|                _cons |     1.2114 |
|                      |     0.1854 |
|----------------------+------------|
| Statistics           |            |
|                   ll | -5298.9768 |
|                 chi2 |   166.0434 |
|                    p |     0.0000 |
|                  aic | 10659.9535 |
|                  bic | 10882.1763 |
+-----------------------------------+
                         Legend: b/se
Level 1: 9590 units. Level 2: 132 units.

Tests of the random effects assumption:
  _b[B__stud_SCST] = _b[W__stud_SCST]; p-value: 0.5615
  _b[B__stud_OBC] = _b[W__stud_OBC]; p-value: 0.7637
  _b[B__Teach_SCST] = _b[W__Teach_SCST]; p-value: 0.7347
  _b[B__Teach_OBC] = _b[W__Teach_OBC]; p-value: 0.8461
  _b[B__Teach_nature_1] = _b[W__Teach_nature_1]; p-value: 0.6621
  _b[B__Teach_nature_2] = _b[W__Teach_nature_2]; p-value:      .
  _b[B__Teach_gender_1] = _b[W__Teach_gender_1]; p-value: 0.6971
  _b[B__course1_eco] = _b[W__course1_eco]; p-value: 0.0135
  _b[B__course1_eng] = _b[W__course1_eng]; p-value: 0.0000
  _b[B__course1_hin] = _b[W__course1_hin]; p-value: 0.4710
  _b[B__course1_his] = _b[W__course1_his]; p-value: 0.0064
  _b[B__course1_mat] = _b[W__course1_mat]; p-value: 0.0503
  _b[B__course1_pol] = _b[W__course1_pol]; p-value:      .
  _b[B__attendence_percent] = _b[W__attendence_percent]; p-value: 0.5210

Last edited by ajay pasi; 12 Jan 2023, 12:45.

ajay pasi

Join Date: Jan 2019

Posts: 170
#7

12 Jan 2023, 12:31

And the graph that you asked for, is--->
clear
set obs 5
gen collegerollno = _n
expand 2
set seed 1234
by collegerollno , sort: gen y = 4* collegerollno - _n + 3 + rnormal(0, 0.5)
by collegerollno : gen x = collegerollno + _n
xtset collegerollno
xtreg y x, fe
regress y x
separate y, by( collegerollno )
graph twoway connect y? x || lfit y x

regards,
ajay
Clyde Schechter

Join Date: Apr 2014

Posts: 30103
#8

12 Jan 2023, 12:48

So:

Code:

. display exp(-.2015), exp(-2.015 - 1.96*.0638), exp(-.2015 + 1.96*0.0638)
.81750358 .1176492 .92639738

. display exp(-1.1326), exp(-1.1326 - 1.96*1.6022), exp(-1.1326 + 1.96*1.6022)
.32219446 .01394113 7.4462608

gives us odds ratios and confidence intervals for the within-group_teacherID_paper (top) and between (bottom) effects of stud_SCST on the odds of Positive_disc01.
Note: When you have the -xthybrid- results in memory, you can more easily arrive at the above results by using the -lincom- command with the -or- option.
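A minimal sketch of that -lincom- route, assuming the logit -xthybrid- results are still the active estimates and that the coefficients can be referenced by the W__/B__ names printed in the output (if Stata complains, the coefficient may need an equation-name prefix):

```stata
* Sketch: odds ratios with 95% confidence intervals straight from the
* stored estimates. The -or- option reports exp(b) and exp(CI limits).
lincom _b[W__stud_SCST], or    // within-cluster effect of stud_SCST
lincom _b[B__stud_SCST], or    // between-cluster effect of stud_SCST
```

Letting Stata do the arithmetic avoids retyping coefficients and standard errors by hand.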

So, for the within effect we have an odds ratio (to two decimal places) of 0.82, 95% CI 0.12, 0.93. For the between effect we have 0.32, 95% CI 0.01, 7.45. The within effect means that, on average, given two randomly selected observations from the same group_teacherID_paper whose values of stud_SCST differ by 1, the odds of Positive_disc01 are 0.82 times as high (18% lower) for the one with the higher value of stud_SCST as for the one with the lower value. Notice that the confidence interval here is pretty wide: the data are consistent with an astronomically strong effect (0.12) or with a rather modest one (0.93). So this effect size is measured with poor precision.

The situation for the between effect is even worse. While, on average, given two randomly selected observations from different group_teacherID_paper clusters whose average values of stud_SCST differ by 1, the odds of Positive_disc01 are 0.32 times as high (68% lower) for the one with the higher average value of stud_SCST as for the one with the lower value, this confidence interval is enormous. The data are consistent with an OR anywhere between 0.01 and 7.45--that is, with an astronomically large (and, in almost any real world situation, totally implausible) effect of 99% lower odds, or with an astronomically large (and, again, in almost any real world situation, totally implausible) effect of 645% higher odds. This one is so imprecise that we cannot even assert with confidence in which direction the effect might be. And since in the real world odds ratios of 0.01 or 7.45 almost never happen, the results are completely uninformative for this between effect.

The code I showed that generated the graph was intended to show you how a within and a between effect can be radically different. In this case you can see that within each collegerollno, the effect of increasing x is a decrease in y, whereas between collegerollno values, y increases with increasing x. In your data the difference between the estimated effects is not as radical as this, but I just wanted to make as clear as possible that there is no necessary connection between the value of a within-effect and a between-effect, so you should never be surprised to see such differences when you find them. The existence of such a difference, itself, requires no explanation, though in particular instances it might be interesting, or even important, to understand what drives the difference.
ajay pasi

Join Date: Jan 2019

Posts: 170
#9

12 Jan 2023, 13:32

I am still reading and understanding your post...

Found a slight error: the lower limit was computed with -2.015 instead of -.2015. Corrected:

display exp(-.2015), exp(-.2015 - 1.96*.0638), exp(-.2015 + 1.96*0.0638)
.81750358 .72140975 .92639738
ajay pasi

Join Date: Jan 2019

Posts: 170
#10

12 Jan 2023, 13:46

Clyde sir, I read your analytical and substantiated view of my results. I should say, it is quite informative and rich. I am keeping all your suggestions stored, to proceed with further analysis. Thanks again.

regards,
ajay
Clyde Schechter

Join Date: Apr 2014

Posts: 30103
#11

12 Jan 2023, 13:49

Thanks for picking up that error and posting a correction. Evidently, it changes the interpretation, as the confidence interval is now 0.72 to 0.93, which is comfortably narrow and says that the estimated OR of 0.82 is not just a wild guess but that the data really do narrow it down to something pretty close to that.
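To reduce the risk of this kind of typing slip, the hand calculation can be sketched with the coefficient and standard error stored once in scalars. The scalar names b_w, se_w, and z are made up for illustration; the two numbers are the within-effect estimate and standard error of stud_SCST from the output earlier in the thread.

```stata
* Sketch: odds ratio and 95% CI from a log-odds coefficient and its SE.
* b_w, se_w and z are hypothetical scalar names chosen for this example.
scalar b_w  = -0.2015             // W__stud_SCST coefficient
scalar se_w =  0.0638             // its standard error
scalar z    = invnormal(0.975)    // normal critical value, about 1.96

display "OR = " %5.2f exp(b_w) ///
    ", 95% CI " %5.2f exp(b_w - z*se_w) " to " %5.2f exp(b_w + z*se_w)
```

Because each number is typed only once, a misplaced decimal point cannot affect just one of the limits.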
ajay pasi

Join Date: Jan 2019

Posts: 170
#12

12 Jan 2023, 13:59

yes sir.