Potential Stata (MLwiN/Estout) bug: Output of Odds Ratios' Significance Levels
Hi all,
I think I may have inadvertently stumbled upon a bug after using the runmlwin command for the Stata MLwiN addon (from the Centre for Multilevel Modelling) and outputting the results via the estout package.
In short, the (perceived) bug is as follows. After using the runmlwin command to conduct Bayesian analyses on a dataset of > 400,000 with a binary DV, I get the results of these analyses in my Stata console window. When I output these results as odds ratios (OR) to a Word document using the esttab command (part of the estout package), the significance levels of a handful of variables change wildly for, at least as far as I can tell, no logical reason. In other words, a few key variables have completely different significance levels in the Stata console window than in the document produced by the esttab command. Again, as far as I can see, this happens for no logical reason, unless I am being ignorant and missing something stupendously obvious. This occurs across a number of models that I run and is not restricted to a single use case.
Interestingly, however, when I simply output my results to an external document via the esttab command as raw coefficients that haven't been exponentiated to odds ratios, the significance levels of the variables are 'correct' in that they agree with the p-values that the Stata console window outputs. Unfortunately, the main laptop on which I run my Stata estimations is currently away being repaired, so I can't accompany my post with screenshots and copies of these outputs at the moment.
It should be relatively straightforward to determine whether this is a bug or not, although where the bug occurs and which addin is causing it is a different problem to diagnose entirely. Quite simply, if there is no theoretical reason for the p-values of variables to change when a mean for a variable (in line with Bayesian MCMC modelling) is output as an odds ratio as opposed to a raw coefficient, then this can only be a bug. If this is the case, I can get on with trying to diagnose where it lies (whether that be within Stata itself, the estout addon, or the runmlwin package). Any guidance that anybody could offer regarding this would be greatly appreciated.
Many thanks,

Hi William,
Thank you for your reply, and my apologies for not getting back to you sooner; I did not see that there was an additional response to this thread.
There have been some severe delays in getting back the laptop on which I run my regressions. I will be getting it back tomorrow, and I will post a comprehensive response to all of the points that you've raised by the end of next week.
Thanks again,

It is difficult to understand your description, and I agree with Attaullah Shah's request. In particular, among the sources that you list in post #1 for this problem, you omit the possibility of a misunderstanding or error on your part in creating the code you ran. In that context, agreeing that "there is no theoretical explanation" lends at least implicit agreement to the assertion that there is a bug "within Stata itself, the estout addon, or the runmlwin package".
With that said, is the following an accurate restatement of your path?
1. use runmlwin to fit your model; the results appear in Stata's Results window
2. store the results:
Code:
estimates store estimations
3. output the results to a text file, with significance levels:
Code:
esttab estimations using file1, p
4. the results in file1 agree with the results in Stata's Results window in step 1
5. display the results in Stata's Results window, as odds ratios with significance levels:
Code:
esttab, p eform
6. the displayed results include significance levels that "differ wildly" from the significance levels produced in steps 1 and 3
7. output the results to a text file, as odds ratios with significance levels:
Code:
esttab using file2, p eform
8. the results in file2 include significance levels that agree with those in step 5 and "differ wildly" from the significance levels produced in steps 1 and 3
Below is an attempt to produce a reproducible example that does not use MLwiN, but instead uses a standard logistic regression on a binary dependent variable. It does not display the problem you describe. Because I am not confident I understand the sequence of commands you issued to achieve your results, I can't ascribe any cause to this "failure to fail", as it were.
Code:
cls
clear all
webuse lbw
logit low age lwt i.race smoke ptl ht ui
estimates store est
esttab est using e1doc, p replace
type e1doc.txt
esttab, eform p
esttab using e2doc, p eform replace
type e2doc.txt
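As a possible extension of the example above, the sketch below compares the p-value on the coefficient scale with one recomputed on the odds-ratio scale. It is only an illustration under assumptions: it reuses the lbw data, picks smoke as an arbitrary coefficient, and the "naive" odds-ratio test is a hypothetical calculation, not a claim about what esttab or runmlwin does internally.

```stata
* Sketch only: lbw example data; "smoke" is an arbitrary choice of coefficient
webuse lbw, clear
logit low age lwt i.race smoke ptl ht ui

* p-value on the coefficient (log-odds) scale: z = b/se
scalar z_b = _b[smoke] / _se[smoke]
display "p (coefficient scale): " 2 * normal(-abs(z_b))

* Stata's own odds-ratio display reports the same z and p as above
logit, or

* Hypothetical naive Wald test of OR = 1 using the delta-method SE, exp(b)*se
scalar or_hat = exp(_b[smoke])
scalar z_or = (or_hat - 1) / (or_hat * _se[smoke])
display "p (naive OR scale):    " 2 * normal(-abs(z_or))
```

If the two displayed p-values differ noticeably, that shows how a recomputation on the transformed scale could change significance levels even though the underlying test of the coefficient is unchanged.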

Thank you for your reply, Attaullah Shah. The laptop that I run Stata on is being repaired at the moment, as I mentioned in my original post, so unfortunately I cannot post a sample of my dataset etc. The core of my question is more theoretical than technical at the moment. I have broken it down into a few steps below for clarity; see point 5 for the question itself.
1. The Stata console window displays the results of my Bayesian estimation (using the runmlwin command from the MLwiN package), including the p-values for variables.
2. After storing these results using
Code:
est store estimations
I output them to an external document using
Code:
esttab estimations using "C:\Research[...]
3a. Using the esttab command to convert the previous estimation results to odds ratios
Code:
esttab, eform
3b. However, upon doing so the resulting output from step 3a. produces significance levels for some variables which differ wildly from those significance levels displayed for the same variables obtained from steps 1. and 2. above.
4. The values obtained from step 3b. are consistent when outputting them to an external doc using the esttab command from the estout package
Code:
esttab estimations using "C:\Research[...], eform
5. Therefore, my question is whether there is any theoretical reason for the significance levels of variables to change if the coefficients are simply exponentiated to obtain their odds ratios. Put differently, does, or should, transforming a variable from its raw coefficient to its corresponding odds ratio change the value of the variable's associated p-value (and, thus, its significance level)?
My understanding is that there is not any theoretical explanation for 5. And, if that's true, then this is most certainly a bug I've come across.
However, if there is a theoretical explanation for why converting the raw coefficient of a variable to its odds ratio impacts said variable's corresponding p-value, then there's obviously no bug.
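For reference, the usual Wald-test reasoning behind point 5 can be sketched as follows. This assumes the reported p-value is a standard Wald test of a coefficient estimate \hat\beta with standard error se(\hat\beta); whether runmlwin and esttab follow this convention for MCMC output is an assumption, not something established here.

```latex
% Wald test on the coefficient (log-odds) scale
\[
z = \frac{\hat\beta}{\operatorname{se}(\hat\beta)}, \qquad
p = 2\,\Phi\bigl(-|z|\bigr)
\]
% The null hypotheses on the two scales are equivalent
\[
H_0:\ \beta = 0 \iff H_0:\ \mathrm{OR} = e^{\beta} = 1
\]
% Delta-method standard error of the odds ratio
\[
\operatorname{se}\bigl(\widehat{\mathrm{OR}}\bigr) \approx e^{\hat\beta}\,\operatorname{se}(\hat\beta)
\]
% A Wald statistic recomputed on the odds-ratio scale is a different quantity
\[
z_{\mathrm{OR}} = \frac{e^{\hat\beta} - 1}{e^{\hat\beta}\,\operatorname{se}(\hat\beta)} \neq z \quad \text{in general}
\]
```

Under this reasoning the p-value should not change when the test is carried over from the coefficient scale, which is how Stata's own logit with the or option reports it. It can change only if a statistic is recomputed from the exponentiated quantities.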
Kind regards,

1. Please use dataex (from SSC) to show a sample of your dataset
2. Post the Stata commands that you used to generate the results and the error you are referring to
These will increase the chances of a helpful reply from the Statalist members.