
  • -xtreg- estimates that are close to zero change each time the command is run

    I am running commands of the form xtreg y x. This yields different coefficient estimates for x each time it is run: -3.22e-17, 9.20e-18, etc. Running an equivalent regression with areg y x, a(z) yields the same estimate every time: 8.00e-17. I think the true estimate is zero, but I'd be happy with 8.00e-17.

    The reason I care is that sometimes the coefficient is rounded to positive zero, and sometimes it is rounded to negative zero. This means the regression table gets changed in the repository even when it shouldn't be, with the minus symbol getting added or removed at random. Has anyone else dealt with this issue?

  • #2
    Maximum Likelihood estimators can get different estimates depending on the way they set the initial values and how they iterate. In general, these differences are so slight you'd never notice, but in your case, they're noticeable and annoying. Try "set seed 12345" (where 12345 is an arbitrary number) before running/re-running your models. If you set the seed, it should get the same estimates each time.


    • #3
      Can you show us code to replicate this problem?

      Best
      Daniel


      • #4
        Originally posted by ben earnhart:
        Try "set seed 12345" (where 12345 is an arbitrary number) before running/re-running your models. If you set the seed, it should get the same estimates each time.
        Yep. Thanks. I didn't realize that I needed to set a seed in every do-file; I thought setting it once in the parent do-file would suffice. I'm running a bunch of regressions in child do-files, each of which is called from a parent do-file. What is the best practice for my situation? Choose a different random seed for each child do-file? That seems like overkill. I guess it probably doesn't matter much as long as I'm not doing anything important with randomness.


        • #5
          I'm not an expert, but as far as I know it doesn't matter what you set as the seed, and there is no reason to deliberately change the seed. For a well-written program and a model it will allow to converge, the seed shouldn't matter out to the very tiny digits. If setting different seeds were to lead to seriously different results, then I'd think there was something wrong with the program and/or the model.


          • #6
            This thread might be misleading and is incomplete.

            By comparison with areg the initial post implies that xtreg ,fe has been used, which is implemented using OLS not ML. The syntax provided in the initial post lacks the fe option, however, which implies a (standard) random-effects model has been fit. This model is implemented as the default xtreg ,re and is based on a (F)GLS estimator - not an ML estimator either.

            If ben's solution works, then the original call must have been xtreg ,mle.
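
            To make the distinction concrete, here is a minimal sketch of the three calls. It assumes Stata's bundled nlswork example dataset as a stand-in for the poster's data; the variables ln_wage, age, tenure and idcode belong to that dataset, not to the original problem.

            ```stata
            * Assumption: nlswork is only a stand-in for the poster's data.
            webuse nlswork, clear
            xtset idcode

            xtreg ln_wage age tenure, fe    // within (fixed-effects) estimator, OLS-based
            xtreg ln_wage age tenure        // default: random effects (re), GLS-based
            xtreg ln_wage age tenure, mle   // random effects fit by maximum likelihood
            ```

            Only the last call iterates a likelihood; the first two are computed in closed form.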

            Best
            Daniel


            • #7
              Daniel, to be more specific, my command is xtreg y i.post##i.treatment a if (b == 1) & (c == 2), fe i(z) vce(cl w) nonest dfadj. Ben's solution does work.


              • #8
                That is very interesting, Nils. Thanks for getting back.

                Does anyone have an explanation for why setting the seed affects xtreg ,fe where, as far as I can see, the estimation is done by _regress, which I understand from the documentation to be an implementation of OLS? Does this have something to do with how the X matrix is inverted internally? Where is the "random" element, controlled by the seed, coming from?

                Edit:

                After reading a bit in the manuals and ado-files, I am almost certain that while Ben's solution works, it does not work for the reasons given. I do not see where an ML estimator is involved here. A more likely explanation is that Nils has some other random-based elements somewhere in his do-files that cause the results to differ. I would be highly surprised if setting a seed did indeed affect _regress.

                This might even indicate a bug in the do-files. See http://www.stata.com/statalist/archi.../msg00582.html for related discussion.

                Best
                Daniel
                Last edited by daniel klein; 06 Jan 2015, 15:45.


                • #9
                  Daniel, you're correct. Sorry. I didn't test Ben's proposal thoroughly enough. Without -set seed- I thought I was getting positive zero about half the time, but it's actually more like 10% of the time. I reran my code with -set seed- something like twenty times, and when the sign didn't change I assumed that setting a seed was working. That was a bad assumption.

                  I have attached data and code to reproduce the issue. I discovered that sorting the data on the fixed effect variable partially resolves the issue, but I don't understand why. When the data is sorted on the fixed effect variable, the reported coefficient no longer changes with each re-estimation. However, the value of that "stable" coefficient does change each time I rerun this entire code snippet.

                  Code:
                  version 13.1
                  
                  set seed 155575 // Appears to make no difference whether this is included.
                  
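                  capture program drop display_xtreg_coef
                  program define display_xtreg_coef
                      // Re-estimate the same model 10 times and print one coefficient each time.
                      forval i = 1/10 {
                          qui xtreg y i.post##i.treatment, fe i(z)
                          display _b[1.post]
                      }
                  end

                  insheet using ~/Desktop/reprodata.txt, clear
                  display_xtreg_coef   // unsorted data: coefficient varies between estimations
                  sort z
                  display_xtreg_coef   // sorted on z: stable within a run, but not across runs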


                  • #10
                    Hm ... I have no explanation for this. The seed does not affect the results, but, as I mentioned, this does not surprise me. What is surprising to me is that the results appear to be affected by the sorting of the dataset. Why does this happen?

                    If nobody comes up with an explanation, this might well be something Stata Tech-Support would be interested in.

                    By the way, I can reproduce this with Stata 12.1, fully updated, on a Win-7 machine.

                    Best
                    Daniel


                    • #11
                      Thanks for helping, Daniel. I apologize for leading you on a goose chase earlier.


                      • #12
                        Yes, and I have reproduced it with Stata 13.1, fully updated, on a Win-7 machine. I agree that Tech Support should be notified of this.


                        • #13
                          Stata tech support has identified the issue and will roll out the fix in a future update. My workaround above can be improved by changing -sort- to -sort, stable-. This way the estimates will be the same not just between re-estimations but also between entire code runs. In fact, it sounds like the way this will be officially fixed is to add -sort, stable-. Thank you to Gustavo Sanchez at Stata and to an unnamed developer for resolving the issue.
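
                          A minimal sketch of the improved workaround follows. Everything about the data is an assumption for illustration: the simulated variables merely reuse the names from my earlier snippet (y, post, treatment, z), and this toy dataset is not expected to reproduce the original instability. It only shows where -sort, stable- goes and how to check that two estimations agree.

                          ```stata
                          * Hypothetical toy data; names mirror the earlier snippet.
                          clear
                          set seed 155575
                          set obs 200
                          generate z = ceil(_n/10)               // 20 groups of 10 observations
                          generate byte post = mod(_n, 2)        // varies within each group
                          generate byte treatment = mod(z, 2)    // constant within each group
                          generate y = rnormal()

                          * Workaround: fix the row order deterministically before estimating.
                          sort z, stable
                          xtset z

                          quietly xtreg y i.post##i.treatment, fe
                          scalar b1 = _b[1.post]
                          quietly xtreg y i.post##i.treatment, fe
                          display b1 == _b[1.post]               // 1 if both estimates are bit-identical
                          ```

                          In a real project the -sort, stable- line would go right after loading the data in each child do-file, before any call to xtreg.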


                          • #14
                            Nils - what was the explanation provided by Stata Tech Support? And does "officially fixed" mean that there is a fix to xtreg on the way?


                            • #15
                              Mark: The preliminary explanation was that expanding factor variables used an unstable sort. As I understand it, changing it to a stable sort resolved the issue. An update will contain that change.
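
                              For anyone unfamiliar with the distinction, here is a small sketch of what an unstable sort means in Stata. The toy variables id and g are made up for illustration. The last line is only my guess at why row order could move a coefficient in the 17th decimal place (floating-point addition is not associative); it is not part of Tech Support's explanation.

                              ```stata
                              * Hypothetical toy data with many ties in the sort key.
                              clear
                              set obs 6
                              generate id = _n
                              generate g  = mod(_n, 2)      // only two distinct values

                              sort g                        // order of id within equal g is not guaranteed
                              list id g, sepby(g)

                              sort id                       // restore the original order
                              sort g, stable                // ties keep their previous relative order
                              list id g, sepby(g)

                              * Possible mechanism (assumption): summation order changes the last bits.
                              display (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)
                              ```

                              With -sort g- the listing within each value of g can differ from run to run, while -sort g, stable- always leaves id ascending within g.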
