  • Can you include simulated residuals with predict?

    So I'm doing a regression on some data and then want to create predicted values on a different data set, but I want a random draw of residuals included. Something like:

    Code:
    reg y x
    use other_data, clear
    predict yhat
    But I don't want the plain predicted values, I want simulated residuals in there such that roughly half of the predicted values are higher and half are lower than the plain predicted values.

    Obviously I could do this by hand or do it in mi. But is there an easier way?

  • #2
    Can it be done with the post-estimation "forecast" command? Digging into the documentation on that now. I'm pretty sure the logit models allow a simulation option, but can't remember how that works.

    I had to remind myself how to do it by hand for a simple regression, but it's easy enough:

    Code:
    reg y x
    use other_data, clear
    predict yhat
    replace yhat = yhat + rnormal(0,e(rmse))
    Of course if you are doing "robust" or anything else this is a lot harder.
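
    For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same by-hand approach. It assumes Stata's shipped auto dataset, with price and weight standing in for y and x; the seed is arbitrary and y_sim is just an illustrative name. It writes the simulated values to a new variable so the plain predictions stay available, and it assumes normal, homoskedastic errors.

    Code:
    sysuse auto, clear
    set seed 12345                      // arbitrary seed, so the draws are reproducible
    regress price weight
    predict double yhat, xb             // deterministic linear prediction
    * add one normal draw per observation, with sd equal to the regression's root MSE
    generate double y_sim = yhat + rnormal(0, e(rmse))
    count if y_sim > yhat               // should be roughly half of the observations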
    Last edited by John Eiler; 09 Sep 2022, 13:27.


    • #3
      You are not explaining very well what you want, because from reading your first post I could never have foreseen the solution in your post #2.

      You might want to try to explain better what exactly you are doing and what you need.


      • #4
        Originally posted by Joro Kolev
        You are not explaining very well what you want, because from reading your first post I could never have foreseen the solution in your post #2.

        You might want to try to explain better what exactly you are doing and what you need.
        Sorry, I was hoping "simulated residuals" would be more obvious. Maybe an example would help. Suppose beta is 1 and there is no constant. Here is the data, with the extra column that I want:

        Code:
        y x yhat resid y_random
        1 2    2    -1      2.4
        2 2    2     0      1.6
        3 2    2     1      2.0
        yhat & resid come straight out of predict. They are deterministic. I would like a random or "simulated" column like "y_random".

        It's not too hard to do in "mi". It would just take 6 or 7 lines of code very specific to mi. It's not that bad, I guess, just overkill for something that seems like it would be pretty simple for StataCorp to implement, and that I assume has probably been implemented already, if I could just find it.
        Last edited by John Eiler; 09 Sep 2022, 14:09.


        • #5
          Meh, I guess I just gotta do mi. To put a bow on it, you can do something like:

          Code:
          mi set wide
          mi register imputed y    // y must be missing for the observations you want simulated
          mi register regular x
          mi impute reg y x, add(1)
          That's actually to get simulated values of y, not residuals, but same idea. Also, to make it work you have to be a little tricky with setting y to missing, but in any event you get simulated values. I just thought there should be an easy way via "predict" to get simulated residuals instead of actual, but I guess not.
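
          A fuller sketch of that workflow, under some assumptions: the auto dataset stands in for the real data, price and weight stand in for y and x, and the last 24 observations play the role of the other dataset, so their outcome is set to missing before imputing. The seed and the 50/24 split are arbitrary.

          Code:
          sysuse auto, clear
          replace price = . in 51/74          // pretend y is unobserved in the "other" data
          mi set wide
          mi register imputed price
          mi register regular weight
          mi impute regress price weight, add(1) rseed(12345)
          * in wide style the imputed copy of price is stored in _1_price
          list price _1_price weight in 51/55

          Note that mi impute regress draws the coefficients and the error variance from their posterior before adding noise, so these values also carry parameter uncertainty, unlike the rnormal() approach in #2.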
          Last edited by John Eiler; 12 Sep 2022, 11:22.


          • #6
            John Eiler just so I understand the issue more clearly, what was the problem with your own solution in #2?


            • #7
              Originally posted by Hemanshu Kumar
              John Eiler just so I understand the issue more clearly, what was the problem with your own solution in #2?
              For a simple regression that's fine. But if you have robust standard errors, bootstrapping, etc., it's way harder. E.g., what if you had something as common as "reg y x, robust"?

              Maybe that's why it's not implemented via "predict": it would require more code to handle the different error structures, whereas the current version of predict gives you the same predicted values and residuals regardless of the error structure assumed. I mean, it wouldn't be that hard to implement, but it's not zero effort, and you can always do it by hand if you have to.
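
              A rough by-hand sketch for the robust case, with the caveat that it is only one possible reading of "simulated": draw one coefficient vector from a normal distribution centered at e(b) with the robust e(V) as its covariance, then add normal noise. It assumes the auto dataset, with price and weight standing in for y and x; the names b, V, s, bw, bc, b_weight, b_cons and y_sim are all illustrative. The noise still uses the single e(rmse), so the draws do not reflect any heteroskedasticity.

              Code:
              sysuse auto, clear
              set seed 12345                      // arbitrary seed
              regress price weight, vce(robust)
              matrix b = e(b)                     // 1 x 2: coefficient on weight, then _cons
              matrix V = e(V)                     // robust variance-covariance matrix
              scalar s = e(rmse)
              preserve
              drawnorm b_weight b_cons, n(1) means(b) cov(V) clear   // one joint draw of the coefficients
              scalar bw = b_weight[1]
              scalar bc = b_cons[1]
              restore
              generate double y_sim = scalar(bc) + scalar(bw)*weight + rnormal(0, scalar(s))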
              Last edited by John Eiler; 12 Sep 2022, 13:30.


              • #8
                Perhaps Tomz, Wittenberg, and King's clarify Stata program could get you what you want?


                • #9
                  Originally posted by Erik Ruzek
                  Perhaps Tomz, Wittenberg, and King's clarify Stata program could get you what you want?
                  Thanks, I'll take a look. Seems like the right idea based on the description.
