Can you include simulated residuals with predict?

John Eiler

Join Date: Nov 2019

Posts: 50
#1

Can you include simulated residuals with predict?

09 Sep 2022, 11:44

So I'm doing a regression on some data and then want to create predicted values on a different data set, but I want a random draw of residuals included. Something like:

Code:

reg y x
use other_data, clear
predict yhat

But I don't want the plain predicted values. I want simulated residuals added, so that roughly half of the simulated values are higher and half are lower than the plain predicted values.

Obviously I could do this by hand or do it in mi. But is there an easier way?
John Eiler

Join Date: Nov 2019

Posts: 50
#2

09 Sep 2022, 13:24

Can it be done with the post-estimation "forecast" command? Digging into the documentation on that now. I'm pretty sure the logit models allow a simulation option, but can't remember how that works.

I had to remind myself how to do it by hand for a simple regression, but it's easy enough:

Code:

reg y x
use other_data, clear
predict yhat
replace yhat = yhat + rnormal(0,e(rmse))

Of course if you are doing "robust" or anything else this is a lot harder.
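A self-contained sketch of the by-hand approach above, offered as an illustration rather than a definitive recipe. The shipped auto.dta stands in for both the estimation data and other_data, and the names sigma_hat and y_sim are hypothetical. It relies on the fact that e() results stay in memory when a new dataset is loaded.

```stata
* Hypothetical setup: auto.dta stands in for the estimation data
sysuse auto, clear
regress price weight

* e(rmse) is the estimated residual standard deviation; copy it to a scalar
scalar sigma_hat = e(rmse)

* Stand-in for "use other_data, clear"; e() results survive loading new data
sysuse auto, clear
set seed 12345
predict double yhat, xb

* Simulated outcome = linear prediction + a normal draw with sd = RMSE
generate double y_sim = yhat + rnormal(0, sigma_hat)

* Roughly half of the simulated values should land above yhat
count if y_sim > yhat
```

Note that this treats the coefficients as fixed, so it adds residual noise only and ignores estimation uncertainty in the coefficients.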

Last edited by John Eiler; 09 Sep 2022, 13:27.
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#3

09 Sep 2022, 13:43

You are not explaining very well what you want, because by reading your first post I could never foresee the resolution in your post 2.

You might want to try and explain better what exactly you are doing and what you need.
John Eiler

Join Date: Nov 2019

Posts: 50
#4

09 Sep 2022, 14:03

Originally posted by Joro Kolev

You are not explaining very well what you want, because by reading your first post I could never foresee the resolution in your post 2.

You might want to try and explain better what exactly you are doing and what you need.

Sorry, I was hoping "simulated residuals" would be more obvious. Maybe an example would help. Suppose beta is 1 and there is no constant. Here is the data, with the extra column that I want:

Code:

y   x   yhat   resid   y_random
1   2   2      -1      2.4
2   2   2       0      1.6
3   2   2       1      2.0

yhat & resid come straight out of predict. They are deterministic. I would like a random or "simulated" column like "y_random".
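The toy example can be reproduced along these lines, as a sketch only. With this data, a no-constant regression gives a coefficient of 1 and an RMSE of 1, so yhat and resid match the table. The y_random column depends on the seed (the value 2022 is an arbitrary choice) and will not reproduce the illustrative 2.4, 1.6, 2.0.

```stata
clear
input y x
1 2
2 2
3 2
end

* No constant, as in the example; the fitted slope is 1
regress y x, noconstant

* Deterministic pieces that come straight out of predict
predict double yhat, xb
predict double resid, residuals

* Simulated column: linear prediction plus a normal draw with sd = e(rmse)
set seed 2022
generate double y_random = yhat + rnormal(0, e(rmse))

list y x yhat resid y_random
```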

It's not too hard to do in "mi". It would just take 6 or 7 lines of code very specific to mi. It's not that bad, I guess, just overkill for something that seems like it would be pretty simple for StataCorp to implement. I assume it probably has been implemented, if I could just find it?

Last edited by John Eiler; 09 Sep 2022, 14:09.
John Eiler

Join Date: Nov 2019

Posts: 50
#5

12 Sep 2022, 11:17

Meh, I guess I just gotta do mi. To put a bow on it, you can do something like:

Code:

mi set wide
mi register imputed y // where y is missing for some subset of observations
mi register regular x
mi impute reg y x, add(1)

That's actually to get simulated values of y rather than the residuals themselves, but same idea. Also, to make it work you have to be a little tricky with setting y to missing in the observations you want simulated, but in any event you get simulated values. I just thought there should be an easy way via "predict" to get simulated residuals instead of actual, but I guess not.
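A runnable sketch of the mi route, under stated assumptions: auto.dta stands in for the real data, and the outcome (price here) is blanked out in the rows to be simulated, since mi impute regress fills in missing values of the imputed variable. The seed and the choice of foreign cars as the rows to simulate are arbitrary.

```stata
sysuse auto, clear

* Hypothetical: set the outcome to missing in the rows to be simulated
replace price = . if foreign == 1

mi set wide
mi register imputed price
mi register regular weight

* One imputation; rseed() makes the random draws reproducible
mi impute regress price weight, add(1) rseed(12345)

* In wide style the first imputation is stored in _1_price
summarize _1_price if foreign == 1
```

Unlike adding rnormal(0, e(rmse)) to the linear prediction, mi impute regress also draws the regression parameters from their posterior, so the simulated values reflect parameter uncertainty as well as residual noise.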

Last edited by John Eiler; 12 Sep 2022, 11:22.
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1396
#6

12 Sep 2022, 11:42

John Eiler just so I understand the issue more clearly, what was the problem with your own solution in #2?
John Eiler

Join Date: Nov 2019

Posts: 50
#7

12 Sep 2022, 13:25

Originally posted by Hemanshu Kumar

John Eiler just so I understand the issue more clearly, what was the problem with your own solution in #2?

For a simple regression that's fine. But if you have robust standard errors, bootstrapping, etc., it's way harder. E.g., what if you had something as common as "reg y x, robust"?

Maybe that's why it's not implemented via "predict": it would require more code to handle the different error structures, whereas the current version of predict gives you the same predicted values and residuals regardless of the error structure assumed. I mean, it wouldn't be that hard to implement, but it's not zero effort, and you can always do it by hand if you have to.
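For reference, vce(robust) changes only the standard errors. The coefficients, the output of predict, and e(rmse) are the same as without it, so the normal draw from #2 still runs; the open question is whether a constant-variance normal draw is appropriate. One alternative, not proposed anywhere in this thread, is to resample the observed residuals instead of drawing from a normal. The sketch below assumes Stata 14 or later (for runiformint()) and uses auto.dta as a stand-in; ehat, draw_id, resid_pool, and y_sim are hypothetical names.

```stata
* Fit on the estimation data and keep the pool of observed residuals
sysuse auto, clear
regress price weight, vce(robust)
predict double ehat, residuals
keep ehat
drop if missing(ehat)
generate long draw_id = _n
local n_resid = _N
tempfile resid_pool
save `resid_pool'

* Stand-in for "use other_data, clear"; e() results are still in memory
sysuse auto, clear
predict double yhat, xb

* Give each row a random index into the pool (sampling with replacement)
set seed 12345
generate long draw_id = runiformint(1, `n_resid')
merge m:1 draw_id using `resid_pool', keep(master match) nogenerate

* Simulated outcome = linear prediction + a resampled residual
generate double y_sim = yhat + ehat
```

This drops the normality assumption but still treats the residuals as exchangeable across observations, so it does not capture heteroskedasticity that varies with x.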

Last edited by John Eiler; 12 Sep 2022, 13:30.
Erik Ruzek

Join Date: Oct 2017

Posts: 428
#8

12 Sep 2022, 13:38

Perhaps Tomz, Wittenberg, and King's clarify Stata program could get you what you want?
John Eiler

Join Date: Nov 2019

Posts: 50
#9

12 Sep 2022, 14:31

Originally posted by Erik Ruzek

Perhaps Tomz, Wittenberg, and King's clarify Stata program could get you what you want?

Thanks, I'll take a look. Seems like the right idea based on the description.