Predict values after "skewnreg" command

John Hanser

Join Date: Jan 2018

Posts: 24
#1

01 Dec 2022, 05:51

Dear Statalisters,

I am struggling to solve an arguably simple problem:
I have a variable (min:0, max: 100) that is drawn from a skew-normal distribution.

I want to fit a skew-normal distribution model and compute the fitted values for the integers from 0 to 100.

There is an excellent user-written set of commands for skew-normal distributions (ssc install st0207), in particular the skewnreg and skewrplot commands.

However, I can't figure out how to get this list of hypothetical values between 0 and 100.

Perhaps I am using the predict command incorrectly?

Here is a simple example of my attempts with the auto dataset:

Code:

sysuse auto, clear
// Run skew-normal-regression
** ssc install st0207
skewnreg mpg
// Show fitted values vs. histogram
skewrplot, fitted
// Replace variable values with hypothetical range
set obs 101
egen range = seq(), f(0) t(100)
replace mpg = range
// Predict values from fitted model?
predict mpg_fitted

Any hint is highly appreciated!

Best regards,
John

Last edited by John Hanser; 01 Dec 2022, 05:56.
Nick Cox

Join Date: Mar 2014

Posts: 36054
#2

01 Dec 2022, 06:56

Some confusion here. skewnreg is from the Stata Journal and can't be found on SSC at all, so the commented out ssc command won't work.

From what else I understand I get the impression that you want to stretch predicted values to cover the range from 0 to 100 and what's more for those to be integers too.

An immediate difficulty is that your example just puts a constant into the predicted variable, as there are no predictors.

Perhaps your real problem has predictors, and you want to scale to [0, 100] which could be

Code:

predict predicted
su predicted
gen wanted = 100 * (predicted - r(min)) / (r(max) - r(min))

and then an application of round() produces integers.

Or what you want are percentiles of the predicted response.
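
If percentiles are what is wanted, a minimal sketch could use the empirical cumulative distribution of the predictions. The regress call below is only a stand-in model with one predictor, chosen so the example runs without installing anything; the variable names cum and pctile_rank are made up for illustration.

```stata
sysuse auto, clear

// Stand-in model with one predictor (an assumption for illustration only)
regress mpg weight
predict predicted

// Empirical cumulative distribution of the predictions;
// equal gives tied values the same cumulative value
cumul predicted, generate(cum) equal

// Express the cumulative proportion on a 0-100 scale
generate pctile_rank = 100 * cum
summarize pctile_rank
```

With no predictors in the model, predicted is a constant and every observation gets the same rank, which is the same difficulty noted above.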

The more I think about it, the less I understand what you are doing and trying to do. Sorry this won't help much, and someone else may be able to help more.

Last edited by Nick Cox; 01 Dec 2022, 07:07.
John Hanser

Join Date: Jan 2018

Posts: 24
#3

01 Dec 2022, 07:27

Thank you for the quick reply, Nick.

Sorry that I didn't make myself clear. Please ignore the integer part of the question.

Let me write a few lines about my motivation; maybe that clarifies things:

I want to be able to answer the question: "How likely is it to draw a number between X and X+1 from the process that generated my observations of variable mpg?".
Since I don't know the exact number-generating process, I looked at the histogram, and a skew-normal distribution looked suitable.
Then, I try to fit a skew-normal model with "skewnreg". The resulting "skewrplot" suggests a good fit.

I basically want to access the values of the blue line. Additionally, I want to obtain the estimated probability values for non-plotted values up to 100.

Using "predict" might be a totally wrong approach.

Hope that clarifies my question.

Best,
John

Last edited by John Hanser; 01 Dec 2022, 07:30.
Nick Cox

Join Date: Mar 2014

Posts: 36054
#4

01 Dec 2022, 07:59

That makes your problem clearer. Although it's presumably not your real problem, the small print from skewnreg mpg indicates that the fit is suspect.

What you're trying to do strikes me as very tricky statistically. Although trying to read off the cumulative distribution function from the data is tricky -- real data comes with lumps and gaps that don't mean much usually -- you could fit any number of loosely plausible skewed distributions to data like mpg and get different answers.

But I've never wanted any version of your problem statistically, so I won't try to lay down precepts on what will work best.
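
For reference, reading the probability straight off the data, with no distribution fitted at all, might look like the sketch below. The bin limits 20 and 21 and the variable name inbin are arbitrary choices for illustration.

```stata
sysuse auto, clear

// Indicator for mpg falling in [20, 21); the bin limits are arbitrary
generate byte inbin = mpg >= 20 & mpg < 21

// The mean of a 0/1 indicator is the observed proportion in the bin
summarize inbin, meanonly
display "Empirical P(20 <= mpg < 21) = " r(mean)
```

With lumpy data, neighbouring bins can give quite different proportions, which is the trickiness mentioned above.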
Comment
John Hanser

Join Date: Jan 2018

Posts: 24
#5

01 Dec 2022, 10:17

Thanks, Nick! Note that the auto data was just an example. My actual data fits substantially better.

Hope that someone might have a suggestion...
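
One possible route to the probabilities asked about in #3, offered as a sketch under stated assumptions rather than a tested recipe: if the fitted distribution is written in the direct parametrization with location xi, scale omega and shape alpha, the skew-normal density at x is (2/omega) * normalden(z) * normal(alpha*z) with z = (x - xi)/omega, and the distribution function can be written as 2 * binormal(z, 0, -delta) with delta = alpha/sqrt(1 + alpha^2). Both need only built-in Stata functions. The three parameter values below are made-up placeholders, not estimates from any model; they would have to be replaced by direct-parameter estimates taken from the skewnreg output. How skewnreg labels or stores those estimates is not assumed here.

```stata
clear
set obs 101
generate x = _n - 1                       // grid 0, 1, ..., 100

// Placeholder direct parameters (hypothetical values, NOT fitted estimates)
scalar sn_xi    = 15                      // location
scalar sn_omega = 8                       // scale
scalar sn_alpha = 3                       // shape
scalar sn_delta = sn_alpha / sqrt(1 + sn_alpha^2)

// Standardized grid value and its counterpart one unit higher
generate double z  = (x - sn_xi) / sn_omega
generate double z1 = (x + 1 - sn_xi) / sn_omega

// Skew-normal density at each grid point
generate double dens = (2 / sn_omega) * normalden(z) * normal(sn_alpha * z)

// Skew-normal CDF via the bivariate normal CDF
generate double cdf = 2 * binormal(z, 0, -sn_delta)

// Probability of a draw between x and x + 1
generate double p_bin = 2 * binormal(z1, 0, -sn_delta) - cdf

list x dens cdf p_bin in 1/10
```

A quick sanity check on the CDF formula: with shape 0, delta is 0 and 2 * binormal(z, 0, 0) reduces to normal(z), the ordinary normal distribution function. Note also that the skew-normal has unbounded support, so the p_bin values over 0 to 100 need not sum exactly to 1.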