How to find data points lying above regression line within distance less than max[1SE, 1mm]?

Akb Bk

Join Date: Dec 2018

Posts: 3
#1

18 Dec 2018, 11:12

Hello Statalists,

I have data points (units in mm) and have run a regression on them. Now I need to find the data points whose distance above the regression line is less than a certain criterion, i.e. max[1SE, 1mm], where max[..] is the maximum of the values in brackets and SE is the standard error of the regression, defined as the square root of the residual mean square. As I am a novice with Stata, I am not very familiar with its commands, so if anyone can help me write the code for this, it would be a big help.

Thanks in advance!
Tags: graph, regression
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

18 Dec 2018, 11:48

So it sounds like you want to do something like this:

Code:

* run your regression here
predict resid, resid
gen byte wanted = (resid < max(e(rmse), 1))
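One point to check against the original question: a condition of the form resid < threshold is also true for every point below the line, because negative residuals are always smaller than a positive threshold. If only points above the line are wanted, a lower bound of zero can be added. The following is a minimal sketch, not the code from this thread. It assumes Stata's shipped auto dataset, with price and weight as stand-ins for the real outcome and predictor, and the variable name above_close is made up for illustration.

```stata
* Illustrative only: auto, price and weight are stand-ins for the real data
sysuse auto, clear
regress price weight

* Residual = observed minus fitted; positive means the point lies above the line
predict double resid, residuals

* 1 if the point is above the line and closer than max(1 SE, 1 unit), else 0;
* left missing where the residual itself is missing
gen byte above_close = (resid > 0 & resid < max(e(rmse), 1)) if !missing(resid)

tabulate above_close
```

The 1 in max() is in the units of the outcome variable, so it corresponds to 1 mm only when the outcome is measured in mm.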
Akb Bk

Join Date: Dec 2018

Posts: 3
#3

18 Dec 2018, 12:05

Thanks Clyde! Can you please explain the command so that I understand the purpose of byte wanted? Also, does max(e(rmse), 1) include both conditions, i.e. (i) 1 standard error and (ii) a distance of 1 mm from the regression line?

Thanks in advance!
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

18 Dec 2018, 13:38

You did not specify in your original question how you wanted to "find" the points you wanted. So I chose to respond by creating a new variable, which I gave the name "wanted" which would be 1 for those points that met your conditions, and 0 for those that did not. The mention of "byte" in the command is completely optional and just tells Stata to only use 1 byte of storage to hold that variable. You could omit that, and the only difference is that Stata would use more memory--but not enough that you would notice the difference. It's just an old habit of mine: I started computer programming a long time ago when memory was scarce and very expensive, so I formed habits of minimizing the amount of it I used. In the modern world this sort of thing isn't really necessary.
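To make "find" concrete, an indicator variable like this can be used to count or list the flagged observations. This is a minimal sketch under assumptions: the auto dataset and the variables make, price and weight are stand-ins and do not come from the thread. It also leaves out byte to show that the command works the same way without it.

```stata
* Illustrative only: auto, make, price and weight are stand-ins for the real data
sysuse auto, clear
regress price weight
predict resid, residuals

* Without "byte", Stata uses its default storage type (float unless changed)
gen wanted = (resid < max(e(rmse), 1))

* How many points meet the condition, and which ones
count if wanted == 1
list make price resid if wanted == 1

* Shows the storage type actually used for the new variable
describe wanted
```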

And yes, the condition max(e(rmse), 1) includes both conditions. 1 speaks for itself. e(rmse) is where the root mean squared error is found after a regression.
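If it helps to see the numbers, the threshold can be displayed right after the regression. This is a minimal sketch, again assuming the auto dataset with price and weight as stand-in variables. The last line recomputes the square root of the residual mean square from the stored results e(rss) and e(df_r), which should agree with e(rmse).

```stata
* Illustrative only: auto, price and weight are stand-ins for the real data
sysuse auto, clear
regress price weight

* Root mean squared error stored by regress
display "RMSE           = " e(rmse)

* The larger of 1 SE and 1 unit of the outcome variable
display "threshold      = " max(e(rmse), 1)

* Cross-check: square root of (residual sum of squares / residual df)
display "sqrt(RSS/df_r) = " sqrt(e(rss)/e(df_r))
```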
Akb Bk

Join Date: Dec 2018

Posts: 3
#5

18 Dec 2018, 23:08

Got it and it worked! Thank you very much Clyde. It's been a big help as I have been stuck with this for so long!
Two more things that I would like to know: if I need to change it from 1 standard error to 2 standard errors, would it be written as: gen byte wanted = (lp < max(e(rmse2), 1))? If not, what would be the correct way to write it?

Also, can you tell me how I can find the slope of the regression line?

Thanks in advance!
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#6

18 Dec 2018, 23:19

Code:

gen byte wanted = (lp < max(2*e(rmse), 1))
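A note on why the e(rmse2) version is risky rather than just wrong: regress does not store a result called e(rmse2), and as far as I know a reference to a nonexistent e() scalar evaluates to missing, which max() then ignores. If so, the threshold would silently become 1 with no error message. The following is a minimal sketch for checking this, assuming the auto dataset with price and weight as stand-ins. The local macro k is only a convenience for changing the multiplier, and the residual variable is named lp to match the command above.

```stata
* Illustrative only: auto, price and weight are stand-ins for the real data
sysuse auto, clear
regress price weight

* Compare the two thresholds; e(rmse2) is not a result stored by regress
display "with e(rmse2)  : " max(e(rmse2), 1)
display "with 2*e(rmse) : " max(2*e(rmse), 1)

* Multiplier kept in a local macro so it is easy to change
local k = 2
predict double lp, residuals
gen byte wanted = (lp < max(`k'*e(rmse), 1))
```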

"Also, can you tell me how I can find the slope of the regression line?"

Run -help regress-. Near the top of the window that opens there is a link "(View complete PDF manual entry)." Click on that to open the PDF manual section on the -regress- command. Click on the Remarks and Examples link and read the first example. That will explain how to read the output you get.
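Besides reading it off the output table, the slope can also be retrieved in code after the regression. This is a minimal sketch, assuming the auto dataset with price as the outcome and weight as the single predictor (both are stand-ins). The slope is the coefficient on the predictor and _cons is the intercept.

```stata
* Illustrative only: auto, price and weight are stand-ins for the real data
sysuse auto, clear
regress price weight

* Coefficient on the predictor = slope of the fitted line
display "slope     = " _b[weight]

* Constant term = intercept of the fitted line
display "intercept = " _b[_cons]

* Standard error of the slope estimate
display "SE(slope) = " _se[weight]
```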