  • How to find data points lying above regression line within distance less than max[1SE, 1mm]?


    Hello Statalists,

    I have data points (units in mm) and have run a regression on them. Now I need to find the data points whose distance above the regression line is less than a certain criterion, namely max[1SE, 1mm]. Here max[..] is the maximum of the values in brackets, and SE is the standard error of the regression, defined as the square root of the residual mean square. I am new to Stata and not very familiar with its commands, so any help with writing the code for this would be much appreciated.

    Thanks in advance!

  • #2
    So it sounds like you want to do something like this:

    Code:
    * run your regression here
    predict resid, residuals
    * flag points above the line (resid > 0) that lie closer than max(1 SE, 1 mm)
    gen byte wanted = (resid > 0 & resid < max(e(rmse), 1))


    • #3
      Thanks, Clyde! Could you please explain the command so that I understand the purpose of "byte wanted"? Also, does max(e(rmse), 1) cover both conditions, i.e., (i) 1 standard error and (ii) a distance of 1 mm from the regression line?

      Thanks in advance!


      • #4
        You did not specify in your original question how you wanted to "find" the points. So I chose to respond by creating a new variable, named "wanted", which is 1 for the points that meet your conditions and 0 for those that do not. The "byte" in the command is completely optional and just tells Stata to use only 1 byte of storage per observation for that variable. If you omit it, Stata stores the variable as a float (4 bytes by default), so the only difference is a little more memory, and not enough that you would notice. It's just an old habit of mine: I started computer programming a long time ago, when memory was scarce and very expensive, so I formed habits of minimizing the amount of it I used. In the modern world this sort of thing isn't really necessary.

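        And yes, max(e(rmse), 1) covers both conditions: max() returns the larger of its two arguments. The 1 speaks for itself (it is 1 mm, provided your outcome variable is measured in mm). e(rmse) is where Stata stores the root mean squared error after a regression, which is the square root of the residual mean square, i.e. the SE you defined.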
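
        For readers who want to see the whole sequence in one place, here is a minimal sketch. It assumes Stata's bundled auto dataset as a stand-in for the real data, with price and weight as placeholders for the outcome (in mm) and the predictor, and it takes "distance" to mean vertical distance, i.e. the residual. The scalar name cutoff is also just an illustrative choice.

        ```stata
        * Assumption: the auto dataset and the variables price/weight are
        * placeholders for your own data, not part of the original question.
        sysuse auto, clear
        regress price weight

        * Residual = observed minus fitted value (vertical distance from the line)
        predict double resid, residuals

        * e(rmse) is the root MSE stored by regress; max() picks the larger cutoff
        scalar cutoff = max(e(rmse), 1)

        * 1 if the point is above the line and closer than the cutoff, 0 otherwise
        gen byte wanted = (resid > 0 & resid < scalar(cutoff))

        * Two ways to "find" the flagged points: count them or list them
        count if wanted
        list make price weight resid if wanted
        ```

        Requiring resid > 0 is what restricts the flag to points above the line. Observations with a missing residual end up with wanted equal to 0, because Stata treats missing values as larger than any number.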


        • #5
          Got it, and it worked! Thank you very much, Clyde. It's been a big help, as I had been stuck on this for a long time.

          Two more questions. If I need 2 standard errors instead of 1, would the command be: gen byte wanted = (lp < max(e(rmse2), 1)) ? If not, what is the correct way to write it?

          Also, how can I find the slope of the regression line?

          Thanks in advance!


          • #6
            Code:
            gen byte wanted = (lp < max(2*e(rmse), 1))
            "Also, can you tell me that how can I find the slope of regression line."
            Run -help regress-. Near the top of the window that opens there is a link, "(View complete PDF manual entry)". Click on it to open the PDF manual section on the -regress- command. Then click on the Remarks and Examples link and read the first example. That will explain how to read the output you get. The slope is the coefficient reported for your predictor variable.
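
            If you would rather pull the slope out programmatically than read it off the table, something like the following should work. This is only a sketch: the auto dataset and the variables price and weight are assumed placeholders for your own data.

            ```stata
            * Assumption: auto dataset and price/weight stand in for your data
            sysuse auto, clear
            regress price weight

            * After regress, _b[varname] holds the coefficient on varname
            display "slope     = " _b[weight]
            display "intercept = " _b[_cons]
            ```

            The same values are stored in the matrix e(b), which you can inspect with -matrix list e(b)-.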
