  • Store intercept and coefficient of the regression then use it to generate new variable

    Dear all
    I have a panel dataset as follows:

    ID: a a a a b b b b c c c
    time: 1 2 3 4 1 2 3 4 1 2 3
    x value: 5 6 7 8 1 2 3 4 3 4 5
    y value: 9 11 12 17 7 8 9 10 7 6 5 4

    First, I need to run a regression for each ID, with x as the independent variable and y as the dependent variable. Then I need to generate the residuals, i.e. the difference between the y estimated by the regression and the actual y.
    Is there any way to do this without manually typing in the intercepts and coefficients?
    Thank you in advance.

  • #2
    See [R] predict as well as [R] regress postestimation. You can start with:

    Code:
    help predict
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      Thanks Carole.
      Predict seems to work, but it only applies to the last regression that was run. My sample has many IDs and I need to run the regression for each ID. Running predict after each regression by hand would be time-consuming, so I would like to know whether there is another solution.



      • #4
        Predict seems to work, but it only applies to the last regression that was run.
        I'm guessing that you've done this as

        Code:
        by id, sort: regress y x
        predict resid, resid
        which would, as you note, only give you residuals for the last regression. But you can do it this way:

        Code:
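        * collect the distinct values of id in the local macro ids
        levelsof id, local(ids)
        gen residual = .
        foreach j of local ids {
            * fit the regression for one id at a time (id is assumed to be a string variable)
            regress y x if id == `"`j'"'
            * predict uses the regression just run, so copy its residuals before moving on
            predict temp, resid
            replace residual = temp if id == `"`j'"'
            drop temp
        }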
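
        If you also want to keep the intercept and slope themselves, as the thread title asks, a minimal sketch along the same lines might look like this. It assumes id is a string variable, as in the loop above. The names b_cons, b_x, yhat and my_resid are made up for illustration, and the toy data are just the rows for IDs a and b from post #1.

        Code:
        clear
        input str1 id x y
        "a" 5 9
        "a" 6 11
        "a" 7 12
        "a" 8 17
        "b" 1 7
        "b" 2 8
        "b" 3 9
        "b" 4 10
        end

        * hypothetical variables that will hold each ID's estimates
        gen double b_cons = .
        gen double b_x = .

        levelsof id, local(ids)
        foreach j of local ids {
            quietly regress y x if id == `"`j'"'
            * _b[] refers to the regression just run, so copy it before the next pass
            quietly replace b_cons = _b[_cons] if id == `"`j'"'
            quietly replace b_x = _b[x] if id == `"`j'"'
        }

        * rebuild the fitted values and residuals from the stored estimates
        gen double yhat = b_cons + b_x*x
        gen double my_resid = y - yhat
        list id x y b_cons b_x yhat my_resid, sepby(id)

        Note that this residual is y minus the fitted value, which is the convention predict uses; flip the sign if you want estimated minus actual as described in post #1.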



        • #5
          Thank you Clyde for your kind help.
          I tried your code and it produced results, but they differ from my own calculation. Also, after the results appeared, my computer kept running the code indefinitely and I couldn't enter any further commands (this was only a trial with 5 IDs and 720 observations). I don't use loops very often in Stata, so I don't quite understand why this happens.



          • #6
            Mia, you can double-check your math and see how Stata does its calculations with the following (ignore IDs for the moment, though you can generalize this):

            Code:
            reg y x
            predict stata_residual, res
            predict stata_yhat, xb
            
            gen my_yhat= _b[_cons] + _b[x]*x
            gen my_residual= y-my_yhat
            
            list stata_yhat my_yhat stata_residual my_residual in 1/10
            You can compare these to the way that you have calculated your own residuals. There should be no difference in my_residual and stata_residual (beyond rounding).
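
            As a concrete check, here is a small sketch that uses only the four rows for ID a from post #1, typed in by hand. The variable names follow the code above. Worked out by hand, the fit for these rows is y = -4 + 2.5*x, so the residuals should be 0.5, 0, -1.5 and 1.

            Code:
            clear
            input x y
            5 9
            6 11
            7 12
            8 17
            end

            regress y x
            predict double stata_residual, residuals

            * same residual rebuilt from the stored intercept and slope
            gen double my_residual = y - (_b[_cons] + _b[x]*x)

            list x y stata_residual my_residual
            * stops with an error if the two versions differ by more than rounding
            assert abs(stata_residual - my_residual) < 1e-8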
            Stata/MP 14.1 (64-bit x86-64)
            Revision 19 May 2016
            Win 8.1



            • #7
              I got it now. Thank you very much Carole and Clyde.



              • #8

                Hi, Mr. Clyde Schechter
                I am a little confused about the meaning of names like "id", "local ids" and "j" in the code below.

                levelsof id, local(ids)
                gen residual = .
                foreach j of local ids {
                    regress y x if id == `"`j'"'
                    predict temp, resid
                    replace residual = temp if id == `"`j'"'
                    drop temp
                }


                Could you please help me in this regard,
                Thanks

                Moreover, I have panel data similar to Mia Pham's, i.e.
                ID
                time
                x values
                y values

                .....
                Best regards
                Last edited by sai bing; 29 Dec 2018, 08:16.



                • #9
                  I used id as the name of the ID variable. Perhaps in her real data it's in upper case. Code needs to be adapted to the specific data set; since example data was not given, I made some assumptions about what variable names are.

                  Read -help levelsof- and you will see that it contains an option, -local()- that permits you to specify the name of a local macro you want to create that will hold the list of levels. You can choose whatever name you like for that. I chose to call it ids because it would contain a list of values of variable id.

                  Read -help foreach- and the corresponding section of the PDF manuals (there will be a link in blue near the top of the -help foreach- window). There you will find an explanation of how loops are done in Stata; it is too lengthy to describe here.
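
                  To see the two pieces at work on something small, here is a sketch using the auto dataset that ships with Stata. The macro name levels and the loop variable j are arbitrary choices. Note that foreign is numeric, so the comparison needs no quotes; the `"`j'"' form in #4 is what you would use when the ID variable is a string.

                  Code:
                  sysuse auto, clear

                  * levelsof stores the distinct values of foreign (0 and 1) in the local macro levels
                  levelsof foreign, local(levels)

                  * foreach then puts each value in turn into the local macro j
                  foreach j of local levels {
                      display "now working on foreign == `j'"
                      count if foreign == `j'
                  }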
