  • How to get the predicted value of each group?

    Dear all,

    I want to get the predicted values from a separate regression for each group. Here is an example.

    Code:
    webuse grunfeld,clear
    
    forval i = 1/10{
        reg invest kstock if company == `i'
        predict y`i' if company == `i'
        predict e`i' if company == `i',res
    
    }
    egen y = rowtotal(y1-y10)
    egen e = rowtotal(e1-e10)
    y and e are what I want. But this is just a simple example. In my real data, I have 12000 companies.

    Does anyone have a simpler method to get the predicted y and e for each group?




    Best regards.

    Raymond Zhang
    Stata 17.0,MP

  • #2
    If you just want to avoid having so many new variables, this will do:

    Code:
    webuse grunfeld,clear
    
    gen y = .
    gen e = .
    
    forval i = 1/10{
        reg invest kstock if company == `i'
        predict _y if company == `i'
        predict _e if company == `i', res
        replace y = _y if company == `i'
        replace e = _e if company == `i'
        drop _y _e
    }
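
    With many companies, two details of this loop start to matter: the identifiers may not run from 1 upward without gaps, and each if condition is evaluated against the whole dataset on every pass. The following is a minimal sketch of the same idea under those assumptions, not a tested solution for the 12000-company data. The names gid and xb are illustrative and do not appear in the original posts, and a group whose regression fails is simply skipped.

    ```stata
    webuse grunfeld, clear

    * consecutive group index 1..G, whatever the original identifiers look like
    egen long gid = group(company)
    quietly summarize gid, meanonly
    local G = r(max)

    gen double y = .
    gen double e = .
    tempvar xb

    forvalues g = 1/`G' {
        * skip groups where the regression cannot be estimated
        capture quietly regress invest kstock if gid == `g'
        if (_rc) continue
        quietly predict double `xb' if e(sample), xb
        quietly replace y = `xb' if e(sample)
        quietly replace e = invest - `xb' if e(sample)
        drop `xb'
    }
    ```

    Using e(sample) keeps the predictions restricted to the observations actually used in each regression.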


    • #3
      Originally posted by Hemanshu Kumar
      If you just want to avoid having so many new variables, this will do:

      Code:
      webuse grunfeld,clear
      
      gen y = .
      gen e = .
      
      forval i = 1/10{
          reg invest kstock if company == `i'
          predict _y if company == `i'
          predict _e if company == `i', res
          replace y = _y if company == `i'
          replace e = _e if company == `i'
          drop _y _e
      }
      Thanks so much. Do you have a faster way to solve this problem? It is very slow to run each regression using `forvalues`.
      Best regards.

      Raymond Zhang
      Stata 17.0,MP


      • #4
        Code:
        reg invest i.company i.company#c.kstock
        predict y_jr
        predict e_jr, res
        Interacting all rhs variables with the company identifier (i.company#) combines all company-specific regressions into one. However, I am not sure this will work in your application (thousands of companies), because it results in a huge number of rhs variables.
        Best wishes,
        Harald
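
        A quick way to see that the fully interacted model reproduces the separate regressions is to compare fitted values for one company. This is only a sketch on the grunfeld example; y_pooled and y_single are illustrative names. The point estimates and fitted values agree, but the standard errors will generally differ, because the pooled regression estimates a single error variance for all companies.

        ```stata
        webuse grunfeld, clear

        * one regression with a separate intercept and slope for every company
        quietly regress invest i.company i.company#c.kstock
        predict double y_pooled, xb

        * the stand-alone regression for company 1
        quietly regress invest kstock if company == 1
        predict double y_single if company == 1, xb

        * fitted values should agree up to rounding error
        assert reldif(y_pooled, y_single) < 1e-6 if company == 1
        ```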


        • #5
          On #2 and #3: for the specific problem stated, this avoids an explicit loop by using rangestat from SSC.


          Code:
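          * rangestat is community-contributed: ssc install rangestat
          webuse grunfeld, clear

          * interval(company 0 0) restricts each regression to observations with the same company value
          rangestat (reg) invest kstock, interval(company 0 0)

          * fitted values and residuals from the stored coefficients
          gen double predicted = b_cons + b_kstock * kstock
          gen double residual = invest - predicted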
          But rangestat doesn't extend to more complicated regressions; its implementation of regression is only a token one, and its main focus lies elsewhere.

          asreg (also SSC) is specifically for regressions. I haven't tried using it.
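
          For reference, a sketch of what the asreg route might look like on the same example. It assumes the fitted option behaves as described in the asreg help file, where the fitted values and residuals are returned in variables named _fitted and _residuals; check the help for the installed version before relying on those names.

          ```stata
          * asreg is community-contributed: ssc install asreg
          webuse grunfeld, clear

          * one regression per company; fitted asks for fitted values and residuals
          * (assumed output variable names: _fitted and _residuals)
          bysort company: asreg invest kstock, fitted
          ```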

          statsby is an important official command for this kind of task, while runby from SSC is reportedly often faster.
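
          A minimal sketch of both routes on the grunfeld example, under the assumption that a plain regress per company is all that is needed. The variable names y_sb, e_sb, y_rb, e_rb and the program name one_company are illustrative. With statsby, the coefficients are saved per company and merged back; with runby, a small program is run on each company's data in turn and whatever it leaves in memory is stacked.

          ```stata
          webuse grunfeld, clear

          * (a) statsby: one row of coefficients per company, saved to a temporary file
          tempfile coefs
          statsby _b, by(company) saving(`coefs', replace): regress invest kstock
          merge m:1 company using `coefs', nogenerate
          gen double y_sb = _b_cons + _b_kstock * kstock
          gen double e_sb = invest - y_sb

          * (b) runby (ssc install runby): the program sees one company at a time
          capture program drop one_company
          program define one_company
              regress invest kstock
              predict double y_rb, xb
              predict double e_rb, residuals
          end
          runby one_company, by(company)
          ```

          If the program fails for a company, runby leaves that company out of the result, so it is worth checking the number of observations afterwards.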

          When people complain that doing many, many regressions is slow they are always right given their expectations. But whether getting hundreds or thousands of regressions is a good strategy anywhere is not so clear. What are you going to do with the results? The question is simple, but the answer may not be, particularly if each regression is based on a small sample.
