  • Looping Regression with Output

    Hi, I believe that a solution to my problem could help many Stata users.

    I have one dependent variable A, 100 independent variables B1, B2, ..., B100, and 10 control variables C1, C2, ..., C10. Please note that the actual names of the independent and control variables are arbitrary and follow no particular order.

    First, I want to run a looping regression in which every regression uses the same dependent variable and the same control variables (C1 to C10), but the 100 independent variables enter individually, one per regression (100 regressions in total).

    Second, I need to save the results (coefficient, t-statistic, and p-value) for only the independent variable from each regression in some output format (Word or Excel), maybe using outreg2, but I am not sure.

    Third (similar to the first), I want to run another looping regression with the same dependent variable and the same control variables, but with 5 independent variables in each regression instead of one: the first regression includes the first 5 independent variables, the second includes the next 5, and so on, with no change in the dependent and control variables. Finally, I want to save the coefficients for the independent variables in some output format.

    Thanks for the help.

  • #2
    Here are some ways to set up the loops that run the regressions. I will leave it to others to suggest ways to store the results. I might suggest writing the output in comma- or tab-delimited format so that it can easily be read into a spreadsheet. Given the number of models, it will be difficult to paginate in a Word document. You will need to decide whether you want a horizontal or vertical layout for your model information.


    Code:
    *list your independent variables in a local macro
    *since you say they are in no particular order and the name is random, 
    *I think you have no choice but to list them all out:
    
    local ivs B1 B2 B3 B4 B5 ... B100
    
    *list your control variables in a local macro:
    
    local cvs C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
    
    ***Set 1: One IV at a time
    
    foreach IV of varlist ivs {
         reg A `cvs' `IV'
         *here's where you would save coefficients, etc
       }
    
    ***Set 2: First 5 IVs, Second 5 IVs, etc
    
    forvalues i=1(5)100 {
        forvalues j=1/5 {
             local v`j' : word `i' of `IV'
             local ++i
            }
    
         reg A `cvs' `v1' `v2' `v3' `v'4 `v5'
         *here's where you would save coefficients, etc
       }
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      So here's how I would do this for Set 1. I would create a temporary file to hold the results and post to it each time through the loop. After the regression, the coefficients are in _b[]. You can calculate t and p directly, but it is simpler to use the -test- command. -test- returns the p-value and an F statistic, and that F is just the square of t, so its square root gives the absolute value of t. So the code looks like this (the parts I have added to Carole's code are the -postfile- setup, the -test- and -post- lines, and -postclose-):

      Code:
      *list your independent variables in a local macro
      *since you say they are in no particular order and the name is random, 
      *I think you have no choice but to list them all out:
      
      local ivs B1 B2 B3 B4 B5 ... B100
      
      *list your control variables in a local macro:
      
      local cvs C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
      
      // SET UP THE TEMPORARY FILE
      tempfile coefficients
      capture postutil clear
      postfile handle str32 IV float coeff t p using `coefficients'
      
      ***Set 1: One IV at a time
      
      foreach IV of varlist `ivs' {
          reg A `cvs' `IV'
          *here's where you would save coefficients, etc
          test `IV'
          post handle ("`IV'") (_b[`IV']) (sqrt(r(F))) (r(p))
      }
      postclose handle
      The approach to set 2 is analogous, except that you have more coefficients, t's, and p's to store. I would set up a new, separate tempfile for that purpose, but the code is entirely similar, just with 5 variables instead of 1.

      After all the regressions have run, then you can -use `coefficients'- and -save- it as a regular Stata data file, or export to Excel, or -list- it, or whatever you want to do with it.
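
      One caveat about the posted t: sqrt(r(F)) is never negative, so it records the magnitude of t but not its sign. If the sign matters, a possible variant computes t directly as _b[]/_se[] and takes the two-sided p-value from ttail(). The following is only a sketch under stated assumptions: it uses Stata's shipped auto dataset, with price, weight, length, mpg, headroom, and trunk standing in for A, the controls, and the independent variables, none of which come from the original question.

      ```stata
      * Sketch only: the auto dataset and its variables are stand-ins (assumptions),
      * not the original poster's A, B*, and C* variables.
      sysuse auto, clear
      local cvs weight length          // stand-in control variables
      local ivs mpg headroom trunk     // stand-in independent variables

      tempfile coefficients
      capture postutil clear
      postfile handle str32 IV float coeff t p using `coefficients'

      foreach IV of varlist `ivs' {
          quietly regress price `cvs' `IV'
          * signed t = coefficient / standard error;
          * two-sided p-value from the t distribution with e(df_r) residual df
          post handle ("`IV'") (_b[`IV']) (_b[`IV']/_se[`IV']) ///
              (2*ttail(e(df_r), abs(_b[`IV']/_se[`IV'])))
      }
      postclose handle

      * load the posted results: one row per independent variable
      use `coefficients', clear
      list, clean
      ```

      The p-values should agree with those from -test-; only the sign of t is new information.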



      • #4
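        Yes, there were a few typos. In fact, Set 1 should not have worked with that code either (I am referring to the code you re-post in #5). The corrected code is below; it differs from the original in three places: `ivs' in the -foreach- line, `ivs' in the -word ... of- line, and `v4' in the -reg- line of Set 2: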

        Code:
        local ivs B1 B2 B3 B4 B5 ... B100
        
        *list your control variables in a local macro:
        
        local cvs C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
        
        ***Set 1: One IV at a time
        
        foreach IV of varlist `ivs' {
            reg A `cvs' `IV'
            *here's where you would save coefficients, etc
        }
        
        ***Set 2: First 5 IVs, Second 5 IVs, etc
        
        forvalues i=1(5)100 {
            forvalues j=1/5 {
                local v`j' : word `i' of `ivs'
                local ++i
            }
        
            reg A `cvs' `v1' `v2' `v3' `v4' `v5'
            *here's where you would save coefficients, etc
        }



        • #5
          So Set 2 is really just like Set 1, except that you need to loop over the five independent variables inside a loop over the IVs that advances 5 variables at a time.

          Code:
          // LOCAL MACROS ivs AND cvs DEFINED AS BEFORE
          // IF RUNNING IN SEPARATE DO-FILE, REPEAT THE DEFINITIONS HERE
          // IF RUNNING IN SAME FILE AS SET 1, NO NEED TO RE-DEFINE THEM
          
          // SET UP THE TEMPORARY FILE
          capture postutil clear
          tempfile coefficients5
          postfile handle5 str32 IV float coeff t p using `coefficients5'
          
          ***Set 2: Five IVs at a time
          
          forvalues i = 1(5)96 {
              // BUILD LIST OF 5 CONSECUTIVE IVS IN LOCAL MACRO PREDICTORS
              local predictors
              forvalues j = 0/4 {
                  local predictors `predictors' `:word `=`i'+`j'' of `ivs''
              }
                  
              reg A `cvs' `predictors'
              *here's where you would save coefficients, etc
              //    NOW TEST EACH PREDICTOR AND POST ITS RESULTS
              foreach p of local predictors {
                  test `p'
                  post handle5 ("`p'") (_b[`p']) (sqrt(r(F))) (r(p))
              }
          }
           postclose handle5
          When this is done, temporary file `coefficients5' will contain the results for all 100 variables, having been entered in batches of five. You can -use- it, or -save- it or whatever you want to do with it.
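
          Because a tempfile is erased when the do-file finishes, the -use- and -save- or export steps need to run in the same do-file as the loop. A minimal sketch of that last step follows, assuming Stata's shipped auto dataset as a stand-in for the real data and hypothetical output names (results_set2, results_set2.xlsx, results_set2.csv) written to the current working directory:

          ```stata
          * Sketch only: the auto dataset, its variables, and the output file names
          * are stand-ins (assumptions), not part of the original question.
          sysuse auto, clear
          capture postutil clear
          tempfile coefficients5
          postfile handle5 str32 IV float coeff t p using `coefficients5'

          * one small regression, just to have something to post
          quietly regress price weight length mpg headroom
          foreach p in mpg headroom {
              quietly test `p'
              post handle5 ("`p'") (_b[`p']) (sqrt(r(F))) (r(p))
          }
          postclose handle5

          * load the posted results and write them out in permanent form
          use `coefficients5', clear
          save results_set2, replace                                            // Stata .dta file
          export excel using "results_set2.xlsx", firstrow(variables) replace   // Excel workbook
          export delimited using "results_set2.csv", replace                    // comma-delimited text
          ```

          The Excel or delimited file can then be opened in a spreadsheet or pasted into a Word table.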



          • #6
            Hi everyone,

            this is a really helpful solution to a problem I had today.
            I just wonder how I can do the loop regression for example for 2 or 3 independent variables. Clyde Schechter and
            Carole J. Wilson have described the solution for 5. What should be changed so I can do it for 2 or 3?

            Any help will be highly appreciated! Thanks a lot!

            Regards,
            Filip



            • #7
              To be specific, let's say you want to do it for three independent variables. Change -forvalues i = 1(5)96- to -forvalues i = 1(3)96-. And change -forvalues j = 0/4- to -forvalues j = 0/2-.

              Optionally, to keep the code clear for human understanding I would change coefficients5 and handle5 to coefficients3 and handle3--but these are cosmetic changes that do not affect what the code actually does.

              All of that said, it is generally a bad idea to copy code from some place and then run it before you understand how it works and what it does. The fact that you asked this question tells me that you don't really grasp how the code works. So, before you crank out results that may or may not really be what you want, and that you will not be able to explain or defend if anybody has questions, invest time in reading this code over and understanding what every term in it means and does. And make sure you understand why I suggested these particular changes. Then look it over carefully and see if there are perhaps other differences between your situation and the one that Carole and I responded to previously that might warrant making other changes as well.
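
              One such difference is that 100 is not a multiple of 3: batches of three starting at 1, 4, ..., 94 end at the 96th variable, so the last four variables would never enter a regression. A possible generalization, offered as a sketch rather than a definitive version, keeps the batch size in a local macro and lets the final batch be shorter. It assumes Stata's shipped auto dataset as stand-in data, and the names k, n, last, coefficientsk, and handlek are hypothetical.

              ```stata
              * Sketch only: the auto dataset and its variables are stand-ins (assumptions).
              sysuse auto, clear
              local cvs weight length
              local ivs mpg headroom trunk turn displacement gear_ratio foreign   // 7 stand-in IVs
              local k 3                          // batch size (hypothetical name)
              local n : word count `ivs'         // number of independent variables

              capture postutil clear
              tempfile coefficientsk
              postfile handlek str32 IV float coeff t p using `coefficientsk'

              forvalues i = 1(`k')`n' {
                  * last position in this batch; the final batch may hold fewer than k
                  local last = min(`i' + `k' - 1, `n')
                  local predictors
                  forvalues j = `i'/`last' {
                      local predictors `predictors' `:word `j' of `ivs''
                  }

                  quietly regress price `cvs' `predictors'
                  foreach p of local predictors {
                      quietly test `p'
                      post handlek ("`p'") (_b[`p']) (sqrt(r(F))) (r(p))
                  }
              }
              postclose handlek

              * one row per independent variable, in batches of 3, 3, and 1
              use `coefficientsk', clear
              list, clean
              ```

              Whether a shorter final batch is acceptable depends on the analysis; the alternative is to decide explicitly which variables to drop or regroup.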



              • #8
                Thank you very much! Also thank you for your advice! In fact I'm a beginner and need some more time to get into Stata.
