  • Using foreach in regression

    Hello,

    I wish to run a few regressions over the same set of control variables and dependent variables.

    Independent Variables : var1 var2 var3
    Dependent Variables: var4 var5 var6

    I have to run 3 specifications as follows:

    Specification 1 : regress var4 var1 var2 var3 (Similarly with var5 and var6 as dependent variables too)
    Specification 2: regress var4 var1 var2 (Similarly with var5 and var6 as dependent variables too)
    Specification 3: regress var4 var1 var3 (Similarly with var5 and var6 as dependent variables too)


    I tried something like this for Specification 1 :

    local regressors1 var1 var2 var3
    global regressant var4 var5 var6

    foreach y of global regressant and foreach x of local regressor1 {
    reg `y' `x'
    }


    I also tried nesting the second foreach inside the first one. Neither attempt worked.

    Can you please tell me how to make two lists (macros) of X and Y variables? Also, how do I use outreg2 within a foreach loop? I need to keep appending results.

  • #2


    Code:
    foreach y in var4 var5 var6 {
           regress `y' var1 var2 var3
           regress `y' var1 var2
           regress `y' var1 var3
    }
    I can't advise on outreg2 (SSC, as you are asked to explain). I don't ever use it.



    • #3
      So basically there is no other way but to write the independent variables? A macro such as global or local cannot be used?



      • #4
        Clearly you can use macros here.

        Here is a silly use of macros, for example:

        Code:
        local x1 var4 
        local x2 var5
        local x3 var6 
        with later code referring to such macros.

        Here is another one

        Code:
        local X1 var4 var5 
        local X2 var4 var6 
        local X3 var4 var5 var6
        with later code referring to such macros.

        If you can come up with shorter code, congratulations! Bear in mind, as above, that every macro you define requires one statement.

        The value of using macros is to make your code shorter or simpler, and preferably both. I doubt that any further use of macros would do that here, but I am open to refutation.



        • #5
          You should be able to use a double loop. Something like:

          Code:
          local c c1 c2 c3
          foreach y in y1 y2 y3 {
              foreach x in x1 x2 x3 {
                  reg `y' `x' `c'
              }
          }
          The local macro c defines the control variables.



          • #6
            Brad: Referring back to #1

            Specification 1 : regress var4 var1 var2 var3 (Similarly with var5 and var6 as dependent variables too)
            Specification 2: regress var4 var1 var2 (Similarly with var5 and var6 as dependent variables too)
            Specification 3: regress var4 var1 var3 (Similarly with var5 and var6 as dependent variables too)
            your code doesn't deliver any of those as special cases. Leaving out the c's (I don't see any mentioned, but you are clearly right that there might be some in other examples) consider this:


            Code:
            foreach y in y1 y2 y3 {
                foreach x in x1 x2 x3 {
                    di "`y' `x'"
                }
            }
            
            
            
            y1 x1
            y1 x2
            y1 x3
            y2 x1
            y2 x2
            y2 x3
            y3 x1
            y3 x2
            y3 x3



            • #7
              I guess I was just thinking he could use this as a framework for his model specifications. But I clearly didn't look at his model specifications as carefully as I should have. Probably one of those situations where getting the code correct takes longer than just specifying the models.



              • #8
                Probably one of those situations where getting the code correct takes longer than just specifying the models.
                Indeed. People (me too) will often spend much more time trying to make rough code more elegant than they will ever save in computation. Still, you don't become a better programmer without experience in finessing code.



                • #9
                  Let me add the following.

                  Perhaps the initial statement of the problem was oversimplified, and in the actual problem there are more than three dependent variables and more than three independent variables. Then the following (demonstrated on the simplified example) can reduce the typing burden, but only if the variables var1 through var3 and var4 through var6 are present in the data in that order, with no other variables intervening between var1 and var3 or between var4 and var6.
                  Code:
                  foreach y of varlist var4-var6 {
                         regress `y' var1-var3
                         regress `y' var1 var2
                         regress `y' var1 var3
                  }
                  See help varlist for details on the ways of specifying a list of variables briefly.
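
                  Before relying on the hyphenated ranges, it may be worth checking what they expand to in your dataset. A minimal sketch, assuming the variable names from the simplified example; the macro names xvars and yvars are introduced here only for illustration:

                  Code:
                  * expand each range into the full list of variable names
                  unab xvars : var1-var3
                  unab yvars : var4-var6
                  * inspect the lists; any intervening variable will show up here
                  display "`xvars'"
                  display "`yvars'"
                  If either list contains variables you did not intend, reorder the data (see help order) or type the names out in full.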



                  • #10
                    I often do this kind of thing, usually when there is something complicated I want to do after each model and only want to code that bit once.


                    Code:
                    * dependent vars
                    local dvar1 var4
                    local dvar2 var8 
                    
                    * independent vars
                    local ivars1 var1 var2 var3
                    local ivars2 var1 var5 var6
                    local ivars3 var2 var6 var8 var1 var3 
                    
                    * loops
                    forv d=1/2 {
                        forv i=1/3 {
                                 regress `dvar`d'' `ivars`i''
                        }
                    }
                    hth,
                    Jeph
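
                    As one illustration of a step that is coded once inside the loop, the sketch below stores each fitted model and tabulates them all at the end. It reuses the macro definitions above; the stored names m1_1, m1_2, and so on are assumptions chosen for this example, and the estimates table options are just one possible layout.

                    Code:
                    * same definitions as above
                    local dvar1 var4
                    local dvar2 var8
                    local ivars1 var1 var2 var3
                    local ivars2 var1 var5 var6
                    local ivars3 var2 var6 var8 var1 var3

                    forvalues d = 1/2 {
                        forvalues i = 1/3 {
                            regress `dvar`d'' `ivars`i''
                            * the post-estimation step is written once, here
                            estimates store m`d'_`i'
                        }
                    }
                    * compare all stored models side by side
                    estimates table _all, b(%9.3f) se stats(N r2)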



                    • #11
                      Jeph's example makes an excellent point. His syntax would end up shorter and simpler if it avoided much repetition after each regression. But change the question, and the answer may indeed change. Taste is paramount here, but for what was asked in #1 I still prefer #2.



                      • #12
                        Hi,
                        I am not very familiar with looping. I tried to follow this thread and loop a nonlinear regression over each of my ids.


                        What I want to do eventually is run the following regressions and store the values (any ideas for commands? outreg2 maybe or not?).

                        nl (choice=1/(1+exp(-{m}*(reward-y*exp(-{r}*t))))), cluster(id) nolog iterate(100)
                        est sto m1
                        nl (choice=1/(1+exp(-{m}*(reward-y/(1+{r}*t))))), cluster(id) nolog iterate(100)
                        est sto m2
                        nl (choice=1/(1+exp(-{m}*(reward-{alpha}*y*exp(-{r}*t))))), cluster(id) nolog iterate(100) level(99)
                        est sto m3
                        nl (choice=1/(1+exp(-{m}*(reward-{alpha=1}*y*(1-(1-{theta=3})*{r}*t)^(1/(1-{theta=3})))))), cluster(id) nolog iterate(100) level(99)
                        est sto m4

                        I was using

                        forvalues i=1(1)166 {
                        nl (choice=1/(1+exp(-{m}*(reward-y*exp(-{r}*t))))) if id==`i'
                        }

                        but I am getting an "invalid syntax" error.

                        P.S :
                        I used egen i= group(id) to group my ids.


                        Code:
                        * Example generated by -dataex-. To install: ssc install dataex
                        clear
                        input long id float(reward choice y)
                        10102   0 0 120
                        10102   0 0 300
                        10102 150 1 300
                        10102 160 1 240
                        10102   0 0  30
                        10102 120 1 240
                        10102 250 1 300
                        10102  10 0  60
                        10102  80 0 120
                        10102 200 0 300
                        10102  15 0  30
                        10102  30 0  60
                        10102 100 0 300
                        10102  60 0  60
                        10102  60 1 120
                        10102  40 0 120
                        10102  50 1 300
                        10102  20 1  60
                        10102   5 1  30
                        10102 240 1 240
                        10102  20 1 120
                        10102 120 0 240
                        10102  60 1  60
                        10102  20 1  30
                        10102  40 1  60
                        10102 100 1 120
                        10102 200 0 240
                        10102  80 0 240
                        10102  15 0  30
                        10102  20 1  30
                        10201  50 1  60
                        10201  40 0  60
                        10201  40 0  60
                        10201  50 1 300
                        10201 200 1 300
                        10201  20 1  30
                        10201   0 0 300
                        10201  60 1  60
                        10201  80 0 240
                        10201 160 1 240
                        10201 300 1 300
                        10201  20 0  30
                        10201  50 1  60
                        10201 200 1 240
                        10201  25 1  30
                        10201  60 0  60
                        10201   5 1  30
                        10201  60 0 120
                        10201 120 1 120
                        10201 120 1 240
                        10201   0 0  30
                        10201  80 1 120
                        10201  15 0  30
                        10201 120 0 240
                        10201 100 0 120
                        10201 250 0 300
                        10201 100 0 120
                        10201 150 0 300
                        10201 120 1 120
                        10201 160 0 240
                        10403   0 0  60
                        10403  20 0 120
                        10403   0 0 300
                        10403   0 0 240
                        10403  80 1 240
                        10403  10 1  30
                        10403   5 0  30
                        10403  50 1 300
                        10403  20 1 120
                        10403  80 1 240
                        10403  20 0  60
                        10403  10 0  60
                        10403  20 1 120
                        10403  10 1  60
                        10403  40 1 120
                        10403 250 0 300
                        10403   0 0 120
                        10403  40 0 240
                        10403  50 1 300
                        10403  30 1  60
                        10403  15 1  30
                        10403   5 1  30
                        10403   0 0  30
                        10403 300 1 300
                        10403  40 0 240
                        10403   0 0 120
                        10403  20 1  60
                        10403   0 0 300
                        10403  10 0  30
                        10403  40 1 240
                        10601   0 0  30
                        10601   0 0 240
                        10601   0 0  60
                        10601  40 1 240
                        10601   0 0  30
                        10601  80 1 240
                        10601   0 0 120
                        10601   0 0  30
                        10601  20 1 120
                        10601  20 1 120
                        end



                        • #13
                          I don't think this is anything to do with loops, except you are evidently using the wrong identifier. You need to try

                          Code:
                          forvalues i=1(1)5 {
                          nl (choice=1/(1+exp(-{m}*(reward-y*exp(-{r}*t))))) if i==`i'
                          }
                          As it is i, not id, that has values from 1 to 166.

                          That said, you have lots of duplicate values as

                          Code:
                          duplicates report
                          duplicates list
                          will show you. I can't try your code as there are no data for t in your example, but nl can be awkward at the best of times and is not less awkward with 2 or 3 parameters to estimate and very small samples. capture will catch failures to converge and there might be many.

                          If that works, then you can try the whole lot, perhaps something like this:

                          Code:
                          gen m = .
                          gen r = .
                          
                          forvalues i=1/166 {
                          capture nl (choice=1/(1+exp(-{m}*(reward-y*exp(-{r}*t))))) if i==`i'
                          quietly if _rc == 0 {
                                mat b = e(b)
                                replace m = b[1,1] if i == `i'
                                replace r = b[1, 2] if i == `i'
                          }
                          }
                          outreg2 is from SSC, as you are asked to explain. I don't have experience with it to allow useful comment.
                          Last edited by Nick Cox; 10 Jun 2019, 02:31.
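
                          A small follow-up sketch, assuming the loop above has already run and filled m and r: it counts and lists the groups for which no estimate was stored, so that m is still missing. The variable name first is introduced here only for illustration.

                          Code:
                          * tag one observation per group (first is a hypothetical new variable)
                          egen byte first = tag(i)
                          * number of groups without a stored estimate
                          count if first & missing(m)
                          * show which ids they are
                          list id i if first & missing(m)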



                          • #14
                            Hi,

                            I have a similar question, but it involves a bit more complexity. First of all, I am trying to write code that loops over a number of regressions for a firm. Please see my attached data file for an example.
                            Namely, I want the code to regress the data from t-14 up to and including t-5 to forecast t+5. In all of these regressions, the dependent variable is E_Next_T and the independent variables are NegE, E, NegE_Times_E, B and TACC.
                            Then I want the code to regress the data from t-13 up to and including t-4 to forecast t+4, from t-12 up to and including t-3 to forecast t+3, from t-11 up to and including t-2 to forecast t+2, and from t-10 up to and including t-1 to forecast t+1. This way I will obtain a large number of estimates, and I would like Stata to store the estimated coefficient (factor) for every independent variable for t+1, t+2, t+3, t+4 and t+5 (I need them later to backtest whether E_Next_T is similar to the real values). So, I need to obtain the coefficients for the independent variables as vectors.

                            Then, I want Stata to do this exact same process for every firm.

                            Could anyone please help me? I've provided a sample data set in the attachments. I could not upload a .dta file, so the data are saved in a CSV file.

                            Thank you in advance!
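
                            One possible starting point for the moving 10-period windows is Stata's rolling prefix, which on panel data repeats the regression window by window within each firm and saves the coefficients. This is only a sketch under stated assumptions: firm and year are hypothetical names for the firm identifier and the time variable (the attached file is not shown in this thread), and the results file betas is likewise a made-up name. The forecasting and backtesting steps are not covered.

                            Code:
                            * firm and year are hypothetical names; adjust to the actual data
                            xtset firm year
                            * fit the model on each 10-period window, firm by firm,
                            * and save the coefficients to betas.dta (hypothetical file name)
                            rolling _b, window(10) saving(betas, replace) : ///
                                regress E_Next_T NegE E NegE_Times_E B TACC
                            * the saved file has one row per firm and window
                            use betas, clear
                            describe
                            Each row of the saved file carries the window's start and end periods, so the coefficients can later be merged back onto the periods to be forecast.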

                            Attached Files



                            • #15
                              Hi All,

                              I rewrote the code provided in #10, but wanted to know if there is a way to modify it to add an outreg2 call based on the independent variable.

                              Code:
                              * dependent vars
                              local dvar1 paidself
                              local dvar2 unpaid 
                              local dvar3 anywork
                              
                              *Independent vars
                              
                              local ivars1 treat##var1
                              local ivars2 treat##var2
                              local ivars3 treat##var3 
                              local ivars4 treat##var4
                              local ivars5 treat##var5
                              local ivars6 treat##var6
                              
                              forv d=1/3 {
                                  forv i=1/6 {
                                           regress `dvar`d'' `ivars`i''
                                  }
                              }
                              After each regress, I want to add an outreg2 call, so I made the following edits to the loops. Let me know if this makes sense:

                              Code:
                              
                              forv d=1 {
                                  forv i=1/6 {
                                           regress `dvar`d'' `ivars`i'' $covariates1
                                           outreg2 using `ivars`i''.doc, replace ctitle(`ivars`i'') label addtext(Individual Controls, Yes) // this will help generate a new document 
                                  }
                              }
                              
                              forv d=2/3 {
                                  forv i=1/6 {
                                           regress `dvar`d'' `ivars`i'' $covariates1
                                           outreg2 using `ivars`i''.doc, append ctitle(`ivars`i'') label addtext(Individual Controls, Yes) //this will append additional specifications
                                  }
                              }


                              Through these edits, I hope to export these regressions so that I have six documents (one per independent-variable local), each containing three specifications, one for each of the three dependent variables.

                              However, I keep getting the error invalid syntax r(198). Do let me know where the error is. Thanks!
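
                              One likely source of r(198) in the loops above is forv d=1: forvalues needs a range such as 1/1, not a single number. A second possible problem, offered as a guess, is that `ivars`i'' expands to text such as treat##var1, which is awkward as a file name. The sketch below assumes the dvar and ivars locals and the global covariates1 are defined as above in the same do-file run. The file names spec1.doc to spec6.doc, the macro action, and the choice of labelling columns by dependent variable are assumptions made for illustration.

                              Code:
                              forvalues i = 1/6 {
                                  forvalues d = 1/3 {
                                      * the first dependent variable starts the file; the others append
                                      local action = cond(`d' == 1, "replace", "append")
                                      regress `dvar`d'' `ivars`i'' $covariates1
                                      outreg2 using spec`i'.doc, `action' ctitle(`dvar`d'') label ///
                                          addtext(Individual Controls, Yes)
                                  }
                              }
                              Putting the independent-variable loop on the outside means each document is created once and then receives its three columns in order.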
