  • looping regressions with different versions

    Hello Stata users,

    I am generating regression tables with nested loops for different versions of a regression. This is a follow-up to an earlier question: https://www.statalist.org/forums/for...t-observations.
    In particular, I would like to export the regression tables to an Excel file in which columns (1) to (4) are the so-called "Aggregated" version and columns (5) onward are either version 0 or version 1.
    The aggregated version includes every country in the regression. Version 0 excludes one country from the dataset at a time (e.g., if the country list is USA, UK, FRA, CAN, the first set excludes only USA, the second excludes only UK, and so on). Version 1 runs the regression country by country (e.g., only USA, then only UK, then only FRA, and so on).
    To implement this, I used the following code:
    Code:
    gl folder folderdirectory
    cap encode country, gen(cty)
    
    loc job var1 var2 var3 var4
    
    loc analysislevel 0        // 1 if one-by-one (country); 0 if excluding one country
    
    loc iteration 1
    
    levelsof country, local(countries)
    
    foreach c in "" `countries' {
    
        * Excel file replace/append option
        if `iteration' == 1 local replace_op replace
        if `iteration' > 1  local replace_op append
    
        * Regression versions: Aggregate + 0/1
        if missing("`c'") {        // Aggregate
            loc regif
            loc ex_lab
            loc cn Aggregate
        }
        if `analysislevel' {     // 1; one-by-one
            loc regif "if country =="`c'""
            loc ex_lab
            loc cn `c'
        }
        else {                    // 0; excluding
            loc regif "if country !="`c'""
            loc ex_lab "Excluding, "`c'","
            loc cn `c'
        }
        
        noi di _n "=================       Country: `cn'     ==================="
    
    
        foreach var of local job {
            
            * Basic setup for regression
            loc model_name "5-year `var'"
            loc FElbl13     i.cty#i.year i.cty#i.ind_a38
            loc FElbl24        i.cty#i.ind_a38#i.year
            loc FE13        "C-I C-Y"
            loc FE24         "C-I-Y"
            loc cl_CI        i.cty#i.ind_a38
    
            * Regression (Columns 1 to 4)
            forval col = 1/4 {
                
                * Setting regressors for columns 1&2 // 3&4
                if inlist (`col', 1, 2)    local xvars ib3.p_LogLP_VA LogL_av     // Columns 1&2
                    else local xvars F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av // Columns 3&4
            
                * Setting FE/labels for columns 1&3 // 2&4
                if inlist(`col', 1, 3)    local fe `FElbl13' local fe_table `FE13'
                    else local fe `FElbl124' local fe_table `FE24'
                
                
                * REGRESSION
                capture reghdfe `var'_5L `xvars' `regif' [aw=weightvar], a(`fe') vce(cluster `cl_CI')
                
                if c(rc) == 0 {
                    
                    levelsof country if e(sample), clean local(countrylist)
                    local n_countries = `r(r)'
                    levelsof ind_a38 if e(sample)
                    local n_inds = `r(r)'
    
                    * Export to excel file
                    outreg2 using "${folder}/Baseline/`analysislevel'_`var'.xlsx", `replace_op' ctitle(`model_name') label ///    
                            addtext(Fixed effects, `fe_table', Country List, `countrylist', `ex_lab' Countries, `n_countries', Industries, `n_inds')
                }
                else if !inlist(c(rc), 2000, 2001) {
                    display as error `"Unexpected regression error: var = `var', country = `c'"'
                }
            
            }
        
        } // end of job loop
        
        local ++iteration
        
    } // end of countries loop
    However, I get the following error:
    Code:
    =================    Country:    ===================
    inlist not found
    r(111);
    I am unsure where the problem is or how to fix it.
    I understand that my code is a bit complicated, but I hope the question is clear!

    Thanks a lot in advance.

  • #2
    Code:
     
     inlist (`
    should be

    Code:
    inlist(
    as otherwise Stata is looking for a variable or scalar called inlist -- and reporting correctly that none such can be found.
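A minimal sketch of the corrected form, assuming a loop counter named col like the one in #1 (the display text is illustrative only). With no space between the function name and the opening parenthesis, Stata treats inlist() as a function call:

```stata
* Illustration only: col mirrors the column counter used in #1
forvalues col = 1/4 {
    * inlist( is written without a space, so it is parsed as a function
    if inlist(`col', 1, 2) display "col `col': regressors for columns 1&2"
    else display "col `col': regressors for columns 3&4"
}
```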

    Debugging is often difficult. I spent about two hours at the weekend looking for a bug before I found it. In this case, the problem was presumably with code using inlist() so I looked for that.

    • #3
      Thanks, Nick, for your reply.
      I changed it to inlist(), but now it shows this error:
      Code:
       
       Unexpected regression error: var = `var', country = `c'
      for every country and every variable. It seems the code is not working as intended.

      • #4
        I can't help with that one. You could at least tell us which command produced that error.

        • #5
          That error message almost certainly arises from
          Code:
           else if !inlist(c(rc), 2000, 2001) {
               display as error `"Unexpected regression error: var = `var', country = `c'"'
           }
          although it is very strange that it actually prints out `var' and `c' instead of the values of var and c when the error was triggered.

          The way that part of the code is reached is when the -capture reghdfe- command encounters an error other than an insufficient estimation sample size. So you need to figure out what is going wrong with -reghdfe-. To do that, put
          Code:
          set tracedepth 1
          set trace on
          immediately before the -capture reghdfe- command. (You probably want to put -set trace off- immediately after the -capture reghdfe- to avoid printing out a lot of other trace information that probably will not be helpful in this context.) That way Stata will show you what that command looks like when all the local macros in it have been interpreted. Also, change -capture- to -capture noisily- so that Stata will show the error message that -reghdfe- produces. You will probably find some non-existent variables mentioned, or something of that nature. In any event, the answer will almost surely be found in the results of that trace.
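A minimal sketch of that debugging pattern, under the assumption that the auto dataset shipped with Stata stands in for the real data and that no_such_variable is a deliberately missing variable. It is not the poster's model, only an illustration of where the trace and capture noisily lines go:

```stata
* Illustration only: auto.dta and no_such_variable are stand-ins
sysuse auto, clear

set tracedepth 1
set trace on
* capture noisily still traps the error but lets the message print
capture noisily regress price weight no_such_variable
set trace off

* _rc holds the return code of the captured command (0 means success)
display "return code = " _rc
```

The same three lines (set tracedepth, set trace on, set trace off) can be wrapped around the -capture reghdfe- call inside the loop.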
          Last edited by sladmin; 10 Jun 2025, 11:07. Reason: fix code tags

          • #6
            Sorry, to clarify #3: the error is displayed repeatedly, like this:
            Unexpected regression error: var = var1, country = country1
            Unexpected regression error: var = var2, country = country2
            etc

            • #7
              Anne-Claire Jo Thanks for that clarification. That means your reghdfe command is not working as expected, as mentioned in #5. A good starting point for understanding what is going on would be to remove the capture in front of that command and run your do-file. This will cause the reghdfe command to fail, and its error messages should be informative. Also add the set tracedepth and set trace commands appropriately, as suggested in #5.

              If you still do not understand what is going on, please show us the output of that trace.

              • #8
                Hello,
                I haven't followed all of your code, but this looks wrong
                Code:
                 
                * Setting FE/labels for columns 1&3 // 2&4
                if inlist(`col', 1, 3)    local fe `FElbl13' local fe_table `FE13'
                    else local fe `FElbl124' local fe_table `FE24'
                Try
                Code:
                if inlist(`col', 1, 3) {
                    local fe `FElbl13'
                    local fe_table `FE13'
                }
                else {
                    local fe `FElbl24'      // was FElbl124, which is never defined
                    local fe_table `FE24'
                }
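A related point worth checking: the code in #1 defines the macro FElbl24 but the else branch reads FElbl124. Stata expands an undefined local macro to an empty string without raising an error, so in that branch fe would be empty and a() would receive nothing. A small sketch of that behaviour, reusing the macro names from #1:

```stata
* The macro is defined under the name FElbl24 ...
local FElbl24 i.cty#i.ind_a38#i.year

* ... so the misspelled name FElbl124 silently expands to nothing
local fe `FElbl124'
display "misspelled: fe = [`fe']"

* The correctly spelled name carries the fixed-effects specification
local fe `FElbl24'
display "correct:    fe = [`fe']"
```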

                • #9
                  Thank you all for the replies!

                  I made a few modifications, as below:
                  Code:
                    
                  gl folder folderdirectory
                  cap encode country, gen(cty)

                  loc job var1 var2 var3 var4

                  loc analysislevel 0        // 1 if one-by-one (country); 0 if excluding one country
                  
                  loc iteration 1
                  
                  levelsof country, local(countries)
                  
                  foreach c in "" `countries' {
                  
                      * Excel file replace/append option
                      if `iteration' == 1 local replace_op replace 
                      if `iteration' > 1  local replace_op append
                  
                      * Regression versions: Aggregate + 0/1
                      if `analysislevel' {     // 1; one-by-one
                          loc regif "if country =="`c'""
                          loc ex_lab 
                          loc cn `c'
                      }
                      else {                    // 0; excluding
                          loc regif "if country != "`c'""
                          loc ex_lab "Excluding, "`c'","
                          loc cn `c'
                      }
                      
                      if missing("`c'")  {        // Aggregate
                          loc regif 
                          loc ex_lab 
                          loc cn Aggregate
                      }
                      
                      noi di _n " =================       Country: `cn'         =================== "
                  
                  
                      foreach var of local job {
                          
                          * Basic setup for regression
                          loc model_name "5-year `var'"
                          loc FElbl13     i.cty#i.year i.cty#i.ind_a38
                          loc FElbl24        i.cty#i.ind_a38#i.year
                          loc FE13        "C-I C-Y"
                          loc FE24         "C-I-Y"
                          loc cl_CI        i.cty#i.ind_a38
                  
                          * Regression (Columns 1 to 4)
                          forval col = 1/4 {
                              
                              * Setting RHS for columns 1&2 // 3&4
                              if inlist(`col', 1, 2)    ///
                                 local xvars ib3.p_LogLP_VA LogL_av     // Columns 1&2
                              else                     ///
                                 local xvars F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av // Columns 3&4
                          
                              * Setting FE/labels for columns 1&3 // 2&4
                              if inlist(`col', 1, 3) {
                                 local fe `FElbl13'   
                                 local fe_table `FE13'
                              }
                              else {                    
                                 local fe `FElbl124'  
                                 local fe_table `FE24'
                              }
                               
                              * REGRESSION
                              capture reghdfe `var'_5L `xvars' `regif' [aw=weightvar], a(`fe') vce(cluster `cl_CI')
                              
                              if c(rc) == 0 {
                                  
                                  levelsof country if e(sample), clean local(countrylist)
                                  local n_countries = `r(r)'
                                  levelsof ind_a38 if e(sample)
                                  local n_inds = `r(r)'
                  
                                  * Export to excel file
                                  outreg2 using "${folder}/Baseline/`analysislevel'_`var'.xls", `replace_op' ctitle(`model_name') label ///    
                                          addtext(Fixed effects, `fe_table', Country List, `countrylist', `ex_lab' Countries, `n_countries', Industries, `n_inds')
                              }
                              else if !inlist(c(rc), 2000, 2001) {
                                  display as error `"Unexpected regression error: var = `var', country = `c'"'
                              }
                          
                          }
                      
                      } // end of job loop
                      
                      local ++iteration
                      
                  } // end of countries loop
                  This fixed the unexpected-error message, but I am still having a problem with the output table in the Excel file.
                  Columns (1) to (4) are supposed to be the regressions for the aggregated version (all countries, i.e., no if condition in the regression), and columns (5) onward should be the regressions for version 0 or 1.
                  However, in the output table, column (1) contains regression #4 of the aggregated version (i.e., column (4) in the code), and columns (2) onward contain the version 0 or 1 regressions.
                  In other words, columns (1)-(3) of the aggregated version are lost and only column (4) remains for that version; the rest are the regressions for version 0/1.
                  For confidentiality reasons I cannot share the output table, but I hope the issue is clear.
                  Could someone help me find and solve the problem?

                  • #10
                    If you keep sending outreg2 output to the same sheet of the same Excel workbook, it will replace the prior contents of that sheet. You need to modify your code so you produce a single table with all columns as desired, and then output that full table to the Excel sheet in a single call to outreg2.

                    • #11
                      Hemanshu Kumar Sorry, I'm a bit confused. How should I change the code?

                      • #12
                        Sorry for bringing this post up again, but I still haven't fixed the issue with exporting the output table to the Excel file. The first four regressions (the aggregate version) are not all exported: only the last of them (column 4) is exported, along with the other versions (either 0 or 1). Could someone please help me fix this so that ALL of the desired output is exported?

                        • #13
                          The problem is that your loop is causing Stata to overwrite the earlier results with the new results each time through the loop, so when you are done, only the results of the last regression "survive." As I do not use -outreg2- myself, I cannot give you specific advice on how to fix this. However, given that -outreg2- has been around for a long time and is, as best I can tell from following Statalist, wildly popular, I am pretty sure it has a way around this. Check the -outreg2- help file looking for some option that lets you specify a list of estimates to be output. You can build that list of estimates inside the loop, and then move the -outreg2- command outside the loop. So something like this:
                          Code:
                          gl folder folderdirectory
                          cap encode country, gen(cty)

                          loc job var1 var2 var3 var4

                          loc analysislevel 0        // 1 if one-by-one (country); 0 if excluding one country
                          
                          loc iteration 1
                          
                          levelsof country, local(countries)
                          
                          local estimates_list        // start with an empty list of stored estimates
                          
                          foreach c in "" `countries' {
                          
                              * Excel file replace/append option
                              if `iteration' == 1 local replace_op replace 
                              if `iteration' > 1  local replace_op append
                          
                              * Regression versions: Aggregate + 0/1
                              if `analysislevel' {     // 1; one-by-one
                                  loc regif "if country =="`c'""
                                  loc ex_lab 
                                  loc cn `c'
                              }
                              else {                    // 0; excluding
                                  loc regif "if country != "`c'""
                                  loc ex_lab "Excluding, "`c'","
                                  loc cn `c'
                              }
                              
                              if missing("`c'")  {        // Aggregate
                                  loc regif 
                                  loc ex_lab 
                                  loc cn Aggregate
                              }
                              
                              noi di _n " =================       Country: `cn'         =================== "
                          
                          
                              foreach var of local job {
                                  
                                  * Basic setup for regression
                                  loc model_name "5-year `var'"
                                  loc FElbl13     i.cty#i.year i.cty#i.ind_a38
                                  loc FElbl24        i.cty#i.ind_a38#i.year
                                  loc FE13        "C-I C-Y"
                                  loc FE24         "C-I-Y"
                                  loc cl_CI        i.cty#i.ind_a38
                          
                                  * Regression (Columns 1 to 4)
                                  forval col = 1/4 {
                                      
                                      * Setting RHS for columns 1&2 // 3&4
                                      if inlist(`col', 1, 2)    ///
                                         local xvars ib3.p_LogLP_VA LogL_av     // Columns 1&2
                                      else                     ///
                                         local xvars F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av // Columns 3&4
                                  
                                      * Setting FE/labels for columns 1&3 // 2&4
                                      if inlist(`col', 1, 3) {
                                         local fe `FElbl13'   
                                         local fe_table `FE13'
                                      }
                                      else {                    
                                         local fe `FElbl24'      // was FElbl124, which is never defined
                                         local fe_table `FE24'
                                      }
                                       
                                      * REGRESSION
                                      capture reghdfe `var'_5L `xvars' `regif' [aw=weightvar], a(`fe') vce(cluster `cl_CI')
                                      
                                      if c(rc) == 0 {
                                          
                                          levelsof country if e(sample), clean local(countrylist)
                                          local n_countries = `r(r)'
                                          levelsof ind_a38 if e(sample)
                                          local n_inds = `r(r)'
                                          
                                          local estimates_list `estimates_list' `var'_`c'
                          
                                          * Export to excel file --MOVED TO FOLLOW LOOPS
                                     }
                                      else if !inlist(c(rc), 2000, 2001) {
                                          display as error `"Unexpected regression error: var = `var', country = `c'"'
                                      }
                                  
                                  }
                              
                              } // end of job loop
                              
                              local ++iteration
                              
                          } // end of countries loop
                          
                          * Export to excel file THIS COMMAND MUST BE MODIFIED TO EXPORT ALL OF THE ESTIMATES
                          * IN LOCAL MACRO estimates_list
                          //  INSERT MODIFIED outreg2 COMMAND HERE
                          If -outreg2- actually can't do that, which I would find very surprising, and if you are using Stata version 18 or later, use the -etable- command instead of -outreg2-. It definitely has an -estimates()- option that will accept a list of stored estimates, and it also has an -export()- option that lets you send the table to Excel (or several other document formats.) Read -help etable- to learn how to tailor the output to the particular statistics and display formats you want.
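A minimal sketch of the store-then-export pattern described above, assuming a Stata version that ships -etable- and using the auto dataset in place of the real data. The estimate names m1, m2, ... and the file name demo_etable.xlsx are hypothetical. Note that the skeleton above only builds a list of names; each result also has to be saved under its name with -estimates store-, and every name must be unique or later results overwrite earlier ones:

```stata
* Illustration only: auto.dta, regress, and the names m1, m2, ... are stand-ins
sysuse auto, clear

local estimates_list          // start with an empty list
local i 0

foreach grp in 0 1 {
    foreach xvars in "weight" "weight length" {
        capture noisily regress price `xvars' if foreign == `grp'
        if c(rc) == 0 {
            local ++i
            estimates store m`i'                        // unique name per regression
            local estimates_list `estimates_list' m`i'
        }
    }
}

* One call after the loops exports every stored model as its own column
etable, estimates(`estimates_list') export("demo_etable.xlsx", replace)
```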

                          • #14
                            Looking more closely at your original code, I see now that you actually want a separate file for each value of `var' (job). To do this, a slightly different arrangement of the code, which places the country loop inside the job loop is needed. So, something like this:

                            Code:
                            gl folder folderdirectory
                            cap encode country, gen(cty)

                            loc job var1 var2 var3 var4

                            loc analysislevel 0        // 1 if one-by-one (country); 0 if excluding one country
                            
                            loc iteration 1
                            
                            levelsof country, local(countries)
                            
                            
                            foreach var of local job {
                                local estimates_list        // reset the list for each value of var
                                
                                foreach c in "" `countries' {
                            
                                * Excel file replace/append option
                                if `iteration' == 1 local replace_op replace 
                                if `iteration' > 1  local replace_op append
                            
                                * Regression versions: Aggregate + 0/1
                                if `analysislevel' {     // 1; one-by-one
                                    loc regif "if country =="`c'""
                                    loc ex_lab 
                                    loc cn `c'
                                }
                                else {                    // 0; excluding
                                    loc regif "if country != "`c'""
                                    loc ex_lab "Excluding, "`c'","
                                    loc cn `c'
                                }
                                
                                if missing("`c'")  {        // Aggregate
                                    loc regif 
                                    loc ex_lab 
                                    loc cn Aggregate
                                }
                                
                                noi di _n " =================       Country: `cn'         =================== "
                            
                            
                                    
                                    * Basic setup for regression
                                    loc model_name "5-year `var'"
                                    loc FElbl13     i.cty#i.year i.cty#i.ind_a38
                                    loc FElbl24        i.cty#i.ind_a38#i.year
                                    loc FE13        "C-I C-Y"
                                    loc FE24         "C-I-Y"
                                    loc cl_CI        i.cty#i.ind_a38
                            
                                    * Regression (Columns 1 to 4)
                                    forval col = 1/4 {
                                        
                                        * Setting RHS for columns 1&2 // 3&4
                                        if inlist(`col', 1, 2)    ///
                                           local xvars ib3.p_LogLP_VA LogL_av     // Columns 1&2
                                        else                     ///
                                           local xvars F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av // Columns 3&4
                                    
                                        * Setting FE/labels for columns 1&3 // 2&4
                                        if inlist(`col', 1, 3) {
                                           local fe `FElbl13'   
                                           local fe_table `FE13'
                                        }
                                        else {                    
                                           local fe `FElbl24'      // was FElbl124, which is never defined
                                           local fe_table `FE24'
                                        }
                                         
                                        * REGRESSION
                                        capture reghdfe `var'_5L `xvars' `regif' [aw=weightvar], a(`fe') vce(cluster `cl_CI')
                                        
                                        if c(rc) == 0 {
                                            
                                            levelsof country if e(sample), clean local(countrylist)
                                            local n_countries = `r(r)'
                                            levelsof ind_a38 if e(sample)
                                            local n_inds = `r(r)'
                                            
                                            local estimates_list `estimates_list' `var'_`c'
                            
                                            * Export to excel file --MOVED TO FOLLOW LOOPS
                                       }
                                        else if !inlist(c(rc), 2000, 2001) {
                                            display as error `"Unexpected regression error: var = `var', country = `c'"'
                                        }
                                    
                                    }
                                
                                } // end of country loop
                                
                                * Export to excel file THIS COMMAND MUST BE MODIFIED TO EXPORT ALL OF THE ESTIMATES
                                * IN LOCAL MACRO estimates_list
                                //  INSERT MODIFIED outreg2 COMMAND HERE    
                                local ++iteration
                                
                            } // end of job loop

                            • #15
                              Thanks, Clyde Schechter, for your help!
                              I have modified the code based on your suggestion in #14 (but left the outreg2 call the same as before):
                              Code:
                              outreg2 using "${folder}/Baseline/`analysislevel'_Baseline.xls", `replace_op' ctitle(`model_name') label ///    
                                                  addtext(Fixed effects, `fe_table', Country List, `countrylist', `ex_lab' Countries, `n_countries', Industries, `n_inds')
                              Unfortunately, I still have the same issue.
                              For instance, with analysislevel set to 0 (excluding one country at a time), my output looks like this:
                              Coefficient rows (values erased): Mean of F1LogLP_VA (unweighted); p_LogLP_VA = 1, 0-10; p_LogLP_VA = 2, 10-40; p_LogLP_VA = 4, 60-90; p_LogLP_VA = 5, 90-100; JCR2_1L; Mean of LogL (unweighted); Constant; Observations; R-squared.
                              Remaining rows, listed by output column:
                              Column  VARIABLES  Fixed effects  Country List             Countries  Industries  Excluding
                              (1)     var1       C-I-Y          CAN FIN FRA LTU PRT SVN  6          22
                              (2)     var1       C-I C-Y        FIN FRA LTU PRT SVN      5          22          CAN
                              (3)     var1       C-I-Y          FIN FRA LTU PRT SVN      5          22          CAN
                              (4)     var1       C-I C-Y        FIN FRA LTU PRT SVN      5          22          CAN
                              (5)     var1       C-I-Y          FIN FRA LTU PRT SVN      5          22          CAN
                              (6)     var1       C-I C-Y        CAN FRA LTU PRT SVN      5          22          FIN
                              (7)     var1       C-I-Y          CAN FRA LTU PRT SVN      5          22          FIN
                              (8)     var1       C-I C-Y        CAN FRA LTU PRT SVN      5          22          FIN
                              (9)     var1       C-I-Y          CAN FRA LTU PRT SVN      5          22          FIN
                              (10)    var1       C-I C-Y        CAN FIN LTU PRT SVN      5          22          FRA
                              (11)    var1       C-I-Y          CAN FIN LTU PRT SVN      5          22          FRA
                              (12)    var1       C-I C-Y        CAN FIN LTU PRT SVN      5          22          FRA
                              (13)    var1       C-I-Y          CAN FIN LTU PRT SVN      5          22          FRA
                              (I have erased the coefficients for confidentiality reasons.)
                              The first four columns (1 to 4) should be the four aggregated versions; from column 5 to the end, the columns should be the versions excluding CAN, FIN, FRA, and so on.
                              But as the table shows, only one column, (1), survived for the aggregated version.
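One possible explanation, consistent with the pattern described in this post: in the code from #1 and #9 the replace/append choice depends on the country iteration rather than on whether the file already exists. While iteration equals 1, all four aggregate regressions are exported with replace, so each one overwrites the file and only column 4 survives; every later country uses append. A minimal sketch of an alternative, assuming the auto dataset in place of the confidential data (the file name demo_table.xls, the variables, and the group loop are illustrative, and -outreg2- is a community-contributed command that must be installed):

```stata
* Illustration only: auto.dta, regress, and demo_table.xls are stand-ins
sysuse auto, clear

local replace_op replace      // only the first exported column creates the file

foreach grp in "" 0 1 {
    * Empty element = aggregate version, as with "" in the country loop
    if missing("`grp'") {
        local regif
        local cn Aggregate
    }
    else {
        local regif "if foreign == `grp'"
        local cn Group`grp'
    }

    forvalues col = 1/2 {
        if `col' == 1 local xvars weight
        else local xvars weight length

        capture noisily regress price `xvars' `regif'
        if c(rc) == 0 {
            outreg2 using "demo_table.xls", `replace_op' ctitle(`cn' `col')
            local replace_op append   // every later column is appended
        }
    }
}
```

Because replace_op switches to append right after the first successful export, the aggregate columns are kept and the later groups are added to their right. For one file per dependent variable, the `local replace_op replace` line would be reset at the start of each file's loop.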
