
  • regression with loop - ignore if insufficient observations

    Hello,

    I am using a loop to run regressions on multiple units (i.e. countries), but some countries do not seem to have enough observations to run the regression.
    I tried the commands below, but I still get the same issue.
    I would like to run the regressions only for the countries that have sufficient observations (i.e. skip those with insufficient obs).
    Can someone help me solve this issue?
    Code:
    loc job JCR NJCR JDR JCR2 NJCR2 JDR2
    
    levelsof country, local(countries)
    foreach c of local countries{
        
        if c(rc) == 0 {
            
        noi di _n "Country: `c'"
        
        foreach var of local job {
        
    loc model_name "5-year `var'"
    loc FElbl13     i.cty#i.year i.cty#i.ind_a38
    loc FElbl24        i.cty#i.ind_a38#i.year
    
    
    * Column (1)
    reghdfe `var'_5L ib3.p_LogLP_VA LogL_av if country == "`c'" [aw=weightvar], a(`FElbl13') vce(cluster i.cty#i.ind_a38) allbaselevels
        levelsof country if e(sample)
        local n_countries = `r(r)'
        levelsof ind_a38 if e(sample)
        local n_inds = `r(r)'
    
        outreg2 using "5y_`var'_reg.xls", append ctitle(`model_name') label ///    
                addtext(Fixed effects, `FElbl13', Countries, `n_countries', Industries, `n_inds') 
    
    * Column (2)
    reghdfe `var'_5L ib3.p_LogLP_VA LogL_av if country == "`c'" [aw=weightvar], a(`FElbl24') vce(cluster i.cty#i.ind_a38) allbaselevels
        levelsof country if e(sample)
        local n_countries = `r(r)'
        levelsof ind_a38 if e(sample)
        local n_inds = `r(r)'
    
        outreg2 using "5y_`var'_reg.xls", append ctitle(`model_name') label ///    
                addtext(Fixed effects, `FElbl13', Countries, `n_countries', Industries, `n_inds') 
    
    * Column (3)
    reghdfe `var'_5L F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av if country == "`c'" [aw=weightvar], a(`FElbl13') vce(cluster i.cty#i.ind_a38) allbaselevels
        levelsof country if e(sample)
        local n_countries = `r(r)'
        levelsof ind_a38 if e(sample)
        local n_inds = `r(r)'
    
        outreg2 using "5y_`var'_reg.xls", append ctitle(`model_name') label ///    
                addtext(Fixed effects, `FElbl13', Countries, `n_countries', Industries, `n_inds') 
    
    * Column (4)
    reghdfe `var'_5L F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av if country == "`c'" [aw=weightvar], a(`FElbl24') vce(cluster i.cty#i.ind_a38) allbaselevels
        levelsof country if e(sample)
        local n_countries = `r(r)'
        levelsof ind_a38 if e(sample)
        local n_inds = `r(r)'
    
        outreg2 using "5y_`var'_reg.xls", append ctitle(`model_name') label ///    
                addtext(Fixed effects, `FElbl13',  Countries, `n_countries', Industries, `n_inds') 
    
                
        }
    } // end of job loop
    
    else if !inlist(c(rc), 2000, 2001) {    
            display as error "Unexpected error in regression"
            exit c(rc)
        }
    }

  • #2
    You have placed the -if- command and its accompanying machinery where they cannot achieve their intended purpose. In addition, since none of your commands is -capture-d, c(rc) will not contain the return code of the regression either. Do it like this:
    Code:
    loc job JCR NJCR JDR JCR2 NJCR2 JDR2
    
    levelsof country, local(countries)
    foreach c of local countries{
              
        noi di _n "Country: `c'"
        
        foreach var of local job {
        
            loc model_name "5-year `var'"
            loc FElbl13     i.cty#i.year i.cty#i.ind_a38
            loc FElbl24        i.cty#i.ind_a38#i.year
    
    
            * Column (1)
            capture reghdfe `var'_5L ib3.p_LogLP_VA LogL_av if country == "`c'" [aw=weightvar], a(`FElbl13') vce(cluster i.cty#i.ind_a38) allbaselevels
            if c(rc) == 0 {
                levelsof country if e(sample)
                local n_countries = `r(r)'
                levelsof ind_a38 if e(sample)
                local n_inds = `r(r)'
    
                outreg2 using "5y_`var'_reg.xls", append ctitle(`model_name') label ///    
                        addtext(Fixed effects, `FElbl13', Countries, `n_countries', Industries, `n_inds')
            }
                else if !inlist(c(rc), 2000, 2001) {
                display as error `"Unexpected regression error: var = `var', country = `c'"'
            }
            
            // USE THE SAME STRUCTURE FOR Columns (2-4)
    
        }
    
    }
    You need a -capture-d -reghdfe- command for each of your output Columns, and each of those has to then be followed by the other commands in a block guarded by an -if- command checking c(rc), and then followed by another block guarded by an -else if- command handling unexpected error results.

    I've spelled out all of it for your first column. You need to replicate exactly the same structure for the other columns yourself.

    I haven't scrutinized the different -reghdfe- commands, but at a glance they appear pretty similar to each other, and it may be possible to write a loop, nested inside the job loop, that handles them all. Even if that's possible, it might be better from the point of view of code readability and transparency to stay with your existing explicit spell out of the commands for all four columns. That's up to you.
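
    For anyone who wants to try the -capture- and c(rc) pattern in isolation, here is a minimal sketch. It is not the poster's model: it assumes Stata's bundled auto dataset and a plain -regress-, and the value 99 is a made-up category chosen only so that one pass has no observations. Return codes 2000 and 2001 are Stata's "no observations" and "insufficient observations" errors, as in the code above.

    ```stata
    * Minimal sketch on the bundled auto data, not the data from this thread.
    sysuse auto, clear

    * 99 is a deliberately nonexistent rep78 value, so that pass has no observations.
    foreach r in 3 4 99 {
        capture regress price mpg if rep78 == `r'
        * Save the return code right away, before another capture can overwrite it.
        local rc = c(rc)
        if `rc' == 0 {
            display "rep78 = `r': regression ran, N = " e(N)
        }
        else if inlist(`rc', 2000, 2001) {
            display "rep78 = `r': skipped, too few observations"
        }
        else {
            display as error "rep78 = `r': unexpected error `rc'"
            exit `rc'
        }
    }
    ```

    Whether an unexpected error should stop the run (as in this sketch) or only be reported (as in the code above) is a design choice. Stopping is safer while you are still debugging.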



    • #3
      Clyde Schechter thanks a lot for your help, it's working very well.
      Indeed, my code feels very hard-coded, but since the regressions differ a bit, I don't really see how to use loops or locals to code them efficiently.
      Do you have any ideas/suggestions?



      • #4
        It's been a very long time since I've taught an introductory programming course, but my recollection of the experience is that for some people the idea of loops comes quickly and easily, feeling natural from the start, and for others it is a real stumbling block. Even for the latter group, with sufficient exposure and practice, it eventually "clicks." Once the concept of looping is clear, it carries over easily to any programming language. It's just a matter of learning the specific syntax that a programming language uses for loops.

        Loops are about repeating the "same" thing multiple times, possibly allowing that "same" thing to vary slightly depending on which time through the loop we are talking about. The thing (in Stata, usually a local macro) that tracks the times through the loop is known as the iterator. Loop code in any programming language always includes: setting the iterator to its starting value; a block of code that is repeated each time through the loop, perhaps changing based on the current value of the iterator; and a command that either switches the iterator to its next value or, if the iterator has already reached the last value it is supposed to take on, leaves the loop.
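
        A minimal sketch of those parts in Stata, meant to be run from a do-file. The iterator name i and the range 1 to 3 are arbitrary choices for illustration. The -while- version spells out each part by hand, and the -forvalues- version leaves the start, the step and the exit test to Stata.

        ```stata
        * Explicit version: every part of the loop is written out.
        * Set the iterator to its starting value.
        local i = 1
        * Keep going until the iterator has passed its last value.
        while `i' <= 3 {
            * The repeated block, which may use the current iterator value.
            display "pass `i'"
            * Switch the iterator to its next value.
            local ++i
        }

        * Compact version: forvalues handles the start, the step and the exit.
        forvalues i = 1/3 {
            display "pass `i'"
        }
        ```

        Both loops display the same three lines. -foreach-, as used earlier in this thread, works the same way except that the iterator steps through a list of items instead of a range of numbers.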

        I don't want to make the bold claim that local macros are unique to Stata, as there may be other languages I am unaware of that use them, or something very like them. But local macros can be intimidating, because of the peculiar ` ' device for accessing their values and the syntax that can arise when macros are nested in macros and sometimes further nested inside quotes or compound double quotes. Local macros nested to a depth of several levels can be confusing even to experts.
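
        A small sketch of what that looks like in practice. The macro names stem, JCR_label and msg, and their contents, are invented for illustration and are not taken from the code earlier in this thread.

        ```stata
        * A plain local macro, and a second one whose name starts with the first one's contents.
        local stem JCR
        local JCR_label "job creation rate"

        * One level: displays the contents of stem.
        display "`stem'"

        * Two levels: the inner macro is expanded first, which builds the name JCR_label,
        * and then that macro is expanded in turn.
        display "``stem'_label'"

        * Compound double quotes let the stored text itself contain double quotes.
        local msg `"country = "France""'
        display `"`msg'"'
        ```

        Reading such expressions from the innermost macro outward is usually the easiest way to work out what Stata will see after expansion.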

        So, if you think you are going to be doing this kind of work for a while, you might want to take a course in, or teach yourself, some simpler programming language and master loops there, postponing confronting local macros until you are comfortable with looping in concept and can write loops easily in the other programming language. Among languages that are relatively simple to learn, at least to this extent, are Python and Visual Basic. I don't mean to imply that these languages are superficial or shallow: both of them have advanced programming constructs within them as well, but they have a simple core that is easy to learn and is a good preparation for learning other computer languages. You don't need to learn either of these languages in depth in order to get the fundamental concepts down pat.

        Or, to learn it directly in Stata, you might try StataCorp's Net Courses. I found these very helpful when I was a Stata beginner. They progress from the complete novice level to the more advanced, so you can proceed as far as you like.



        • #5
          thanks Clyde Schechter for your advice!



          • #6
            Anne-Claire Jo here is a stab at putting your columns into a loop, based on Clyde's improvement in #2 to your code in #1. It assumes that you actually made an error in specifying the addtext option for columns (2) and (4) in #1 -- I think you want the fixed effects mentioned there to be the same as the ones absorbed in the previous reghdfe command. If I am mistaken about this, please adjust the following code.

            Code:
            loc job JCR NJCR JDR JCR2 NJCR2 JDR2
            
            levelsof country, local(countries)
            foreach c of local countries{
                      
                noi di _n "Country: `c'"
                
                foreach var of local job {
                
                    loc model_name "5-year `var'"
                    loc FElbl13     i.cty#i.year i.cty#i.ind_a38
                    loc FElbl24        i.cty#i.ind_a38#i.year
            
            
                    forval col = 1/4 {
            
                        if inlist(`col', 1, 2) local xvars ib3.p_LogLP_VA LogL_av
                            else local xvars F1LogLP_VA_av ib3.p_LogLP_VA `var'_1L LogL_av
                    
                        if inlist(`col', 1, 3) local fe `FElbl13'
                            else local fe `FElbl24'
                        
                            
                        capture reghdfe `var'_5L `xvars' if country == "`c'" [aw=weightvar], a(`fe') vce(cluster i.cty#i.ind_a38) allbaselevels
                        if c(rc) == 0 {
                            levelsof country if e(sample)
                            local n_countries = `r(r)'
                            levelsof ind_a38 if e(sample)
                            local n_inds = `r(r)'
            
                            outreg2 using "5y_`var'_reg.xls", append ctitle(`model_name') label ///    
                                    addtext(Fixed effects, `fe', Countries, `n_countries', Industries, `n_inds')
                        }
                        else if !inlist(c(rc), 2000, 2001) {
                            display as error `"Unexpected regression error: var = `var', country = `c'"'
                        }
                    
                    }
                    
                }
            
            }
            Also: note how Clyde in #2, as well as I, indent the code for -if- blocks and -forval- and -foreach- loops, to make the code more readable. This is a general programming practice that is well worth learning (and indentation is mandatory in some languages, such as Python).



            • #7
              Hemanshu Kumar Thanks a lot for your help! It's exactly what I was looking for, thanks again for your suggestion and insights.
