  • dynamic looped regressions

    Dear Stata colleagues:

    I am using LASSO for inference. I want to independently assess the causal effect of each of 40 policy variables on a rate. Some of these variables are likely collinear. I would like to correct the loop below so that it: 1) automatically removes each x variable from the control-variable pool when it is used as the variable of interest (I currently get an error message); and 2) tests for collinearity between the variable of interest and each control x1-x40, removing all collinear variables among x1-x40 from the control pool for each regression. Note that I have a handful of other control variables in that pool that I do not want to subject to this restriction, as a few of them are quadratic terms.

    Code:
    foreach v of varlist x1-x40 {
        xpopoisson Mortality_rate `v', controls(rural gee cce fsi GDP_cap ///
            GDP_cap_sqr GHE_cap GHE_cap_sqr x1-x40) selection(cv) ///
            vce(cluster country_code)
    }
    Thank you in advance for any guidance you may be able to provide.

    Robert
    Last edited by Robert Kolesar; 31 Dec 2022, 10:11.

  • #2
    This message has been deleted.
    Last edited by Robert Kolesar; 31 Dec 2022, 10:13. Reason: I corrected the formatting in the original post.



    • #3
      Well, the naming of the variables as x1-x40 makes this a bit complicated, and overall life is simpler if we rename them as x01-x40. Then the following will remove the variable of interest from the list of covariates.

      Code:
      rename x# x(##)       // zero-pad the names so the list sorts as x01-x40
      unab xlist: x01-x40   // expand the range into an explicit list
      
      foreach v of varlist `xlist' {
          // drop the current variable of interest from the covariate pool
          local covariates: subinstr local xlist "`v'" ""
          xpopoisson Mortality_rate `v', controls(rural gee cce fsi ///
              GDP_cap GDP_cap_sqr GHE_cap GHE_cap_sqr `covariates') ///
              selection(cv) vce(cluster country_code)
      }
      Note: No example data, so this code is not tested, but I believe it is correct.
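
      In case the -subinstr- extended macro function is unfamiliar, here is a minimal illustration (with a made-up three-variable list) of how it strips the current variable of interest out of the pool:

      Code:
      local xlist x01 x02 x03
      local covariates: subinstr local xlist "x02" ""
      display "`covariates'"
      This displays x01  x03, i.e. the list with x02 removed.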

      As for dealing with variables that are collinear with the x's, I'm not sure what you want. If you have pairs of variables among x01-x40 that are collinear (i.e., correlated exactly 1 or -1), it is easy enough to remove one from each pair, say, keeping the first only, and using the resulting reduced list of x variables for your -xpopoisson- loop:

      Code:
      corr `xlist'          // pairwise correlations of x01-x40
      matrix M = r(C)       // correlation matrix
      local d = rowsof(M)
      local to_remove
      // scan the upper triangle for perfectly correlated pairs
      forvalues i = 1/`=`d'-1' {
          forvalues j = `=`i'+1'/`d' {
              if abs(M[`i', `j']) == 1 {   // exact +/-1; use a tolerance
                                           // such as > 0.9999 if near-
                                           // collinearity matters
                  local to_remove `to_remove' `:word `j' of `xlist''
              }
          }
      }
      // keep the first member of each pair, drop the rest from the list
      foreach t of local to_remove {
          local xlist: subinstr local xlist "`t'" ""
      }
      (Again, untested.)
      But I don't have a clear sense whether this is what you want. Generally speaking, collinearities are, in practice, more complicated than just two perfectly correlated variables. Usually it is a situation of x_i being a linear combination of several other x_j, and often that collinearity can be broken by removing just one of the variables. But there is no clear basis I can see for choosing which among the collinear variables to remove, and the choice could well have a large effect on your -xpopoisson- results. Indeed, from a set of collinear variables you might end up removing precisely the one that -xpopoisson- would select as being inferentially important.

      Moreover, I don't see the point in this context. One of the things that lasso handles well is collinearity among variables: the penalization process weeds out collinearities for you. So why not just let lasso do its job?
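
      That said, if you do want to prune general linear dependencies, not just perfectly correlated pairs, one possibility is Stata's built-in -_rmcoll- command; a sketch (untested on your data):

      Code:
      * drop from xlist any variable that is a linear combination of the others
      _rmcoll `xlist', forcedrop
      local xlist `r(varlist)'
      But for the reasons above, I would still be inclined to let lasso handle it.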



      • #4
        Dear Clyde, this worked perfectly! Thanks a ton. I have one follow-up question: how do I save and combine the results from each regression? I have been studying your post Looping Regression with Output - Statalist, but I am not able to adapt it for this case. I think the issue is that the results I want to compile include a matrix and scalars: r(table), e(p), e(chi2), and e(k_controls_sel). What do you think?

        Robert



        • #5
          Well, the matrix r(table) has the potential to be troublesome. But at least in the code you have shown, you are doing a simple -xpopoisson- with only one independent variable--and as the covariates, which will differ in number from one lasso to the next, are not included in r(table), it's just a matter of reading off the elements of a 9x1 matrix into variables.

          Code:
          rename x# x(##)       // zero-pad the names so the list sorts
          unab xlist: x01-x40
          
          // one observation per regression: the variable of interest, the
          // 9 rows of r(table), and three e() scalars
          frame create results str32 vble float(b se z pvalue ll ul df crit ///
              eform p chi2 k_controls_sel)
          foreach v of varlist `xlist' {
              local covariates: subinstr local xlist "`v'" ""
              xpopoisson Mortality_rate `v', controls(rural gee cce fsi ///
                  GDP_cap GDP_cap_sqr GHE_cap GHE_cap_sqr `covariates') ///
                  selection(cv) vce(cluster country_code)
              matrix M = r(table)
              local topost ("`v'")
              forvalues i = 1/9 {
                  local topost `topost' (M[`i', 1])
              }
              frame post results `topost' (e(p)) (e(chi2)) (e(k_controls_sel))
          }
          At the end of the code, the data set in frame results will have what you want. You can work with it however you like from there.
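
          For example, a sketch of one way to inspect and export it (the filename is just a placeholder):

          Code:
          frame change results
          list, clean noobs
          export delimited using "lasso_results.csv", replace
          frame change default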

          Added thought: given that your -xpopoisson- has only a single independent variable listed, e(p) will be the same as the pvalue you get from r(table). There is no need to store both. In terms of the above code, it would be easiest to get rid of p in the -frame create- command and (e(p)) in the -frame post- command. However, if you plan to move to a larger model, then that equality will no longer hold, so you would have to undo this particular code change, which means it's not worth doing in the first place.



          • #6
            Thanks so much Clyde! This is extremely helpful. I was not familiar with the frame post command. I really appreciate your added comments on both of your replies. Happy 2023!!!
