
  • Looping through local of covariates for every combination of the list

    Hello Statalisters,

    I have an interesting problem that I figure someone can help me with. I'm looking to estimate a propensity score model and am attempting to design a simulation that tells me which specification of covariates achieves the best balance. My treatment indicator is binary {0,1} and I have a list of covariates x1,x2,...,xn. I would like to loop through my list of covariates and regress every possible combination of the covariates onto treatment condition; I will then want to store the R2 along with the covariate specification and isolate the covariate specification that achieves the highest R2.

    I hope the below, while completely inaccurate code (more like pseudocode), helps clarify my thinking.


    Code:
    sysuse auto, clear
    
    local emptyx 
    local covars foreign length rep78 price
    foreach var of local covars{
             reg mpg `emptyx' `covars', r
             local R2 = e(r2) + `covars'
    }
    And then some way of identifying the model with the highest R2. Any thoughts people have would be appreciated!











  • #2
    The command -allpossible-, available via -ssc install allpossible-, will estimate a regression model for all possible subsets of a list of covariates. In your situation, taking what I think you mean rather than what you literally say, I would think that you want to (logistic) regress the 0/1 treatment on all possible subsets of your covariates. -allpossible- will provide a convenient display of your models and such summary statistics as you request. Although I'm not a fan of the particular pseudo-R2 that -logit- offers, it's what's conveniently available. I don't see any easy way to get -allpossible- to store your results, but the display it provides may offer what you want:
    Code:
    sysuse auto
    allpossible logit foreign mpg headroom weight price, eclass(r2_p)



    • #3
      Thanks to Mike for the mention, but I'd flag that allpossible (SSC) documents its own limitations:

      The number of models fitted must not exceed the number of observations.

      The upper limit of 6 is not a matter of principle. The limit may be changed in a future
      version.

      Naturally, this command does not purport to replace the detailed scrutiny of individual
      models or to offer an unproblematic way of finding "best" models. Its main use may lie in
      demonstrating that several models exist within many projects possessing roughly equal
      merit as measured by omnibus statistics.
      and I am confident that it will never be extended or otherwise revised by me.

      Instead, consider tuples (SSC). tuples is much more general in producing multiple lists from master lists with many more objects than 1 to 6, and not just variable names.

      You're expected to take responsibility for what you do with any results, including their collation and comparison.
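
A minimal sketch of how tuples might be combined with a loop for this kind of search, using the auto data and the covariates from #2 rather than the original poster's variables. It assumes tuples has been installed from SSC; the macro names best_r2 and best_spec are made up for illustration.

```stata
* Assumes tuples is installed: ssc install tuples
sysuse auto, clear
local covars mpg headroom weight price

* tuples leaves behind locals tuple1, tuple2, ... and ntuples (the count)
tuples `covars'

local best_r2 = -1
local best_spec
forvalues i = 1/`ntuples' {
    quietly logit foreign `tuple`i''
    * Keep the specification with the largest pseudo-R2 seen so far
    if e(r2_p) > `best_r2' {
        local best_r2 = e(r2_p)
        local best_spec `tuple`i''
    }
}
display "Best pseudo-R2: " %6.4f `best_r2' "  with: `best_spec'"
```

On a fixed estimation sample, an unadjusted R2 or pseudo-R2 cannot fall when a covariate is added, so a search like this will tend to pick the largest model unless a penalized criterion is used instead.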



      • #4
        Thanks both! I've only been using Statalist for a short while but am surprised by how active the community is - it's fantastic. I tried out -allpossible-, Mike, but as Nick suspected it's not versatile enough for my needs. In the end I went with the (inevitably clumsy, yet functional) snippet below using the excellent -tuples- command.



        Code:
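          * Create an empty master file that will collect one row per model
          preserve 
            clear
            gen tuple =""
            gen r2 =.
            save "${dtadrive}/modelspec.dta", replace
            restore
            
            * tuples stores each subset in locals tuple1, tuple2, ... and the count in ntuples
            tuples x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14, display
            * 14 covariates give 2^14 - 1 = 16383 non-empty subsets
            forvalues i=1/`ntuples' {
                qui: use "${dtadrive}/MergedAndCleanProj4.2.dta", clear
                qui: tempfile modelspec`i'
                qui reg Treated `tuple`i'', r
                local r2 = e(r2)
                * Reduce the data to a single row holding the specification and its R2
                qui: gen tuple = "`tuple`i''"
                qui: gen r2 = `r2'
                qui: keep in 1
                qui: keep tuple r2
                qui: save `modelspec`i''
                * Append that row to the master file
                qui: use "${dtadrive}/modelspec.dta", clear
                qui: append using `modelspec`i''
                qui: save "${dtadrive}/modelspec.dta", replace
                di "`i' iterations complete"
            }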
        I'm sure there's a neater way of doing this, using a master tempfile and bringing that in each time instead of my master Stata binary file, but alas!




        • #5
          Code:
          help postfile
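
A rough sketch of what the postfile route could look like, shown on the auto data rather than the poster's files. It assumes tuples is installed from SSC; the names handle, results, spec and r2 are illustrative, and str80 is an arbitrary width that would need widening for longer covariate lists.

```stata
* Assumes tuples is installed: ssc install tuples
sysuse auto, clear
tuples mpg headroom weight price

* Open a results file with one string and one numeric column
tempname handle
tempfile results
postfile `handle' str80 spec double r2 using `results'

forvalues i = 1/`ntuples' {
    quietly regress foreign `tuple`i'', vce(robust)
    * Write one row per model; the data in memory stay untouched
    post `handle' ("`tuple`i''") (e(r2))
}
postclose `handle'

* Load the collected results and sort with the highest R2 first
use `results', clear
gsort -r2
list in 1/5
```

Because post writes to a separate file, the estimation data are loaded only once instead of being reloaded on every pass through the loop.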

