Looping through local of covariates for every combination of the list

Chris Larkin

Join Date: Apr 2016

Posts: 296
#1

22 Sep 2016, 06:41

Hello Statalisters,

I have an interesting problem that I figure someone can help me with. I'm looking to estimate a propensity score model and am attempting to design a simulation that tells me which specification of covariates achieves the best balance. My treatment indicator is binary {0,1} and I have a list of covariates x1, x2, ..., xn. I would like to loop through my list of covariates and regress every possible combination of the covariates onto treatment condition; I will then want to store the R² along with the covariate specification and isolate the covariate specification that achieves the highest R².

I hope the below, while completely inaccurate code (more like pseudocode), helps clarify my thinking.

Code:

sysuse auto, clear
local emptyx
local covars foreign length rep78 price
foreach var of local covars {
    reg mpg `emptyx' `covars', r
    local R2 = e(r2) + `covars'
}

And then some way of identifying the model with the highest R². Any thoughts people have would be appreciated!
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#2

22 Sep 2016, 08:06

The command -allpossible-, available via -ssc install allpossible-, will estimate a regression model for all possible subsets of a list of covariates. In your situation, taking what I think you mean rather than what you literally say, I would think that you want to (logistic) regress the 0/1 treatment on all possible subsets of your covariates. -allpossible- will provide a convenient display of your models and such summary statistics as you request. Although I'm not a fan of the particular pseudo-R² that -logit- offers, it's what's conveniently available. I don't see any easy way to get -allpossible- to store your results, but the display it provides may offer what you want:

Code:

sysuse auto
allpossible logit foreign mpg headroom weight price, eclass(r2_p)
Nick Cox

Join Date: Mar 2014

Posts: 35698
#3

22 Sep 2016, 08:19

Thanks to Mike for the mention, but I'd flag that allpossible (SSC) documents its own limitations:

The number of models fitted must not exceed the number of observations.

The upper limit of 6 is not a matter of principle. The limit may be changed in a future
version.

Naturally, this command does not purport to replace the detailed scrutiny of individual
models or to offer an unproblematic way of finding "best" models. Its main use may lie in
demonstrating that several models exist within many projects possessing roughly equal
merit as measured by omnibus statistics.

and I am confident that it will never be extended or otherwise revised by me.

Instead, consider tuples (SSC). tuples is much more general in producing multiple lists from master lists with many more objects than 1 to 6, and not just variable names.

You're expected to take responsibility for what you do with any results, including their collation and comparison.
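
For readers who have not used tuples, a minimal sketch of what it leaves behind may help. It assumes tuples is installed (ssc install tuples) and uses auto.dta with three arbitrarily chosen variable names purely for illustration; none of this comes from the posts above.

```stata
* Illustration only: the three names below are arbitrary picks from auto.dta
sysuse auto, clear
tuples mpg weight price

* tuples stores each non-empty subset in a local macro tuple1, tuple2, ...
* and the number of subsets in the local macro ntuples (2^3 - 1 = 7 here)
display "`ntuples' subsets"
forvalues i = 1/`ntuples' {
    display "tuple`i': `tuple`i''"
}
```

Each tuple macro is just a space-separated list, so it can be dropped straight into a varlist, as in regress y `tuple`i''.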

Chris Larkin

Join Date: Apr 2016
Posts: 296
#4

22 Sep 2016, 11:04

Thanks both! I've only been using Statalist for a short while but am surprised by how active the community is; it's fantastic. I tried out -allpossible-, Mike, but as Nick suspected it's not versatile enough for my needs. In the end I went with the (inevitably clumsy, yet functional) snippet below using the excellent -tuples- command.

Code:

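    * Create an empty master file that will collect one (tuple, r2) row per model
    preserve
    clear
    gen tuple = ""
    gen r2 = .
    save "${dtadrive}/modelspec.dta", replace
    restore

    * 14 covariates give 2^14 - 1 = 16383 non-empty subsets; tuples stores them
    * in locals tuple1, tuple2, ... and also leaves the count in local ntuples
    tuples x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14, display
    forvalues i=1/16383 {
        qui: use "${dtadrive}/MergedAndCleanProj4.2.dta", clear
        qui: tempfile modelspec`i'
        qui reg Treated `tuple`i'', r
        local r2 = e(r2)
        * Reduce the data to a single row holding this specification and its R²
        qui: gen tuple = "`tuple`i''"
        qui: gen r2 = `r2'
        qui: keep in 1
        qui: keep tuple r2
        qui: save `modelspec`i''
        * Append that row to the master file on disk
        qui: use "${dtadrive}/modelspec.dta", clear
        qui: append using `modelspec`i''
        qui: save "${dtadrive}/modelspec.dta", replace
        di "`i' iterations complete"
    }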

I'm sure there's a neater way of doing this, using a master tempfile and bringing that in each time instead of my master Stata binary file, but alas!


Nick Cox

Join Date: Mar 2014

Posts: 35698
#5

22 Sep 2016, 11:21

Code:

help postfile
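
One way that hint might be put to work, as a sketch only: postfile writes one row of results per model to a separate file, so the data never have to be reloaded or appended inside the loop. The example below is an assumption-laden stand-in, not the code from #4: it uses auto.dta, four covariates borrowed from #2, foreign in place of Treated, and a str80 width for the specification string that would need to be wide enough for the real variable names.

```stata
* Sketch only: auto.dta and these four covariates stand in for the real data
sysuse auto, clear
local covars mpg headroom weight price
tuples `covars'

* Open a results file with one string and one numeric column
tempname memhold
tempfile results
postfile `memhold' str80 spec double r2 using `results'

* Fit one model per subset and post its specification and R²
forvalues i = 1/`ntuples' {
    quietly regress foreign `tuple`i'', vce(robust)
    post `memhold' ("`tuple`i''") (e(r2))
}
postclose `memhold'

* Load the collected results and sort so the highest R² comes first
use `results', clear
gsort -r2
list in 1/5
```

Bear in mind that plain R² cannot fall when a covariate is added to a linear regression, so the full specification will always rank first on this criterion; posting e(r2_a) instead is one possible alternative.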