Matching for treatment effects after imputation

Fredric Parenmark

Join Date: Jun 2020

Posts: 4
#1

16 Feb 2023, 14:29

Dear Stata friends.

I have a dataset with exposed (n=1,500) and unexposed (n=95,000) patients, for which I have used the “kmatch” command for Mahalanobis distance matching to estimate treatment effects based on the matched observations. There are, however, some missing data in one of the matching variables. I am familiar with the “mi” imputation command and have used it to generate imputations.
However, I can’t find a way to match after imputation (and I would still like to obtain treatment effect estimates).
Does anyone know a way to manage this?

Regards

Fredric Parenmark
George Ford

Join Date: Aug 2014

Posts: 3123
#2

16 Feb 2023, 16:19

It looks like -mi estimate:- won't work with kmatch. You'll have to find an alternative. You could run mi estimate: probit .... and then use the prediction as a propensity score for matching.
Fredric Parenmark

Join Date: Jun 2020

Posts: 4
#3

17 Feb 2023, 05:08

Originally posted by George Ford

It looks like -mi estimate:- won't work with kmatch. You'll have to find an alternative. You could run mi estimate: probit .... and then use the prediction as a propensity score for matching.

Thank you. I'll explore the probit model.
George Ford

Join Date: Aug 2014

Posts: 3123
#4

17 Feb 2023, 07:58

Here is the list of commands compatible with mi estimate:

HTML Code:

https://www.stata.com/manuals/miestimation.pdf
George Ford

Join Date: Aug 2014

Posts: 3123
#5

17 Feb 2023, 08:16

Something like this might work, but I'd study it more (I'm just winging it).

Code:

webuse mheart5 , clear
set seed 123456
g treat = runiform()<0.2
mi set mlong
mi register imputed age bmi
mi impute mvn age bmi = attack smokes hsgrad female, add(10)
mi estimate, saving(miest, replace): probit treat smokes age bmi hsgrad female
mi predict psim using miest, xb // pscore with imputation
probit treat smokes age bmi hsgrad female
predict psnoim, xb  //pscore ignoring imputation
psmatch2 treat smokes age bmi hsgrad female, outcome(attack) // no imputation
psmatch2 treat , outcome(attack) pscore(psnoim)    // confirm the same with forced pscore
psmatch2 treat, outcome(attack) pscore(psim) // estimate with imputation

George Ford

Join Date: Aug 2014

Posts: 3123
#6

17 Feb 2023, 10:40

On further investigation, I'm pretty sure that's not going to be correct, but it may be useful as a starting point.
George Ford

Join Date: Aug 2014

Posts: 3123
#7

17 Feb 2023, 10:41

Look at this. It has a discussion of doing things that mi estimate does not support.

HTML Code:

https://errickson.net/stata2/multiple-imputation.html
George Ford

Join Date: Aug 2014

Posts: 3123
#8

17 Feb 2023, 11:04

mi passive might help. See

help mi passive

and look at "Alternatives to mi passive".

George Ford

Join Date: Aug 2014

Posts: 3123
#9

18 Feb 2023, 10:47

Try this. Note that the formulas are from the website cited in the program.

This runs kmatch on each imputed dataset and then constructs the pooled estimate and test using Rubin's formulas.

I'd definitely check it to make sure the formulas are correctly implemented here.

The code could probably be cleaned up a bit.

The results don't seem highly sensitive to the number of imputed datasets, but it is cheap to create many. This may also vary by dataset.
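
For reference, here is a sketch of the pooling formulas (Rubin's rules with the adjusted degrees of freedom) as I read them from the program in this post. The notation is my own assumption rather than anything produced by kmatch: m is the number of imputed datasets (`N' in the program), \hat{\theta}_i and SE_i are the ATT and its standard error from dataset i, \nu is the complete-data degrees of freedom (the average of e(df_r) in the program), and \theta_0 is the null value (0 here).

```latex
\begin{align*}
\bar{\theta} &= \frac{1}{m}\sum_{i=1}^{m}\hat{\theta}_i, &
V_W &= \frac{1}{m}\sum_{i=1}^{m} SE_i^{2}, \\
V_B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{\theta}_i-\bar{\theta}\bigr)^{2}, &
V_T &= V_W + V_B + \frac{V_B}{m}, \\
\lambda &= \frac{V_B + V_B/m}{V_T}, &
df_{\mathrm{old}} &= \frac{m-1}{\lambda^{2}}, \\
df_{\mathrm{obs}} &= \frac{\nu+1}{\nu+3}\,\nu\,(1-\lambda), &
df_{\mathrm{adj}} &= \frac{df_{\mathrm{old}}\,df_{\mathrm{obs}}}{df_{\mathrm{old}}+df_{\mathrm{obs}}}, \\
W &= \frac{(\bar{\theta}-\theta_0)^{2}}{V_T}.
\end{align*}
```

W is referred to an F distribution with 1 and df_adj degrees of freedom; equivalently, its square root is referred to a t distribution with df_adj degrees of freedom.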

Code:

use https://www.stata-press.com/data/r17/mheart5.dta, clear
set seed 123456

** see what's missing
summ

** rough check on effect size
reg attack smokes hsgrad female 
kmatch md smokes age bmi hsgrad female (attack), att
matrix M = r(table)  //store raw results

** impute
mi set flong  // creates entire datasets with imputed values
mi register imputed age bmi
mi impute mvn age bmi = attack hsgrad female, add(100)  //creates 100 imputed datasets

capture program drop mikmatch
program mikmatch
tempvar b
qui summ _mi_m
local N = r(max)
matrix R = J(`N',3,.)
matrix colnames R = att V df
forv i = 1/`N' {
    qui kmatch md smokes age bmi hsgrad female (attack) if _mi_m==`i', att
    matrix A = r(table)
    matrix R[`i',1] = A[1,1]
    matrix R[`i',2] = A[2,1]^2
    matrix R[`i',3] = e(df_r)
}
capture drop R*
svmat R , names(matcol)
capture drop att V df
ren R* *
*https://bookdown.org/mwheymans/bookmi/rubins-rules.html
qui summ att
local att = r(mean)
qui summ df 
local df = r(mean)
qui summ V
local VW = r(mean)
qui g `b' = (att-`att')^2
qui summ `b'
local VB = r(sum)/(`N'-1)
local VT = `VW'+`VB'+`VB'/`N'
local L = (`VB'+`VB'/`N')/`VT'
local df0 = (`N'-1)/(`L'^2)
local df1 = (`df'+1)/(`df'+3)*`df'*(1-`L')
local dfA = (`df0'*`df1')/(`df0'+`df1')
local WALD = ((`att'-0)^2)/`VT'  // null is 0
di _dup(50) "-"
di _col(10) "RESULTS (Datasets = `N')"
di _dup(50) "-"
di _col(15) "Raw Data"  _col(30) "Imputed"
di _dup(50) "-"
di "ATT   = " _col(15) %5.3f M[1,1] _col(30) %5.3f `att'
di "WALDt = " _col(15) %5.3f M[3,1] _col(30) %5.3f `WALD'
di "Prob  = " _col(15) %5.3f M[4,1] _col(30) %5.3f  Ftail(1,`dfA',`WALD')  // WALD is a squared statistic, so use F(1,df) instead of ttail()
di _dup(50) "-"
hist att
end
mikmatch

George Ford

Join Date: Aug 2014

Posts: 3123
#10

18 Feb 2023, 13:23

Code:

di "WALDt = " _col(15) %5.3f M[3,1]^2 _col(30) %5.3f `WALD'

will make the test-stats more comparable.
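
The squaring matters because the pooled statistic in the program is a squared ratio, (att/SE)^2, whereas r(table) reports the unsquared t. As a quick check, here is a minimal sketch using made-up values (t = 2.1 and 30 degrees of freedom are illustrative assumptions, not results from mheart5). The two-sided p-value from the t statistic and the upper-tail p-value from its square under F(1, df) should agree:

```stata
* Illustrative values only (not taken from the mheart5 results)
local t  = 2.1
local df = 30

* Two-sided p-value from the t statistic
display 2*ttail(`df', abs(`t'))

* Upper-tail p-value from the squared statistic, F(1, df)
display Ftail(1, `df', `t'^2)
```

Both lines should display the same number, which is why a squared statistic has to be compared with a squared statistic.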
George Ford

Join Date: Aug 2014
Posts: 3123

#11

18 Feb 2023, 13:52

This version adds teffects nnmatch (nearest-neighbor matching with the Mahalanobis metric) alongside kmatch.

Code:

capture program drop mikmatch
program mikmatch
tempvar b b2
qui summ _mi_m
local N = r(max)
matrix R = J(`N',5,.)
matrix colnames R = att V df attt Vt
forv i = 1/`N' {
    qui kmatch md smokes age bmi hsgrad female (attack) if _mi_m==`i', att
    matrix A = r(table)
    matrix R[`i',1] = A[1,1]
    matrix R[`i',2] = A[2,1]^2
    matrix R[`i',3] = e(df_r)
}
forv i = 1/`N' {
    qui teffects nnmatch (attack  age bmi hsgrad female) (smokes) if _mi_m==`i', atet metric(mahal)
    matrix Z = r(table)
    matrix R[`i',4] = Z[1,1]
    matrix R[`i',5] = Z[2,1]^2
}
capture drop R*
svmat R , names(matcol)
capture drop att V df attt Vt
ren R* *
*https://bookdown.org/mwheymans/bookmi/rubins-rules.html
qui summ att
local att = r(mean)
qui summ df
local df = r(mean)
qui summ V
local VW = r(mean)
qui g `b' = (att-`att')^2
qui summ `b'
local VB = r(sum)/(`N'-1)
local VT = `VW'+`VB'+`VB'/`N'
local L = (`VB'+`VB'/`N')/`VT'
local df0 = (`N'-1)/(`L'^2)
local df1 = (`df'+1)/(`df'+3)*`df'*(1-`L')
local dfA = (`df0'*`df1')/(`df0'+`df1')
local WALD = ((`att'-0)^2)/`VT'  // null is 0

qui summ attt
local attt = r(mean)
qui summ Vt
local VW = r(mean)
qui g `b2' = (attt-`attt')^2
qui summ `b2'
local VB = r(sum)/(`N'-1)
local VT = `VW'+`VB'+`VB'/`N'
local L = (`VB'+`VB'/`N')/`VT'
local WALD2 = ((`attt'-0)^2)/`VT'  // null is 0


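* M and T are not created inside this program. M is r(table) from kmatch on the raw data;
* T is assumed to be r(table) from the same teffects nnmatch call on the raw data.
* Both must be stored before -mi set-.
di _dup(70) "-"
di _col(25) "RESULTS (Datasets = `N')"
di _dup(70) "-"
di _col(22) "KMATCH" _col(52) "TEFFECTS"
di _col(15) "Raw Data"  _col(30) "Imputed" _col(45) "Raw Data" _col(60) "Imputed"
di _dup(70) "-"
di "ATT   = " _col(15) %5.3f M[1,1] _col(30) %5.3f `att'  _col(45) %5.3f T[1,1]   _col(60) %5.3f `attt'
di "WALDt = " _col(15) %5.3f M[3,1]^2 _col(30) %5.3f `WALD' _col(45) %5.3f T[3,1]^2 _col(60) %5.3f `WALD2'
* WALD and WALD2 are squared statistics, so use F(1,df); the kmatch df (dfA) is reused for teffects
di "Prob  = " _col(15) %5.3f M[4,1] _col(30) %5.3f  Ftail(1,`dfA',`WALD') _col(45) %5.3f T[4,1] _col(60) %5.3f  Ftail(1,`dfA',`WALD2')
di _dup(70) "-"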
twoway kdensity att || kdensity attt
end
mikmatch

Last edited by George Ford; 18 Feb 2023, 13:54.
