  • Matching for treatment effects after imputation

    Dear Stata friends.

    I have a dataset of exposed (n=1,500) and unexposed (n=95,000) patients, on which I have used the “kmatch” command with Mahalanobis distance matching to estimate treatment effects from the matched observations. However, one of the matching variables has some missing data. I am familiar with the “mi” imputation commands and have used them to generate imputations.
    What I can’t find is a way to match after imputation, and what I ultimately want is the treatment effects.
    Does anyone know a way to manage this?

    Regards

    Fredric Parenmark

  • #2
    It looks like -mi estimate:- won't work with kmatch, so you'll have to find an alternative. You could run mi estimate: probit .... and then use the prediction from that model as a propensity score for matching.


    • #3
      Originally posted by George Ford
      it looks like -mi estimate:- won't work with kmatch. You'll have to find an alternative. You could do mi estimate: probit .... and then use that as a propensity score for matching.
      Thank you. I'll explore the probit model.


      • #4
        List of commands compatible with mi estimate
        HTML Code:
        https://www.stata.com/manuals/miestimation.pdf


        • #5
          Something like this might work, but I'd study it more (I'm just winging it).

          Code:
          webuse mheart5 , clear
          set seed 123456
          g treat = runiform()<0.2
          mi set mlong
          mi register imputed age bmi
          mi impute mvn age bmi = attack smokes hsgrad female, add(10)
          mi estimate, saving(miest, replace): probit treat smokes age bmi hsgrad female
          mi predict psim using miest, xb // pscore with imputation
          probit treat smokes age bmi hsgrad female
          predict psnoim, xb  // pscore ignoring imputation
          * psmatch2 is user-written: ssc install psmatch2
          psmatch2 treat smokes age bmi hsgrad female, outcome(attack) // no imputation
          psmatch2 treat, outcome(attack) pscore(psnoim)    // confirm the same with forced pscore
          psmatch2 treat, outcome(attack) pscore(psim) // estimate with imputation


          • #6
            On further investigation, I'm pretty sure that's not going to be correct, but it may be useful as a starting point.


            • #7
              Look at this. It has a discussion of doing things that mi estimate does not support.
              HTML Code:
              https://errickson.net/stata2/multiple-imputation.html


              • #8
                mi passive might help.

                help mi passive

                and look at "alternatives to mi passive"
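
                A minimal sketch of what mi passive does, assuming the same mheart5 example data used elsewhere in this thread. The variable name bmi_sq and the choice of 5 imputations are illustrative only. The point is that a passive variable is rebuilt within each imputed dataset rather than computed once from the incomplete data, and mi xeq runs an ordinary command on selected imputations.

                ```stata
                webuse mheart5, clear
                set seed 123456
                mi set mlong
                mi register imputed age bmi
                mi impute mvn age bmi = attack smokes hsgrad female, add(5)

                * passive variable: recomputed from bmi within each imputed dataset
                mi passive: generate bmi_sq = bmi^2

                * run an ordinary command on the original data (m=0) and on imputations 1 and 5
                mi xeq 0 1 5: summarize bmi bmi_sq
                ```

                This does not solve the matching problem on its own, but it shows how to keep derived variables consistent across imputations before running a command on each one.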


                • #9
                  Try this. Note that the formulas come from the website cited in a comment inside the program.

                  This runs kmatch on each imputed sample and then constructs the pooled tests using Rubin's formulas.

                  I'd definitely check it to make sure the formulas are correctly implemented here.
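
                  For checking, the pooling steps the program is meant to implement can be written out as follows. This is a sketch in notation that is not in the original post: m is the number of imputed datasets (`N' in the code), \hat\theta_i and SE_i are the ATT and its standard error in imputed dataset i, \nu is the complete-data degrees of freedom (the code uses the mean of e(df_r)), and \theta_0 = 0 is the null value. The code's local names are given in the comments.

                  ```latex
                  \begin{aligned}
                  \bar\theta &= \frac{1}{m}\sum_{i=1}^{m}\hat\theta_i                      && \text{(att)}\\
                  V_W &= \frac{1}{m}\sum_{i=1}^{m} SE_i^{2}                                 && \text{(VW)}\\
                  V_B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat\theta_i-\bar\theta\bigr)^{2} && \text{(VB)}\\
                  V_T &= V_W + V_B + \frac{V_B}{m}                                          && \text{(VT)}\\
                  \lambda &= \frac{V_B + V_B/m}{V_T}                                        && \text{(L)}\\
                  df_{0} &= \frac{m-1}{\lambda^{2}}                                         && \text{(df0)}\\
                  df_{1} &= \frac{\nu+1}{\nu+3}\,\nu\,(1-\lambda)                           && \text{(df1)}\\
                  df_{A} &= \frac{df_{0}\,df_{1}}{df_{0}+df_{1}}                            && \text{(dfA)}\\
                  W &= \frac{(\bar\theta-\theta_0)^{2}}{V_T}                                && \text{(WALD)}
                  \end{aligned}
                  ```

                  W is a squared t statistic, so it is referred to an F distribution with 1 and df_A degrees of freedom; equivalently, \sqrt{W} is referred to a t distribution with df_A degrees of freedom.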

                  Probably could clean up the code a bit.

                  The results don't seem very sensitive to the number of imputed datasets, but imputations are cheap, so it costs little to use many. This may also vary by dataset.

                  Code:
                  use https://www.stata-press.com/data/r17/mheart5.dta, clear
                  set seed 123456
                  
                  ** see what's missing
                  summ
                  
                  ** rough check on effect size
                  reg attack smokes hsgrad female
                  * kmatch is user-written: ssc install kmatch
                  kmatch md smokes age bmi hsgrad female (attack), att
                  matrix M = r(table)  // store raw-data results (row 1 = estimate, 2 = SE, 3 = t, 4 = p)
                  
                  ** impute
                  mi set flong  // creates entire datasets with imputed values
                  mi register imputed age bmi
                  mi impute mvn age bmi = attack hsgrad female, add(100)  //creates 100 imputed datasets
                  
                  capture program drop mikmatch
                  program mikmatch
                  tempvar b
                  qui summ _mi_m
                  local N = r(max)
                  matrix R = J(`N',3,.)
                  matrix colnames R = att V df
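                  forv i = 1/`N' {
                      * match within imputed dataset i; keep the ATT, its variance, and the df
                      qui kmatch md smokes age bmi hsgrad female (attack) if _mi_m==`i', att
                      matrix A = r(table)
                      matrix R[`i',1] = A[1,1]      // ATT estimate
                      matrix R[`i',2] = A[2,1]^2    // squared SE = within-imputation variance
                      matrix R[`i',3] = e(df_r)     // complete-data degrees of freedom
                  }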
                  capture drop R*
                  svmat R , names(matcol)
                  capture drop att V df
                  ren R* *
                  *https://bookdown.org/mwheymans/bookmi/rubins-rules.html
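                  qui summ att
                  local att = r(mean)                          // pooled ATT: mean across imputations
                  qui summ df
                  local df = r(mean)                           // complete-data df, averaged across imputations
                  qui summ V
                  local VW = r(mean)                           // within-imputation variance
                  qui g `b' = (att-`att')^2
                  qui summ `b'
                  local VB = r(sum)/(`N'-1)                    // between-imputation variance
                  local VT = `VW'+`VB'+`VB'/`N'                // total variance
                  local L = (`VB'+`VB'/`N')/`VT'               // share of variance due to missing data
                  local df0 = (`N'-1)/(`L'^2)                  // large-sample df
                  local df1 = (`df'+1)/(`df'+3)*`df'*(1-`L')   // observed-data df
                  local dfA = (`df0'*`df1')/(`df0'+`df1')      // adjusted df
                  local WALD = ((`att'-0)^2)/`VT'  // null is 0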
                  di _dup(50) "-"
                  di _col(10) "RESULTS (Datasets = `N')"
                  di _dup(50) "-"
                  di _col(15) "Raw Data"  _col(30) "Imputed"
                  di _dup(50) "-"
                  di "ATT   = " _col(15) %5.3f M[1,1] _col(30) %5.3f `att'
                  di "WALDt = " _col(15) %5.3f M[3,1] _col(30) %5.3f `WALD'
                  * WALD is a squared t statistic, so its two-sided p-value is the F(1, dfA) upper tail
                  di "Prob  = " _col(15) %5.3f M[4,1] _col(30) %5.3f  Ftail(1,`dfA',`WALD')
                  di _dup(50) "-"
                  hist att
                  end
                  mikmatch


                  • #10

                    Code:
                     di "WALDt = " _col(15) %5.3f M[3,1]^2 _col(30) %5.3f `WALD'
                    will make the test statistics more comparable, because the pooled WALD value is already a squared t statistic.
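
                    A quick check of why squaring works, using arbitrary illustrative values for t and the degrees of freedom (they are not taken from this thread's data): the upper tail of F(1, df) at t^2 equals the two-sided t p-value, so a squared statistic should be looked up with Ftail() rather than ttail().

                    ```stata
                    * illustrative values only
                    local t  = 2.1
                    local df = 40

                    * two-sided p-value from the t distribution
                    di "2*ttail(df, |t|)   = " %8.6f 2*ttail(`df', abs(`t'))

                    * the same p-value from the F distribution, using the squared statistic
                    di "Ftail(1, df, t^2)  = " %8.6f Ftail(1, `df', `t'^2)
                    ```

                    The two displayed values should agree to numerical precision.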


                    • #11
                      This version adds teffects nnmatch (nearest-neighbor matching with the Mahalanobis metric) alongside kmatch.

                      Code:
                      capture program drop mikmatch
                      program mikmatch
                      tempvar b b2
                      qui summ _mi_m
                      local N = r(max)
                      matrix R = J(`N',5,.)
                      matrix colnames R = att V df attt Vt
                      forv i = 1/`N' {
                          qui kmatch md smokes age bmi hsgrad female (attack) if _mi_m==`i', att
                          matrix A = r(table)
                          matrix R[`i',1] = A[1,1]
                          matrix R[`i',2] = A[2,1]^2
                          matrix R[`i',3] = e(df_r)
                      }
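                      * raw-data teffects estimate (m = 0 holds the original data), used in the results table
                      qui teffects nnmatch (attack  age bmi hsgrad female) (smokes) if _mi_m==0, atet metric(mahal)
                      matrix T = r(table)
                      forv i = 1/`N' {
                          qui teffects nnmatch (attack  age bmi hsgrad female) (smokes) if _mi_m==`i', atet metric(mahal)
                          matrix Z = r(table)
                          matrix R[`i',4] = Z[1,1]
                          matrix R[`i',5] = Z[2,1]^2
                      }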
                      capture drop R*
                      svmat R , names(matcol)
                      capture drop att V df attt Vt
                      ren R* *
                      *https://bookdown.org/mwheymans/bookmi/rubins-rules.html
                      qui summ att
                      local att = r(mean)
                      qui summ df
                      local df = r(mean)
                      qui summ V
                      local VW = r(mean)
                      qui g `b' = (att-`att')^2
                      qui summ `b'
                      local VB = r(sum)/(`N'-1)
                      local VT = `VW'+`VB'+`VB'/`N'
                      local L = (`VB'+`VB'/`N')/`VT'
                      local df0 = (`N'-1)/(`L'^2)
                      local df1 = (`df'+1)/(`df'+3)*`df'*(1-`L')
                      local dfA = (`df0'*`df1')/(`df0'+`df1')
                      local WALD = ((`att'-0)^2)/`VT'  // null is 0
                      
                      qui summ attt
                      local attt = r(mean)
                      qui summ Vt
                      local VW = r(mean)
                      qui g `b2' = (attt-`attt')^2
                      qui summ `b2'
                      local VB = r(sum)/(`N'-1)
                      local VT = `VW'+`VB'+`VB'/`N'
                      local L = (`VB'+`VB'/`N')/`VT'
                      local dfT = (`N'-1)/(`L'^2)   // teffects reports z statistics, so use the large-sample df
                      local WALD2 = ((`attt'-0)^2)/`VT'  // null is 0
                      
                      
                      * M holds the raw-data kmatch results stored earlier with matrix M = r(table)
                      di _dup(70) "-"
                      di _col(25) "RESULTS (Datasets = `N')"
                      di _dup(70) "-"
                      di _col(22) "KMATCH" _col(52) "TEFFECTS"
                      di _col(15) "Raw Data"  _col(30) "Imputed" _col(45) "Raw Data" _col(60) "Imputed"
                      di _dup(70) "-"
                      di "ATT   = " _col(15) %5.3f M[1,1] _col(30) %5.3f `att'  _col(45) %5.3f T[1,1]   _col(60) %5.3f `attt'
                      di "WALDt = " _col(15) %5.3f M[3,1]^2 _col(30) %5.3f `WALD' _col(45) %5.3f T[3,1]^2 _col(60) %5.3f `WALD2'
                      * the WALD values are squared statistics, so two-sided p-values come from the F(1, df) upper tail
                      di "Prob  = " _col(15) %5.3f M[4,1] _col(30) %5.3f  Ftail(1,`dfA',`WALD') _col(45) %5.3f T[4,1] _col(60) %5.3f  Ftail(1,`dfT',`WALD2')
                      di _dup(70) "-"
                      twoway kdensity att || kdensity attt
                      end
                      mikmatch
                      Last edited by George Ford; 18 Feb 2023, 13:54.
