
  • psmatch2 creating a table of before and after PSM

    Question: How do I construct a table showing the number of matches, i.e. the number of matched treated units relative to the number of matched controls, to assess covariate balance?

    [Attached image: Capture.PNG]


    Unfortunately, this is part of neither the psmatch2 nor the pstest output, so one has
    to construct the tables oneself.

    David Radwin Melissa Garrido Oyvind Snilsberg
    1. May I confirm this is correct? I have come up with the following code (using dummy data)
    webuse cattaneo2
    //Research question: Does alcohol affect bweight

    **Creating macros
    global treatment alcohol //intervention/treatment
    global ylist bweight //outcome
    global xlist mage medu msmoke //variables to match on t

    //Ensuring replication
    //Set the seed and sort in random order so the results can be replicated

    set seed 1234
    gen sort_id = runiform()
    sort sort_id

    //Estimate the propensity score and match using psmatch2
    psmatch2 $treatment $xlist, outcome($ylist) neighbor(3) bw(0.06) common logit

    //Drawing graphs

    pstest $xlist, treated($treatment) both

    This is where I would appreciate your help:
    //Creating tables to show matched treatments to controls

    //Drop observations with missing _weight; matched observations have _weight > 0
    drop if _weight == .

    //Trying to summarize the number of matched treatment and controls

    tabstat $treatment=1 [aw=_weight], by($xlist)

    This returns an error with too many weights.

    I understand that _weight > 0 identifies a control that has been matched.

    _______
    Also, what are your thoughts on Stephen Porter's website, which quotes someone else as saying that psmatch2 shouldn't be used? I was rather happy with psmatch2.
    https://stephenporter.org/understand...atas-psmatch2/
    Last edited by Denise Vella; 22 Feb 2023, 10:39.

  • #2
    psmatch2 is from SSC, as you are asked to explain (FAQ Advice #12). The matched sample is identified by nonmissing values of the generated "_weight" variable.

    Code:
    webuse cattaneo2, clear
    psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, out(bweight) ate logit
    *FULL SAMPLE
    bys _treated: summarize if e(sample), sep(0)
    *MATCHED SAMPLE
    bys _treated: summarize if !missing(_weight), sep(0)
    Res.:

    Code:
    . *FULL SAMPLE
    
    .
    . bys _treated: summarize if e(sample), sep(0)
    
    ---------------------------------------------------------------------------------------------------------------------------------
    -> _treated = Untreated
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
         bweight |      3,778    3412.912    570.6871        340       5500
        mmarried |      3,778    .7514558    .4322261          0          1
           mhisp |      3,778    .0362626    .1869675          0          1
           fhisp |      3,778    .0378507    .1908604          0          1
         foreign |      3,778      .05982    .2371845          0          1
         alcohol |      3,778     .018793    .1358113          0          1
        deadkids |      3,778    .2458973     .430675          0          1
            mage |      3,778    26.81048    5.645477         13         45
            medu |      3,778    12.92986    2.534403          0         17
            fage |      3,778    27.84436    8.794065          0         60
            fedu |      3,778     12.6739    3.481189          0         17
       nprenatal |      3,778    10.96294     3.51843          0         33
        monthslb |      3,778    21.89836    31.50073          0        272
           order |      3,778    1.859185    1.103474          0         11
          msmoke |      3,778           0           0          0          0
         mbsmoke |      3,778           0           0          0          0
           mrace |      3,778    .8478031    .3592592          0          1
           frace |      3,778    .8268925    .3783902          0          1
        prenatal |      3,778    1.177607    .4726644          0          3
      birthmonth |      3,778    6.513764    3.346774          1         12
        lbweight |      3,778    .0489677    .2158291          0          1
           fbaby |      3,778    .4531498    .4978661          0          1
       prenatal1 |      3,778    .8268925    .3783902          0          1
         _pscore |      3,778    .1704241    .1051475   .0134296   .7907192
        _treated |      3,778           0           0          0          0
        _support |      3,778           1           0          1          1
         _weight |        286    3.020979    3.123131          1         19
        _bweight |      3,778    3164.002      633.16        680       4734
             _id |      3,778      1889.5    1090.759          1       3778
             _n1 |      3,778    4027.722    224.3399       3779       4642
             _nn |      3,778           1           0          1          1
           _pdif |      3,778    .0003409    .0019572          0   .0424453
    
    ---------------------------------------------------------------------------------------------------------------------------------
    -> _treated = Treated
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
         bweight |        864     3137.66    560.8931        397       5018
        mmarried |        864    .4733796      .49958          0          1
           mhisp |        864    .0243056    .1540853          0          1
           fhisp |        864    .0335648    .1802104          0          1
         foreign |        864     .025463    .1576177          0          1
         alcohol |        864    .0914352    .2883939          0          1
        deadkids |        864     .318287    .4660813          0          1
            mage |        864    25.16667    5.301348         14         43
            medu |        864    11.63889    2.167743          0         17
            fage |        864    24.74306    11.14795          0         55
            fedu |        864     10.7037    4.097048          0         17
       nprenatal |        864    9.862269     4.20762          0         40
        monthslb |        864    28.21991    36.92362          0        220
           order |        864     2.03588    1.182083          0         12
          msmoke |        864    2.146991    .7674824          1          3
         mbsmoke |        864           1           0          1          1
           mrace |        864    .8090278    .3932949          0          1
           frace |        864     .755787    .4298684          0          1
        prenatal |        864     1.30787    .6296153          0          3
      birthmonth |        864    6.655093    3.412406          1         12
        lbweight |        864    .1099537    .3130132          0          1
           fbaby |        864    .3715278     .483493          0          1
       prenatal1 |        864    .6898148    .4628372          0          1
         _pscore |        864    .2547889    .1304696   .0411911   .7889756
        _treated |        864           1           0          1          1
        _support |        864           1           0          1          1
         _weight |        288    13.11806    15.88986          1         80
        _bweight |        864    3334.843    626.2059        662       4933
             _id |        864      4210.5    249.5596       3779       4642
             _n1 |        864    2672.326    949.1435         38       3777
             _nn |        864           1           0          1          1
           _pdif |        864     .000148    .0009398          0   .0144565
    
    
    .
    . *MATCHED SAMPLE
    
    .
    . bys _treated: summarize if !missing(_weight), sep(0)
    
    ---------------------------------------------------------------------------------------------------------------------------------
    -> _treated = Untreated
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
         bweight |        286    3368.675    614.8157        662       4933
        mmarried |        286     .506993    .5008274          0          1
           mhisp |        286    .0664336     .249475          0          1
           fhisp |        286    .0699301    .2554762          0          1
         foreign |        286    .0804196    .2724183          0          1
         alcohol |        286    .0104895    .1020583          0          1
        deadkids |        286    .2762238    .4479126          0          1
            mage |        286    25.84615    5.921484         14         43
            medu |        286     11.6049     3.29931          0         17
            fage |        286    25.62238    11.39178          0         60
            fedu |        286    11.16783    4.308272          0         17
       nprenatal |        286     10.0979    4.071837          0         25
        monthslb |        286    24.28671     35.5531          0        199
           order |        286    2.052448    1.332737          0          8
          msmoke |        286           0           0          0          0
         mbsmoke |        286           0           0          0          0
           mrace |        286    .7342657    .4424977          0          1
           frace |        286    .6923077    .4623475          0          1
        prenatal |        286     1.29021    .6238325          0          3
      birthmonth |        286    6.482517    3.415862          1         12
        lbweight |        286    .0594406    .2368619          0          1
           fbaby |        286    .4055944    .4918674          0          1
       prenatal1 |        286    .7062937    .4562574          0          1
         _pscore |        286    .2466459    .1456376   .0411911   .7889756
        _treated |        286           0           0          0          0
        _support |        286           1           0          1          1
         _weight |        286    3.020979    3.123131          1         19
        _bweight |        286    3177.224    577.7134        680       4734
             _id |        286    2508.112    1094.366         38       3777
             _n1 |        286    4180.755    269.7205       3779       4642
             _nn |        286           1           0          1          1
           _pdif |        286    .0001703    .0011881          0   .0144565
    
    ---------------------------------------------------------------------------------------------------------------------------------
    -> _treated = Treated
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
         bweight |        288    3178.271    572.5312        680       4734
        mmarried |        288    .4965278    .5008582          0          1
           mhisp |        288    .0347222     .183394          0          1
           fhisp |        288    .0416667    .2001741          0          1
         foreign |        288      .03125    .1742955          0          1
         alcohol |        288      .09375     .291988          0          1
        deadkids |        288    .3090278     .462897          0          1
            mage |        288    25.84375    5.890987         14         43
            medu |        288    11.73264    2.929136          0         17
            fage |        288    25.19444    11.71007          0         55
            fedu |        288    10.51736    4.651594          0         17
       nprenatal |        288    9.659722    4.006377          0         21
        monthslb |        288    28.65278    41.24759          0        220
           order |        288    2.041667    1.240293          1         10
          msmoke |        288    2.107639    .7505241          1          3
         mbsmoke |        288           1           0          1          1
           mrace |        288    .7847222    .4117303          0          1
           frace |        288    .7430556    .4377091          0          1
        prenatal |        288    1.319444    .6374249          0          3
      birthmonth |        288    6.920139    3.268029          1         12
        lbweight |        288    .1006944    .3014475          0          1
           fbaby |        288      .40625    .4919872          0          1
       prenatal1 |        288    .6909722     .462897          0          1
         _pscore |        288    .2458135    .1451786   .0411911   .7889756
        _treated |        288           1           0          1          1
        _support |        288           1           0          1          1
         _weight |        288    13.11806    15.88986          1         80
        _bweight |        288    3367.649    618.2156        662       4933
             _id |        288    4179.389    268.6858       3779       4642
             _n1 |        288    2504.229    1095.575         38       3777
             _nn |        288           1           0          1          1
           _pdif |        288    .0001626    .0011807          0   .0144565
    
    
    .



    • #3
      Originally posted by Andrew Musau
      psmatch2 is from SSC, as you are asked to explain (FAQ Advice #12). The matched sample is identified by nonmissing values of the generated "_weight" variable.


      Dear Andrew, I need a clarification on this statement.
      You say the matched sample is identified by nonmissing values of the generated _weight variable.

      So why does psmatch2 report different totals in its treatment assignment table:
      [Attached image: Capture.PNG]


      compared with the totals after
      drop if _weight == .

      For example, if one looks at a particular variable, say:
      tab medu _treated, column

      The totals, as you can see, change (of course, since the observations with _weight == . were dropped).

      [Attached image: Capture.PNG]


      I assume that when one proceeds to side-by-side boxplots or nonparametric density plots, drawn after
      drop if _weight == .
      the totals change but the shape of the graphs does not, as these plots are not based on sample sizes.

      Is that interpretation correct?



      • #4
        I do not follow what you are asking. The treated units in the matched sample are a subsample of the treated units in the full dataset and the same for untreated units. Appending to the code in #2

        Code:
        assert !((_treated & !missing(_weight)) & (!_treated& e(sample)))
        assert !((!_treated & !missing(_weight)) & (_treated& e(sample)))
        we see that is the case. Res.:

        Code:
        . assert !((_treated & !missing(_weight)) & (!_treated& e(sample)))
        
        . assert !((!_treated & !missing(_weight)) & (_treated& e(sample)))
        
        .
        So in my example, out of the 3778 untreated observations in the full sample, 286 are selected in the matched sample. For treated units, 288/864 observations are selected into the matched sample. For a large number of variables, a table should do for a comparison of means - unless you have some select variables of interest in which case you may consider simple graphs such as bar graphs.
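
        To tabulate these counts directly, a minimal sketch follows. It assumes the psmatch2 call from #2 has just been run, so that _treated and _weight exist in memory; matched and matchedlbl are hypothetical names introduced here, not something psmatch2 creates.

        ```stata
        * Sketch only: run immediately after the psmatch2 call in #2.
        * "matched" and "matchedlbl" are hypothetical helper names.
        generate byte matched = !missing(_weight)
        label define matchedlbl 0 "Not matched" 1 "Matched"
        label values matched matchedlbl

        * Rows: treatment status; columns: matched or not; with row percentages
        tabulate _treated matched, row

        * How often each matched control is used (matching is with replacement)
        tabulate _weight if _treated == 0 & matched
        ```

        Because nothing is dropped, the full-sample and matched-sample counts stay available in the same dataset.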
        Last edited by Andrew Musau; 24 Feb 2023, 10:57.



        • #5
          Yes, I understood this was a sample of the code. Perhaps what I was asking was in reference to the results from psmatch2.

          I am trying to create a table comparing means and SDs for continuous variables and n (%) for categorical variables, with the SMD, in the unmatched and matched samples.

          To put it more clearly: the totals of treated and untreated observations that psmatch2 reports

          ARE DIFFERENT

          from the totals obtained when one
          1. runs psmatch2
          2. runs drop if _weight == . (e.g. 400 observations dropped)
          3. runs tab medu _treated, column

          The totals, as you can see, change (see screenshot above).

          So, for the n (%) of categorical variables in the table, should I use the results from Step 3?

          I hope my question is not too complex.
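
          For reference, a rough sketch of how the pieces of such a table could be computed for one continuous variable (mage) and one categorical variable (medu). It assumes the psmatch2 call from #2 has just been run and that no observations have been dropped. The scalar names (mean_t, var_t and so on) are hypothetical, and using the full-sample variances in the SMD denominator for both samples is only one common convention, taken here as an assumption.

          ```stata
          * Sketch only: run after psmatch2, without dropping any observations.
          * Unmatched sample: raw means and variances by treatment status
          quietly summarize mage if _treated == 1
          scalar mean_t = r(mean)
          scalar var_t  = r(Var)
          quietly summarize mage if _treated == 0
          scalar mean_c = r(mean)
          scalar var_c  = r(Var)
          display "SMD mage, unmatched: " (mean_t - mean_c) / sqrt((var_t + var_c) / 2)

          * Matched sample: means weighted by _weight
          * (observations with missing _weight are excluded automatically)
          quietly summarize mage if _treated == 1 [aweight = _weight]
          scalar mmean_t = r(mean)
          quietly summarize mage if _treated == 0 [aweight = _weight]
          scalar mmean_c = r(mean)
          display "SMD mage, matched: " (mmean_t - mmean_c) / sqrt((var_t + var_c) / 2)

          * Categorical variable: n (%) in the matched sample, without dropping data
          tabulate medu _treated if !missing(_weight), column
          ```

          The last command gives the same table as Step 3 but leaves the full sample in memory, so the unmatched columns can still be produced with tabulate medu _treated, column.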



          • #6
            As I stated in #4, I cannot follow your concern. Why would you expect the totals to be the same before and after matching? I also do not know how detailed your understanding of the theory behind PSM is. If you look at the table that you attached in #1, you see that the matched sample is smaller than the full sample. As I stated in #4, it is a subsample of the full sample. What the table aims to show is that, in terms of observables, the treated and control samples are closer in the matched sample than in the full sample. I will now bail out of this thread; perhaps someone else can understand your concern, which I fail to see.
