
  • Coarsened exact matching, drop unmatched

    Hi, I am doing a coarsened exact matching and I am concerned about the dropped observations. I match on the pre-treatment period like this:
    drop if treated==.
    gen cem_treated = 1 if treated==1 & Year<2011
    replace cem_treated=0 if cem_treated==.

    Then I use cem_treated as variable to match on exactly like this:
    cem Employees(#0) sector(#0), tr(cem_treated)

    Before I drop the observations with cem_matched==0, the control group has the same number of observations in every year, and so does the treatment group. When I drop cem_matched==0, unmatched controls are of course deleted, but the control group still has equally many observations before and after treatment. For the treatment group, however, a lot of observations are dropped in the POST-treatment period. Missing values in the difference-in-differences variables could cause this, but that is not the case here.

    I also tried matching on the whole period to check. Then it works fine: it drops only control observations and a few treated ones, and equally many before and after treatment. The command in that case was:
    cem Employees(#0) sector(#0), tr(treated)

    Does anyone know why observations are dropped in the post-period when matching is done on the pre-period?

    Thanks in advance

  • #2
    Are the same people observed in both pretreat and posttreat periods?
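
    One way to check this is sketched below on toy data. The variable names id, Year and treated are assumptions (the original post does not show a subject identifier), and 2011 is taken as the first post-treatment year, as in the question.

    ```stata
    * Toy panel: subject 3 is observed only in the pre-treatment period
    clear
    input id Year treated
    1 2009 1
    1 2012 1
    2 2009 0
    2 2012 0
    3 2009 1
    end

    * Count each subject's records before and after treatment
    bysort id: egen n_pre  = total(Year <  2011)
    bysort id: egen n_post = total(Year >= 2011)

    * Flag subjects observed in both periods
    gen byte both_periods = n_pre > 0 & n_post > 0
    tab both_periods treated
    ```

    Subjects with both_periods equal to 0 appear in only one of the two periods.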
    Steve Samuels
    Statistical Consulting

    Stata 14.2



    • #3
      Before posting questions in the future, please follow the instructions in FAQ 12 about creating a data extract (with dataex) and about pasting all commands, results, and listings between CODE delimiters.

      It's actually easy to diagnose the problem. I'll outline the solution after illustrating the problem, assuming that each subject was studied pre- and post-treatment.

      Basically, the problem is that your command:
      Code:
      replace cem_treated=0 if cem_treated==.
      made everybody in the post-treatment period, including every actual treated subject, a control as far as cem was concerned. cem analyzed the entire data set, pre- and post-treatment, and naturally could find no one in the post period with cem_treated == 1 to match to. Here's an illustration of what happened; the treatment variable is mbsmoke, equal to one for smokers.
      Code:
      . use http://www.stata-press.com/data/r14/cattaneo3, clear
      
      . gen year = (order>1) +1  // years 1  & 2 for pre & post
      
      . tab year mbsmoke, missing
      
                 |  1 if mother smoked
            year | nonsmoker     smoker |     Total
      -----------+----------------------+----------
               1 |     1,715        322 |     2,037
               2 |     2,063        542 |     2,605
      -----------+----------------------+----------
           Total |     3,778        864 |     4,642
      
      . gen cem_treat = mbsmoke & year <2
      
      . tab  year cem_treat, missing // no "cem_treated" in year 2
      
                 |       cem_treat
            year |         0          1 |     Total
      -----------+----------------------+----------
               1 |     1,715        322 |     2,037
               2 |     2,605          0 |     2,605
      -----------+----------------------+----------
           Total |     4,320        322 |     4,642
      
      . cem mage prenatal1 mmarried fbaby , treat(cem_treat)
      
      Matching Summary:
      -----------------
      Number of strata: 91
      Number of matched strata: 34
      
                    0     1
            All  4320   322
        Matched  1702   320
      Unmatched  2618     2
      [Other output omitted]
      
      . tab cem_match mbsmoke if year==2, missing
       // 530 of 542 true treated observations not matched
      
      cem_matche |  1 if mother smoked
               d | nonsmoker     smoker |     Total
      -----------+----------------------+----------
               0 |     2,046        530 |     2,576
               1 |        17         12 |        29
      -----------+----------------------+----------
           Total |     2,063        542 |     2,605
       
      . sum cem_strat cem_treat mbsmoke mage prenatal1 mmarried fbaby ///
      > mbsmoke if cem_match & year==2
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
        cem_strata |         29          64           0         64         64
         cem_treat |         29           0           0          0          0
           mbsmoke |         29    .4137931      .50123          0          1
              mage |         29    26.37931     .493804         26         27
         prenatal1 |         29           1           0          1          1
      -------------+---------------------------------------------------------
          mmarried |         29           0           0          0          0
             fbaby |         29           0           0          0          0
           mbsmoke |         29    .4137931      .50123          0          1
      cem created one matching stratum #6 in year 2 even though cem_treat is zero. I'm not sure why.
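
      One possible reading, offered as an assumption rather than something checked here: cem never sees year, so it flags a stratum as matched whenever that stratum contains at least one cem_treat==1 and one cem_treat==0 observation anywhere in the data. If the stratum shown in the summarize output above (cem_strata = 64) contains treated observations from year 1, the year 2 observations in it would count as matched controls. A minimal check, repeating the setup from the log and assuming cem is installed (ssc install cem):

      ```stata
      * Rebuild the example from the log above
      use http://www.stata-press.com/data/r14/cattaneo3, clear
      gen year = (order>1) + 1
      gen cem_treat = mbsmoke & year < 2
      cem mage prenatal1 mmarried fbaby, treat(cem_treat)

      * Which years supply the cem_treat == 1 observations in stratum 64?
      tab year cem_treat if cem_strata == 64, missing
      ```

      If the cem_treat == 1 column is nonzero only in year 1, the year 2 matches are controls paired with pre-period treated observations.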

      Here's the outline of the fix

      1. Create two data sets, one for period 1 (pre) and one for period 2 (post). I'll call them t1 & t2, respectively.
      2. Run cem on the period 1 data (t1) using the original treated variable. Save the results in data set t3.
      3. Use t2 and keep just the subject IDs.
      4. Merge 1:1 by id with data t3. Drop unmatched subjects (cem_matched == 0).
      5. Merge 1:1 by id with data t2. Keep only subjects with _merge == 3.
      6. Append this data set to t3 and analyze.
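
      The six steps above could be sketched roughly as follows. This is an illustration on toy data, not a tested solution for the original data set: the variable id, the toy values, and the 2011 cutoff are assumptions, the data are assumed to hold exactly one pre and one post record per subject, and cem must be installed (ssc install cem). The final keep, which removes unmatched pre-period rows from t3, is an addition to the outline.

      ```stata
      * Toy panel (hypothetical values): one pre and one post row per id
      clear
      input id Year treated Employees sector
      1 2010 1 10 1
      1 2012 1 12 1
      2 2010 0 10 1
      2 2012 0 11 1
      3 2010 1 50 2
      3 2012 1 55 2
      4 2010 0 90 3
      4 2012 0 95 3
      end
      drop if missing(treated)
      tempfile t1 t2 t3

      * Step 1: split into pre (t1) and post (t2) data sets
      preserve
      keep if Year >= 2011
      save `t2'
      restore
      keep if Year < 2011
      save `t1'

      * Step 2: run cem on the pre-period data with the original treatment variable
      cem Employees(#0) sector(#0), treatment(treated)
      save `t3'

      * Steps 3-4: post-period ids merged with the cem results; keep matched subjects
      use id using `t2', clear
      merge 1:1 id using `t3', keepusing(cem_matched cem_strata cem_weights)
      keep if _merge == 3 & cem_matched == 1
      drop _merge

      * Step 5: attach the post-period records of the matched subjects
      merge 1:1 id using `t2'
      keep if _merge == 3
      drop _merge

      * Step 6: append the pre-period rows, then drop unmatched pre-period rows
      * (the last line is an addition to the outline)
      append using `t3'
      keep if cem_matched == 1
      ```

      With several years per period, id no longer identifies one row per data set, so the 1:1 merges would not apply as written.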
      Last edited by Steve Samuels; 27 Apr 2016, 16:11.
      Steve Samuels


      • #4
        The number of the matching stratum was 64, not 6. Sorry for the typo.
        Steve Samuels


        • #5
          Hi, thanks a lot for the clarification. It was very helpful and now it works fine.

          Thanks again.
