  • Propensity score matching

    Good morning,

    I would like to evaluate the effect of a shock (my treatment, call it T) on an outcome variable (Y), using X (a binary variable) as my covariate, via propensity score matching. I would like to do the following:

    1) Estimate the propensity score using a logit model
    2) Apply a matching algorithm (kernel matching) based on differences in the propensity score.

    I have been looking around and have found two ways to do this in Stata: psmatch2 and kmatch.

    With psmatch2, I would do it this way: psmatch2 T, outcome(Y) pscore(X) kerneltype(uniform) logit

    With kmatch, I would do it this way: kmatch ps T X (I am not sure where to put the outcome with this command)

    Am I using psmatch2 correctly?

    Separately, I would like to ask (this is probably more a theoretical question than a Stata question) whether it is possible to add time (year) and country fixed effects in this strategy. If so, do I have to include them in pscore?

    Best,

    Diego.

  • #2
    In kmatch, the syntax would be something like

    Code:
    kmatch ps treatment xvars (outcome)
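    With the placeholder names from post #1 (T, X, Y), a minimal sketch of both commands might look as follows. This is only an illustration under those naming assumptions, not a tested specification. In psmatch2 the covariates follow the treatment variable, pscore() is meant for a previously estimated score rather than a covariate, and the kernel option is what requests kernel matching.

    Code:
    * kmatch: treatment first, then covariates, outcome in parentheses;
    * the propensity score is estimated by logit unless pscmd() says otherwise
    kmatch ps T X (Y), att
    * psmatch2: covariates after the treatment variable, logit for the score,
    * kernel matching with a uniform kernel
    psmatch2 T X, outcome(Y) kernel kerneltype(uniform) logit

    The att option is added here because, as far as I know, kmatch reports the ATE by default while psmatch2 reports the ATT, so without it the two results would not be comparable.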
    To include FEs, I assume you can use exact matching, like

    Code:
    kmatch ps treatment xvars (outcome), ematch(country)
    Best wishes

    (Stata 16.1 MP)

    • #3
      Dear Felix Bittmann

      Thank you for your answer.

      If I also want to include year fixed effects, should I write kmatch ps treatment xvars (outcome), ematch(country year), or should I use i.country i.year?

      Also, I understand exact matching to mean matching two units with the same pscore. Is that what you mean by exact matching when I add ematch? From what I have seen, ematch can only be used with kmatch ra, not with kmatch ps.

      Best,

      Diego.
      Last edited by Diego Malo; 15 Jun 2021, 02:01.

      • #4
        What ematch does is restrict matching to within a given group, which makes the match exact on that variable. For example, if you enter country, people are only matched to people from the same country. I think this is as close to FE as you can get. If you enter country as a regular xvar, it is absorbed into the PS, so the matches are potentially no longer exact. Make sure to test balancing afterwards using kmatch sum. And yes, you can include as many variables in ematch as you like, but for this to work well the sample must be large and diverse enough (so small countries might be a problem).
        And as far as I know, ematch should work fine with the regular kmatch ps.
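        To make the difference concrete, here is a rough sketch of the two specifications. T, X and Y are the placeholders from post #1; country and year are assumed variable names, since the actual data are not shown in this thread.

        Code:
        * country and year enter the PS model as dummies: matched units are
        * close in PS, but not necessarily from the same country and year
        kmatch ps T X i.country i.year (Y)
        * country and year as exact-matching variables: units are matched
        * only within the same country-year cell
        kmatch ps T X (Y), ematch(country year)

        With the second form, cells that contain only treated or only untreated observations cannot yield matches, which is why small countries can become a problem.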
        Best wishes

        (Stata 16.1 MP)

        • #5
          Thank you for your answer, Felix Bittmann. It helps me a lot. I appreciate it!

          I will check the balancing property using the pscore command. If my intention is to add country and year FE, should I check the balancing property with those covariates or just with my initial binary variable X?

          Thank you again,

          Best regards,

          Diego.

          • #6
            The command will always check all variables in the model (and as long as the algorithm converges, the exact-matching variables are always perfectly matched).
            Best wishes

            (Stata 16.1 MP)

            • #7
              But is it possible to check the balancing property with kmatch? I was thinking of checking it like this: pscore treatment xvars, where xvars is just my binary covariate.

              • #8
                Sure, simply run
                Code:
                kmatch sum
                after the matching command.
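                A slightly fuller sketch of the sequence, using the placeholder names T, X, Y from post #1 and an assumed country variable (not taken from any real dataset):

                Code:
                * run the matching first
                kmatch ps T X (Y), ematch(country)
                * balance table for the covariates: raw versus matched sample,
                * including standardized differences and variance ratios
                kmatch summarize
                * propensity score densities by treatment group, before and after matching
                kmatch density

                Both diagnostics read the results left behind by the last kmatch call, so they need no variable list of their own in the simplest case.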
                Best wishes

                (Stata 16.1 MP)

                • #9
                  Originally posted by Felix Bittmann
                  Sure, simply run
                  Code:
                  kmatch sum
                  after the matching command.
                  Hi Felix,

                  To follow up on the PSM issues in this thread, I would like to ask about the PSM procedure for repeated cross-sectional data. In a two-period panel setup, say t1 and t2, I understand that the covariates should be measured in t1 and the outcome in t2. But my data are a repeated cross-section of two years, say T1 and T2. For example, I want to estimate the effect of unemployment on individuals' health, where Y is an indicator of poor health (=1), U indicates unemployment status (=1), and X denotes a set of covariates. My question is: how should I choose X? I don't know whether I should take X from T1 (which does not seem to make sense, since this is not panel data) or just pool the two years and use X from the pooled data, regardless of whether X belongs to T1 or T2.

                  Thank you.

                  • #10
                    I would argue that if t1 captures your pre-treatment status, you should match on these variables and then use the results (or the matching weights) when computing the outcome in t2. However, I have never tried this myself, and I would recommend looking in the literature for examples or further advice.
                    Best wishes

                    (Stata 16.1 MP)
