
  • Conditional logit model (clogit) to calculate propensity score for psmatch2

    Dear all,

    I am currently working on a panel data set of investment funds with monthly data (N>T) and would like to analyse the impact of a treatment on fund liquidity via propensity score matching (psmatch2). Given that I have fund fixed effects, I would like to use clogit to calculate the respective propensity scores.
    My model is:

    clogit treatment varlist, vce(cl ID) group(ID)
    predict r, pu0
    psmatch2 treatment, pscore(r) outcome(XLM_Diff_n1) common caliper(0.01)

    with treatment being the dummy of the treatment, varlist being the list of explanatory variables, ID being the fund identifier, r being the propensity score, and XLM_Diff_n1 being the proxy for fund liquidity.

    Yet, I do not know whether:

    a) I can actually use a conditional (fe) logit model to calculate the propensity scores - so far I have only found information on psmatch2 with standard logit and probit models

    b) the postestimation command predict (pu0) after using clogit would give me the right probabilities.

    Alternatively, I have tried to include the fixed effects by grouping the funds into larger buckets of similar characteristics, but this leads to highly biased control groups in the subsequent matching process.

    I am grateful for any support on this matter, including alternative approaches to include fixed effects when calculating the propensity scores.

    Best,
    Friedrich

  • #2
    Friedrich,

    not my field of interest from a substantive point of view, but I am happy to share my thoughts about the estimation of the propensity score.

    In a model predicting which of the panels (funds, in this case) receives the treatment, we usually do not care much about unbiasedness of the estimates. That is why we do not even interpret the individual coefficients in these models. We are much more interested in finding a model that best predicts the outcome of this model, i.e. treatment status. I would argue that the probability of receiving a treatment is related to factors that vary between panels, but are constant within panels in many cases. A fixed-effects (within estimation) approach wipes out all the differences between panel units. This might be desirable if we are interested in causal inference about the estimated parameters of the model, but might not be well suited to predicting the probability of receiving a treatment. So my intuition here is to use all information that is available, i.e. use within and between panel variance in the estimation of the propensity score.
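
    A minimal sketch of what this could look like, reusing the names from the first post (treatment, ID, XLM_Diff_n1, and varlist as a placeholder for the explanatory variables). It swaps clogit for a pooled logit, so it is an illustration of the idea under those assumptions, not a tested specification:

    ```stata
    * Pooled logit: uses within- and between-fund variation (no fund fixed effects)
    * varlist is a placeholder for the actual explanatory variables
    logit treatment varlist, vce(cluster ID)

    * Predicted probability of treatment, used as the propensity score
    predict r, pr

    * Match on the estimated score, as in the original post
    psmatch2 treatment, pscore(r) outcome(XLM_Diff_n1) common caliper(0.01)
    ```

    Time-constant fund characteristics that would be dropped by a fixed-effects estimator can be added to varlist here.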

    Best
    Daniel


    • #3
      Dear Daniel,

      thanks a lot for your very helpful thoughts. They made me realise that I have approached the issue from the wrong angle. However, as I am rather new to the field of PSM, I would like to clarify one point:
      Do I understand you correctly that I can pretty much neglect omitted variable bias (e.g. due to neglecting fund fixed effects or other relevant independent variables) in my logit regression, because at this point I am not interested in the causal inference of individual estimated parameters of the model but rather in the overall probability of a fund receiving the treatment?
      So even if the coefficients in my logit model are biased due to omitted variables, they still carry additional information that is useful for estimating a good propensity score.

      Best,
      Friedrich


      • #4
        Friedrich,

        I am not an expert in this field either, but I would answer your question with a yes and no. We can neglect omitted variables insofar as we are not worried about biased estimates. However, the model should be "true" or correctly specified insofar as the omitted variables do not affect assignment to the treatment.

        If you have access to Stata 13, I highly recommend looking into teffects. David Drukker gave a very interesting talk on teffects at this year's German Stata Users Group Meeting (pdf). The two main points I remember are (i) we can generally distinguish between methods that model the outcome (e.g. regression adjustment) and those modelling the treatment (e.g. propensity score matching) and (ii) we can (and should) combine these methods. Drukker introduces the concept of "double robustness", meaning that in a combined model, causal inference is valid as long as one of the two models is correctly specified.
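
        As a rough illustration of the two points, assuming Stata 13 or later and the variable names from the first post (varlist is again a placeholder, and using the same covariates in both models is only an assumption for the example):

        ```stata
        * (i) Treatment model only: propensity-score matching,
        *     with the score estimated by a logit of treatment on varlist
        teffects psmatch (XLM_Diff_n1) (treatment varlist, logit)

        * (ii) Outcome and treatment models combined (doubly robust):
        *      inverse-probability-weighted regression adjustment
        teffects ipwra (XLM_Diff_n1 varlist) (treatment varlist, logit)
        ```

        Both commands report the average treatment effect by default; adding the atet option requests the effect on the treated instead. The sketch ignores the panel structure of the data.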

        Best
        Daniel
