  • Can we add control variables when using the teffects psmatch command?

    Dear all,

    I'm using the teffects psmatch command to examine the average treatment effect on the treated. I specified a list of covariates that is used to perform the matching between the treatment and control groups. My question: when using this command, in addition to specifying the covariates that influence the treatment variable and the potential outcomes, how can I include additional control variables that are not used in the matching? Based on the command specification, there does not seem to be a place for control variables that do not influence the propensity to be in the treatment group but that I would still like to control for when computing the treatment effect on the potential outcome variables.

    Any advice would be appreciated. Thanks.
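    For reference, this is roughly the call I am working with (the variable names here are placeholders, not my actual data). The first set of parentheses takes only the outcome variable, so there is no obvious slot for outcome-side controls:

    ```stata
    * Placeholder names: weekly_sales = outcome, subscribed = treatment,
    * price/income = covariates used only in the propensity (matching) model
    teffects psmatch (weekly_sales) (subscribed price income), atet
    ```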

  • #2
    From a theoretical perspective, why would you want to "control" for something that does not affect treatment? Why would the treatment effect be affected by such a factor? In the framework of propensity score matching this makes little sense.

    You could use one of the doubly robust estimators, specifying one model for the outcome and one for the treatment.
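    A sketch of what that could look like with teffects ipwra (all variable names are placeholders): the first parentheses take the outcome and its model covariates, the second the treatment and the propensity covariates:

    ```stata
    * Doubly robust: outcome model and treatment model specified separately
    teffects ipwra (weekly_sales x1 x2) (treated z1 z2), atet
    ```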

    Best
    Daniel

    • #3
      Hi Daniel,

      Many thanks for the response. I appreciate it. To illustrate the issue for your kind consideration, consider the effects of seasonality on sales. The potential outcome is weekly sales and the treatment is the launch of a subscription program by the company. We could control for seasonality by including a column to indicate the week (e.g. week 4) in which the sales were made. In this case, seasonality influences the dependent variable in terms of weekly sales (peak periods are likely to generate higher sales) but it does not influence the propensity for a consumer to subscribe to the service as a member. Does this make sense?

      Best,
      FL

      • #4
        In this case, seasonality influences the dependent variable in terms of weekly sales (peak periods are likely to generate higher sales) but it does not influence the propensity for a consumer to subscribe to the service as a member. Does this make sense?
        I think I understand, but this is my point. If the propensity of subscription is not affected, then it is not necessary to "control" for seasonality. As long as the exogeneity (ignorability) assumption holds, there is no problem identifying the causal effect. Covariates that we need to control for must be correlated with both the outcome and the treatment.

        the treatment is the launch of a subscription program by the company
        Without knowing much about your design and research questions, I think I would be much more worried about the motivation of those companies, which may well be endogenous, than about seasonality, which arguably affects all units in the same (random?) way. Then again, seasonality could also be a factor affecting a company's decision about the timing of launching its subscription program. In that case you should probably include it in the matching equation.

        Best
        Daniel
        Last edited by daniel klein; 18 Jan 2017, 22:13.

        • #5
          Hi Daniel,

          Thanks very much for the feedback. Indeed a valid point. I'm new to propensity score matching and have been using the teffects psmatch command; I was wondering if you have some experience using it. I know that it is important to get a good balance between the treatment and the control group, and the potential-outcome estimates are very sensitive to how the groups are balanced. However, I have very little control over the balancing results (measured as standardised differences and variance ratios) except by adjusting the covariates. Even after using prior theory to come up with the set of covariates, the balancing results are not very good (differences are not close to zero, variance ratios not close to 1), and a t-test between the two groups usually still shows a significant difference. I was wondering if you might be able to provide some guidance on how I should proceed. Thanks in advance.

          Best,
          Fred

          • #6
            Actually, I have not used matching much myself. But I do not believe there are a lot of technical tricks to learn and apply here. The philosophy behind matching is rather rigorous: if you cannot get the samples balanced, then you simply did not observe data suitable to answer your question and estimate a causal effect. See, the whole (semi-parametric) matching approach is really about emphasizing that the conclusions you draw should be rooted in the data you observe, hence the concepts of balancing and common support. Within this framework, the (unsatisfying) answer to your question is: "I cannot tell." From this perspective, it is the honest answer.

            What I found a bit disturbing about the matching approach is that by focusing so much on the assignment mechanism to treatment, you easily lose focus on the data generating mechanism of the outcome. The latter is usually the main interest in regression type approaches, where you typically do not even discuss common support. Given a "true" model, this is not a problem as the parametric assumptions that you are making allow you to extrapolate beyond the given sample.

            If your theory tells you how the outcome is created and how treatment is assigned, I would use a combined approach, specifying the two models, as stated before. If you insist on PSM for whatever reasons then I think you need to interpret results you obtain within this framework.

            [Edit]
            In my opinion, what you should never do is use matching in a first step and then, in a second step, add the variables that are not balanced as controls in a regression model. I have seen this suggested more than once, but given the discussion above, I believe such an ad hoc mixing of two quite distinct statistical philosophies is flawed from a theoretical perspective. Again, it is then preferable to use a method that combines the treatment and outcome models in a more elaborate way.
            [/Edit]

            A small note on the t-test after matching. David Drukker from StataCorp once explained that the test is invalid, because it does not take the estimated nature of the propensity score in the first step into account. Personally I think it is still useful as a heuristic.
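            For the balance diagnostics themselves, more recent Stata versions (14 and later, if I remember correctly) provide the tebalance postestimation command after teffects; a sketch with placeholder variable names:

            ```stata
            * After a teffects psmatch fit, summarize covariate balance
            teffects psmatch (weekly_sales) (subscribed price income), atet
            tebalance summarize   // standardized differences and variance ratios
            ```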

            Best
            Daniel
            Last edited by daniel klein; 20 Jan 2017, 01:17.

            • #7
              Hi Daniel,

              Thanks for the very helpful advice. Over the weekend, I have been reading up on matching methods. I found that Coarsened Exact Matching (CEM) might be a better approach: it matches the treatment and control groups on a user-defined coarsening of the data and then estimates the effects on the uncoarsened data. I tried this approach, and it seems to give a set of results that I'm more comfortable with.
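              A sketch of what I ran, using the user-written cem command (Blackwell, Iacus, King & Porro); the variable names are placeholders, and I let cem coarsen automatically rather than supplying cutpoints:

              ```stata
              * ssc install cem     // one-time installation of the user-written command
              cem price income, treatment(subscribed)
              * Estimate on the uncoarsened data, weighting by the CEM match weights
              regress weekly_sales subscribed [iweight=cem_weights]
              ```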

              On a separate note, I was wondering if you have any experience with results where the reported standard errors appear to be on the high side, for example, 2.25. Thanks

              Best,
              Fred

              • #8
                Originally posted by frederick lim
                I found out that Coarsened Exact Matching might be a better method in matching the treatment and control groups based on a user pre-defined coarsened data and then estimate the effects based on the uncoarsened data.
                Have not heard of this before. It seems this method has some advantages compared to others. Thanks for pointing it out.

                On a separate note, I was wondering if you have any experience with results where the reported standard errors appear to be on the high side, for example, 2.25. Thanks
                Sorry, I do not really understand which standard error you are referring to. Anyway, the size of a standard error can only be judged by comparing it to the respective point estimate: 2.25 might be deemed high when the point estimate is around 1, but tiny when the point estimate is around 100.
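                To make that concrete with purely illustrative numbers (the implied z-statistic is the point estimate divided by its standard error):

                ```stata
                display 1/2.25     // z ≈ 0.44, far from conventional significance
                display 100/2.25   // z ≈ 44.4, overwhelmingly significant
                ```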

                Best
                Daniel
