Regressions on each sample after propensity score matching

Lisa Wilson

Join Date: Aug 2016

Posts: 158
#1

06 Feb 2019, 03:38

Dear All
I have not used propensity score matching before. I read lots of materials but I am still confused. I will try to summarize my problem here.

I have a sample that includes firms which are likely to manipulate their earnings (manipulate=1) and other firms that are not suspected of manipulation (manipulate=0). The firms with manipulate=1 are much fewer than those with manipulate=0.

I want to do the following:
Determine a propensity score matched sample from the zero-manipulation firms (without replacement).

Run OLS regressions to determine the relation between stock prices and other financial ratios for the treated sample (manipulate=1) and its propensity score matched sample of manipulate=0.

Code:

** I did the following for step 1:
set seed 1234 // to ensure replication
gen sort_id = uniform()
sort sort_id
psmatch2 manipulate x1 x2 x3 x4 x5, logit noreplace common
tab _weight _treated

** As for step 2:
reg stock_price book earnings if _nn==1
reg stock_price book earnings if _nn==0

Is my execution for the first step correct?

I am not sure if one can use a conditional statement for the second step and I am also not sure if _nn is the correct variable here?

Note that the variables used in the logit to create the propensity scores are not the same as the ones in the OLS regression in the second step. My aim is first to create a matched sample based on firm characteristics in step 1 and then examine the relation between stock prices and financial variables for each sample (those with 1 manipulation and their matched sample of 0 manipulation)

I appreciate your help. psmatch2 is a well-known user written programme.

Thanks
Lisa Wilson

Join Date: Aug 2016

Posts: 158
#2

06 Feb 2019, 21:32

Dear all
I did not get any response on my post. Do you think I need to add any additional information to make it easier to get some responses?
Thanks
Lisa Wilson

Join Date: Aug 2016

Posts: 158
#3

07 Feb 2019, 02:34

In most research, psmatch2 is used with the outcome variable in the same command. However, in accounting research, we often use PSM to select a matching sample and then run regressions (for example stock valuation regression as above) for both the treated sample (of manipulation here) and the matched sample.
My problem is that I am not sure whether a conditional if statement is how this should be done. In addition, I am not sure whether I ran the first step above properly.

I hope someone can help
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#4

07 Feb 2019, 11:26

You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

Rather than doing this manually, I would look at the treatment effects documentation. There are some important assumptions made in your approach - what some call selection on the observables. The manual will help you with both understanding the approach and programming it.
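As a rough illustration of the official treatment-effects route, here is a minimal sketch. It assumes the variable names from post #1 (manipulate, x1-x5, stock_price) and a Stata version that includes teffects. As far as I know, teffects psmatch matches with replacement, so it is not a drop-in substitute for psmatch2 with noreplacement.

```stata
* Sketch only: variable names are those posted in #1, not verified against real data.
* teffects psmatch estimates the propensity score and does the matching in one command;
* atet requests the average treatment effect on the treated.
teffects psmatch (stock_price) (manipulate x1 x2 x3 x4 x5, logit), atet

* Report covariate balance in the raw and matched samples
tebalance summarize
```

This gives a treatment effect on stock_price rather than separate regressions for each group, so it answers a somewhat different question from the two-step approach asked about.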
Lisa Wilson

Join Date: Aug 2016

Posts: 158
#5

07 Feb 2019, 19:30

Hi
I have actually followed the FAQs. I am happy to use dataex for my data. However, my question is mainly about how one should carry on with regressions after psmatch2. I hope to hear back from you and all participants.
Lisa Wilson

Join Date: Aug 2016

Posts: 158
#6

11 Feb 2019, 00:19

Hi all
I thought to bring this up again if anyone in the forum can provide some help. It is still not resolved.
Thanks
Andrew Musau

Join Date: Oct 2014

Posts: 10168
#7

11 Feb 2019, 15:33

psmatch2 is from SSC, as you are asked to explain. I do not use the command, but given that you did not take Phil's advice into account, I can expand a bit.

I have not used propensity score matching before. I read lots of materials but I am still confused. I will try to summarize my problem here.

I have a sample that includes firms which are likely to manipulate their earnings (manipulate=1) and other firms that are not suspected of manipulation (manipulate=0). The firms with manipulate=1 are much fewer than those with manipulate=0.

It appears that you misunderstand the point of propensity score matching (PSM). The fact that your treatment group is larger than your control group, or vice versa, is not a reason to employ PSM to reduce the size of the larger group. As a matter of fact, dropping observations for no good reason is not desirable from a statistical point of view and may ultimately bias your estimates. You will usually want as large a sample as possible. The idea behind PSM is that a comparison between samples can only be made if the individuals/firms that make up both samples have similar characteristics. So the first step is to compare the unmatched samples using the summarize command. If you discover significant differences, then you proceed to PSM to match the two samples. Here, you need to know which matching method is appropriate, e.g., nearest neighbors, exact matching, optimal matching, etc.

I am not sure ... if _nn is the correct variable here?

So your second question relates to whether you have properly identified the matched samples after PSM. How can you determine this? Again, this is simple. Just compare the matched samples using the summarize command. You should not find significant differences as was the case for the unmatched samples. Again, as Phil advised, go through the reference manual to make sure that you understand what you are doing or talk to someone more knowledgeable (a supervisor or an experienced colleague).
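The before/after comparison described above could be sketched as follows. This assumes psmatch2 (SSC) has already been run with the covariates x1-x5 from post #1, so that _weight exists and, as I understand the psmatch2 help file, is nonmissing only for observations used in the match. pstest is installed together with psmatch2.

```stata
* Sketch only: assumes psmatch2 has been run so that _weight exists.

* Unmatched samples: covariate means by group over the full sample
tabstat x1 x2 x3 x4 x5, by(manipulate) statistics(mean sd n)

* Matched samples: same comparison, restricted to observations used in the match
tabstat x1 x2 x3 x4 x5 if !missing(_weight), by(manipulate) statistics(mean sd n)

* pstest reports standardized bias and t-tests before and after matching
pstest x1 x2 x3 x4 x5, both
```

If the matching worked, the differences in the second table and in the "matched" rows of pstest should be clearly smaller than in the unmatched comparison.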

Last edited by Andrew Musau; 11 Feb 2019, 16:06.
Sabrina Muller

Join Date: Jan 2019

Posts: 21
#8

12 Feb 2019, 14:34

Hi Lisa, did you get any further in your research? I have a similar task and also don't know how to proceed exactly, since most material I find just does the "actual" regression and the propensity score matching in one single command?
Lisa Wilson

Join Date: Aug 2016

Posts: 158
#9

15 Feb 2019, 00:32

Unfortunately, I still did not get any response on this. I hope someone can provide some help.
In accounting research, most papers do this in two steps. In step one, a logistic regression is run to estimate propensity scores, and then a matching sample is selected (using alternative approaches). In step two, the regression of interest is estimated for the treatment sample and for the propensity score matched sample.
I cannot find any code online that does something similar.

I hope I get some response.
Ivana Rozic

Join Date: Feb 2019

Posts: 17
#10

19 Feb 2020, 14:19

Maybe this helps https://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm
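For the two-step workflow asked about in #1, a minimal sketch might look like the following. It assumes psmatch2 from SSC is installed and uses the variable names posted in #1; the indicator name matched is made up for illustration. As I read the psmatch2 help file, _weight is nonmissing only for observations that were used in the match, whereas _nn stores the number of matched neighbours for treated observations, so _weight rather than _nn appears to be the variable to condition on.

```stata
* Sketch only: variable names follow post #1; "matched" is a hypothetical name.

* Step 1: random sort order, then 1-to-1 nearest-neighbour matching without replacement
set seed 1234
generate double sort_id = runiform()
sort sort_id
psmatch2 manipulate x1 x2 x3 x4 x5, logit noreplacement common

* Flag the observations that ended up in the matched sample
generate byte matched = !missing(_weight)
tabulate matched _treated

* Step 2: the regression of interest, separately for each matched group
regress stock_price book earnings if matched == 1 & _treated == 1
regress stock_price book earnings if matched == 1 & _treated == 0
```

With 1-to-1 matching without replacement, the tabulate output should show the same number of treated and control observations with matched equal to 1.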
