gen match = _n1, replace match = _id if match == . How can you have a missing value?

Tara Boyle

Join Date: Nov 2022

Posts: 145
#1

28 Feb 2023, 22:28

Richard Hofler, can you help? I believe you were the author of this code
(post #5 at https://www.statalist.org/forums/for...score-matching).

I am trying to understand this code, which the user recommends running after psmatch2 to plot two kdensity curves, identifying the controls as observations with treatment = 0 and dup >1.

Code:

gen match=_n1
replace match=_id if match==.

I understand this as creating a new variable called match equal to _n1, where Stata gives the value 1 for the 1st observation, 2 for the 2nd, and so on. So if there are 400 observations, the last observation should be 400.

Therefore, why should there be a missing value?
Why does the user say: if match == ., replace it with _id?
How can there be a missing value if the generated match variable equals _n1?

The user then goes on to use the code below to identify the duplicates and therefore identify the matched observations (I think).

Code:

duplicates tag match, gen(dup)

I would like to know why the user uses this code as part of selecting his controls to plot a kdensity graph. It's 4am and I'm still thinking about this. I have asked ChatGPT, which also didn't give me a solution, and emailed several people about this. I would appreciate some insight.
Tara Boyle

Join Date: Nov 2022

Posts: 145
#2

01 Mar 2023, 05:16

Nick Cox
Hi, I just wanted to add that I answered this question myself after many sleepless nights.

So:

Code:

gen match=_n1

- This generates a variable match equal to _n1. For each treated observation, _n1 holds the _id of the observation matched to it (its first nearest neighbour), not the observation number _n. If an observation was not matched, _n1 is missing, so match is missing as well.

Code:

replace match=_id if match==.

In this step, the missing values of match are set equal to _id. _id is the unique observation identifier created by psmatch2.

Code:

duplicates tag match, gen(dup2)

Stata scans the match variable and looks for duplicates. If a control observation was used for matching, its _id also appears as the _n1 value of at least one treated observation, so its value of dup2 should be > 0. Therefore it has been used in the matching (psmatch2).
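
To make the three steps concrete, here is a minimal sketch on a made-up toy dataset. It does not run psmatch2; it only assumes that psmatch2 leaves behind _id, _treated and _n1 as described above, and the six rows of values are invented for illustration.

```stata
* Toy data mimicking the variables psmatch2 leaves behind (values are invented)
clear
input _id _treated _n1
1 0 .
2 0 .
3 0 .
4 1 2
5 1 2
6 1 3
end

* Treated rows get the _id of their match; control rows are still missing
gen match = _n1
* Control rows now carry their own _id
replace match = _id if match == .
* dup counts how many OTHER rows share the same value of match
duplicates tag match, gen(dup)
list _id _treated _n1 match dup, sepby(_treated)
```

In this toy example control 1 is never used, so its match value is unique and dup is 0. Controls 2 and 3 share their match value with the treated observations matched to them, so dup is 2 and 1 respectively.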

So this is the reason why post #5 at https://www.statalist.org/forums/for...score-matching keeps those control-group observations where duplicates > 0 (as they have been used for matching).

However, a question for the experts: what about _id == 1, which is untreated and has been used as the _n2 match for _id 11?

The code supplied above is based on _n1 only, so for this observation dup2 = 0.
Yet _id == 1 should be included among the matched controls.

This isn't taken into consideration when plotting kdensity graphs of propensity scores for the untreated controls (in the above).
Does this mean I need to change the code to include _n1, _n2 and _n3?
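
One possible way to cover every neighbour slot, offered only as a sketch and not as the method from the linked post: loop over the _n* variables and flag each control whose _id appears in any of them. The variable name used is hypothetical, and the toy values below are invented, with only two neighbours for brevity.

```stata
* Toy data with two neighbours per treated observation (values are invented)
clear
input _id _treated _n1 _n2
1 0 . .
2 0 . .
3 0 . .
4 0 . .
5 1 2 1
6 1 3 2
end

* used is a hypothetical flag: 1 if a control appears in any neighbour slot
gen byte used = 0
foreach v of varlist _n1 _n2 {
    * collect the distinct matched ids stored in this neighbour slot
    quietly levelsof `v' if _treated == 1, local(ids)
    foreach i of local ids {
        quietly replace used = 1 if _id == `i' & _treated == 0
    }
}
list _id _treated _n1 _n2 used, sepby(_treated)
```

Here control 1 is flagged even though it only appears in _n2, while control 4 stays at 0. With three neighbours the varlist would be _n1 _n2 _n3. The nested loop is slow on large datasets, so this is meant for understanding rather than production use.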

The only solution I see is this code, which drops observations that have no weight, as these have not been used in the matching process. Any other alternatives? Or kind professional opinions?

Code:

drop if _weight == .

OR:

Code:

* covariate_vars is a placeholder for the covariates of the propensity score model
* psmatch2 stores the propensity score in _pscore and the matching weight in _weight
psmatch2 treatment_var covariate_vars, outcome(outcome_var) logit ate neighbor(3) caliper(0.25)
twoway (kdensity _pscore if treatment_var==1 [aweight=_weight], lc(red) lpattern(shortdash)) ///
       (kdensity _pscore if treatment_var==0 [aweight=_weight], lc(blue) lpattern(dash)), ///
       legend(label(1 "Treatment Group") label(2 "Control Group")) ytitle(Density) ///
       xtitle(Propensity Score) title("Kernel Density Plot of Propensity Scores")

Last edited by Tara Boyle; 01 Mar 2023, 05:19.
Nick Cox

Join Date: Mar 2014

Posts: 36058
#3

01 Mar 2023, 05:24

Not sure why you're pinging me here, but I am pleased you solved your problem. I've never used propensity score matching, ever, so I don't try to answer questions on it, and can't contribute to discussion of the points you raise.

Last edited by Nick Cox; 01 Mar 2023, 05:30.
Tara Boyle

Join Date: Nov 2022

Posts: 145
#4

01 Mar 2023, 06:23

Ah, a pity, as there are a lot of unanswered questions on propensity scores. Perhaps David Radwin can help?
Perhaps I can help answer them in the future, once I have a better understanding.
Nick Cox

Join Date: Mar 2014

Posts: 36058
#5

01 Mar 2023, 07:07

I see. I guess it's obvious, but what is true for me is true generally. People answering questions here are all volunteers and neither individually nor collectively can we cover the entire field of Stata or Stata applications. Crucially, there is no mechanism to ensure that all questions are answered.

There are many people interested in propensity score matching, but in your case you're asking about a community-contributed command, and the authors don't appear to be active here. That does not rule out answers but it makes them less likely. Besides, the attitudes of authors to their software can vary considerably. Some take the line that they're happy if you find their work useful, but sorry, they don't have the time to guide you through your project that uses it.

The caprice of what gets answered also depends on time of day as people in different time zones get to look at their computers in odd moments.
Tara Boyle

Join Date: Nov 2022

Posts: 145
#6

01 Mar 2023, 07:53

Yes, I've tried emailing the authors of the code. I'll wait.
Many thanks.
David Radwin

Join Date: Mar 2014

Posts: 371
#7

01 Mar 2023, 11:33

I'm sorry I can't help. Please see https://www.statalist.org/forums/for...93#post1513193.

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Tara Boyle

Join Date: Nov 2022

Posts: 145
#8

01 Mar 2023, 11:41

Seen that, got the book, still didn't help haha!
Tara Boyle

Join Date: Nov 2022

Posts: 145
#9

01 Mar 2023, 14:03

George Ford, can you perhaps help?
