Taking data to first line of observations

Felipe Damasceno

Join Date: Feb 2016
Posts: 120


12 Sep 2018, 10:25

I'm using Stata 14 with Windows 10 OS.

I want to perform difference-in-differences nearest-neighbor matching. I have more than one treatment episode, and I'm handling each episode separately. I want to bring several variables to the first observation (row) for each id, so that I can match on them before the treatment date.

Below is a sample of my data.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id qdate MeanRecPre1 MeanRecPre2 MeanRecCr1 MeanRecCr2 MeanRecCr3 MeanRecPos1 MeanRecPos2 MeanRecPos3)
12 168 .19730955         .        .         .         .         .        .         .
12 170         .         . .1041995         .         .         .        .         .
12 175         .         .        .         .         . .12063727        .         .
12 178         .         .        . .13331863         .         .        .         .
12 184         .         .        .         .         .         . .1194409         .
12 188         . .13765518        .         .         .         .        .         .
12 213         .         .        .         .  .1686768         .        .         .
12 229         .         .        .         .         .         .        . .12797551
13 168  .9611338         .        .         .         .         .        .         .
13 170         .         . .6560199         .         .         .        .         .
13 175         .         .        .         .         .  .7507268        .         .
13 178         .         .        .  .4896258         .         .        .         .
13 184         .         .        .         .         .         . .5349047         .
13 188         .  .6013946        .         .         .         .        .         .
13 213         .         .        .         . .53404635         .        .         .
13 229         .         .        .         .         .         .        .  .3342379
14 212         .  .6250144        .         .         .         .        .         .
14 213         .         .        .         .  .5266255         .        .         .
14 229         .         .        .         .         .         .        .  .4023241
15 200         . .56327945        .         .         .         .        .         .
15 213         .         .        .         .  .5713364         .        .         .
15 229         .         .        .         .         .         .        .    .35276
end
format %tq qdate

Tags: data, panel, panel data

Clyde Schechter

Join Date: Apr 2014

Posts: 30122
#2

12 Sep 2018, 10:47

Thank you for using -dataex-. I'm not entirely sure what result you are looking for. Your data are rather odd: it seems that each variable is observed only once per id, and the observations for different variables were made on dates which are sometimes separated by very large time intervals. I'm guessing that you want to create a single observation for each id that incorporates the one observed value for each variable. The following code verifies that each variable is, in fact, only observed once per id, and then it creates a single observation per id that incorporates those:

Code:

foreach v of varlist MeanRec* {
    display "verifying only one obs per id for `v'"
    by id (qdate), sort: egen n_obs = count(`v')
    assert n_obs <= 1
    drop n_obs
}
collapse (firstnm) MeanRec*, by(id)

Do read -help collapse- to learn about this very important data management tool, one that every Stata user will employ frequently.
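
If the goal is instead to keep the panel as it is and only carry each value up to the first observation of each id, something along the following lines might work. This is just a sketch under the assumption, checked by the loop above, that each variable has at most one nonmissing value per id. The small dataset is an excerpt of the posted example (ids 14 and 15, three of the variables), and only_val is a scratch variable name made up for illustration.

```stata
* Excerpt of the posted example: ids 14 and 15, three variables
clear
input float(id qdate MeanRecPre2 MeanRecCr3 MeanRecPos3)
14 212 .6250144         .        .
14 213        .  .5266255        .
14 229        .         . .4023241
15 200 .56327945        .        .
15 213        .  .5713364        .
15 229        .         .   .35276
end
format %tq qdate

* For each variable, find the id's single nonmissing value and copy it
* onto that id's earliest row; all other rows are left untouched
foreach v of varlist MeanRec* {
    * max() ignores missings, so it returns the one observed value
    bysort id (qdate): egen float only_val = max(`v')
    * re-sort explicitly, since egen is not guaranteed to keep sort order
    bysort id (qdate): replace `v' = only_val if _n == 1
    drop only_val
}

list, sepby(id) noobs
```

After this, the first row of each id holds every variable's value, so the matching could be restricted to those rows. Whether that or the collapsed one-row-per-id layout is more convenient depends on what the matching command expects.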

That said, I wonder if it makes sense to combine data on these variables when their ascertainment dates, in at least some cases, are separated by as long as 61 quarters! Obviously it depends on the time-stability of these variables (and I don't know what it is they measure), and also on whether your intent is for the variables you match on to span different parts of a life cycle. Anyway, just raising it--maybe in context it's not a problem.
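
To see how far apart the ascertainment dates are for each id, a quick check like the one below could be run on the original data before any -collapse-. It is only a sketch: it assumes every row carries an observed value (as in the posted example), so the first and last qdate of each id bound the span. The data here are just the first and last rows of ids 12, 14, and 15 from the example, and span_q and one_per_id are made-up names.

```stata
* First and last observation dates for three ids from the posted example
clear
input float(id qdate)
12 168
12 229
14 212
14 229
15 200
15 229
end
format %tq qdate

* Quarters between the earliest and latest observation within each id
bysort id (qdate): gen span_q = qdate[_N] - qdate[1]

* Flag one row per id so each span is listed once
egen byte one_per_id = tag(id)
list id span_q if one_per_id, noobs
```

For these rows the spans are 61, 17, and 29 quarters for ids 12, 14, and 15 respectively.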
Felipe Damasceno

Join Date: Feb 2016

Posts: 120
#3

24 Sep 2018, 06:05

Hey Clyde. Sorry it took me so long to reply. Yes, my data look odd: I'm taking averages conditional on several things. Thank you for the code. It works really nicely.