  • Taking data to first line of observations

    I'm using Stata 14 with Windows 10 OS.

    I want to perform a difference-in-differences analysis with nearest-neighbor matching. The complication is that I have more than one treatment episode (which I'm handling separately). I want to bring several variables onto the first observation for each id, so that I can do the matching before the treatment timing.

    Below is a sample of my data.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id qdate MeanRecPre1 MeanRecPre2 MeanRecCr1 MeanRecCr2 MeanRecCr3 MeanRecPos1 MeanRecPos2 MeanRecPos3)
    12 168 .19730955         .        .         .         .         .        .         .
    12 170         .         . .1041995         .         .         .        .         .
    12 175         .         .        .         .         . .12063727        .         .
    12 178         .         .        . .13331863         .         .        .         .
    12 184         .         .        .         .         .         . .1194409         .
    12 188         . .13765518        .         .         .         .        .         .
    12 213         .         .        .         .  .1686768         .        .         .
    12 229         .         .        .         .         .         .        . .12797551
    13 168  .9611338         .        .         .         .         .        .         .
    13 170         .         . .6560199         .         .         .        .         .
    13 175         .         .        .         .         .  .7507268        .         .
    13 178         .         .        .  .4896258         .         .        .         .
    13 184         .         .        .         .         .         . .5349047         .
    13 188         .  .6013946        .         .         .         .        .         .
    13 213         .         .        .         . .53404635         .        .         .
    13 229         .         .        .         .         .         .        .  .3342379
    14 212         .  .6250144        .         .         .         .        .         .
    14 213         .         .        .         .  .5266255         .        .         .
    14 229         .         .        .         .         .         .        .  .4023241
    15 200         . .56327945        .         .         .         .        .         .
    15 213         .         .        .         .  .5713364         .        .         .
    15 229         .         .        .         .         .         .        .    .35276
    end
    format %tq qdate

  • #2
    Thank you for using -dataex-. I'm not entirely sure what result you are looking for. Your data are rather odd: it seems that each variable is observed only once per id, and the observations for different variables were made on dates which are sometimes separated by very large time intervals. I'm guessing that you want to create a single observation for each id that incorporates the one observed value for each variable. The following code verifies that each variable is, in fact, only observed once per id, and then it creates a single observation per id that incorporates those:

    Code:
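    * Check that each MeanRec* variable has at most one nonmissing value per id
    foreach v of varlist MeanRec* {
        display "verifying only one obs per id for `v'"
        by id (qdate), sort: egen n_obs = count(`v')
        assert n_obs <= 1
        drop n_obs
    }
    * Keep the first nonmissing value of each variable, leaving one row per id
    collapse (firstnm) MeanRec*, by(id)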
    Do read -help collapse- to learn about this very important data management tool, one that every Stata user will employ frequently.
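
    If the goal is to keep every row of the original data and only copy the values onto the first observation of each id, rather than reduce the data to one row per id, a minimal sketch might look like the following. It is an alternative to -collapse-, not part of the code above: it assumes it is run on the original long data, that each variable has at most one nonmissing value per id (the check above), and that "first" means the earliest qdate.

    ```stata
    * Sketch only: run on the original long data instead of -collapse-
    foreach v of varlist MeanRec* {
        tempvar only
        * With at most one nonmissing value per id, max() simply returns that value
        bysort id (qdate): egen `only' = max(`v')
        * Write it onto the earliest observation of each id
        bysort id (qdate): replace `v' = `only' if _n == 1
        drop `only'
    }
    ```

    The later rows keep their original values, so the first row of each id ends up holding all of the measurements while the rest of the data stays as it was.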

    That said, I wonder if it makes sense to combine data on these variables when their ascertainment dates, in at least some cases, are separated by as long as 61 quarters! Obviously it depends on the time-stability of these variables (and I don't know what it is they measure), and also on whether your intent is for the variables you match on to span different parts of a life cycle. Anyway, just raising it--maybe in context it's not a problem.
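
    One way to see how far apart the measurement dates are for each id is sketched below. It is meant to be run on the original long data, before any -collapse-, and the variable names any_obs, first_q, last_q, span_q, and one_per_id are made up for illustration.

    ```stata
    * Sketch only: run on the original long data, before -collapse-
    * Number of nonmissing MeanRec* values in each row
    egen any_obs = rownonmiss(MeanRec*)
    * Earliest and latest quarter with any measurement, per id
    bysort id: egen first_q = min(cond(any_obs > 0, qdate, .))
    bysort id: egen last_q  = max(cond(any_obs > 0, qdate, .))
    format first_q last_q %tq
    * Gap between them, in quarters
    generate span_q = last_q - first_q
    * Show one line per id
    egen one_per_id = tag(id)
    list id first_q last_q span_q if one_per_id, noobs
    ```

    A large span_q for an id flags cases where the combined row would mix values measured many quarters apart.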



    • #3
      Hey Clyde. Sorry it took me so long to reply. Yes, my data look odd: I'm taking averages conditional on several things. Thank you for the code. It works really nicely.
