Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Two questions

    Hello, I have two questions regarding the data set below. First I have two data sets, one that shows dates of crimes committed before a treatment effect and crimes after the treatment effect. I needed to combine these two data sets together. Their common variables was person_id. The problem I run into is the way the variable person_ID was coded.For example there is a person in this data set that was arrested at three distinct dates, he is given the ID 1, so this person appears as 1 three times in the data set. In the second dataset this individual was arrested twice so in the merged data set 1 should appear 5 times. When I try the command merge is fails to merge in any of the possible combinations. I don't know how to combine the data sets such that each of the 5 observations are recorded. I tried joining these two data sets using the joinby command but got tons of duplicates as can be seen by this sample code.

    The second question I have is that I need to create a dummy variable that takes the value 1 if the persons arrest date was later than their disposition date. I'm not sure how to do this

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int person_id str10 pastarrests_date long caseid str10(arrest_date dispos_date) byte treat
    1 "2008-06-14" 88413 "2013-04-04" "2013-06-22" 0
    1 "2008-06-14" 39970 "2012-07-11" "2012-10-20" 1
    1 "2008-06-14" 57514 "2012-01-04" "2012-03-27" 0
    1 "2010-04-09" 39970 "2012-07-11" "2012-10-20" 1
    1 "2010-04-09" 57514 "2012-01-04" "2012-03-27" 0
    1 "2010-04-09" 88413 "2013-04-04" "2013-06-22" 0
    5 "2009-06-07" 40216 "2012-03-31" "2013-03-25" 0
    5 "2010-05-20" 40216 "2012-03-31" "2013-03-25" 0
    6 "2008-03-16" 92255 "2012-12-09" "2013-11-09" 0
    6 "2009-01-20" 92255 "2012-12-09" "2013-11-09" 0
    6 "2009-09-18" 92255 "2012-12-09" "2013-11-09" 0
    8 "2009-08-05" 2913 "2012-10-06" "2013-12-29" 1
    8 "2009-08-05" 6304 "2013-04-06" "2013-07-07" 0
    8 "2010-06-18" 6304 "2013-04-06" "2013-07-07" 0
    8 "2010-06-18" 2913 "2012-10-06" "2013-12-29" 1
    8 "2011-05-06" 6304 "2013-04-06" "2013-07-07" 0
    8 "2011-05-06" 2913 "2012-10-06" "2013-12-29" 1
    8 "2011-06-17" 2913 "2012-10-06" "2013-12-29" 1
    8 "2011-06-17" 6304 "2013-04-06" "2013-07-07" 0
    9 "2011-02-06" 31881 "2013-09-25" "2013-12-29" 1
    9 "2011-02-06" 82277 "2012-01-12" "2012-11-08" 0
    10 "2009-06-03" 11354 "2013-07-03" "2014-12-13" 1
    10 "2009-08-15" 11354 "2013-07-03" "2014-12-13" 1
    10 "2010-10-29" 11354 "2013-07-03" "2014-12-13" 1
    11 "2008-12-16" 49941 "2013-03-17" "2013-05-11" 1
    11 "2011-01-31" 49941 "2013-03-17" "2013-05-11" 1
    11 "2011-05-09" 49941 "2013-03-17" "2013-05-11" 1
    12 "2009-06-03" 59616 "2013-06-25" "2013-08-23" 0
    13 "2008-05-16" 40377 "2012-11-04" "2013-03-05" 1
    13 "2008-05-16" 88153 "2012-10-29" "2013-01-21" 1
    13 "2008-05-16" 39447 "2013-07-31" "2013-08-16" 1
    13 "2008-09-05" 88153 "2012-10-29" "2013-01-21" 1
    13 "2008-09-05" 39447 "2013-07-31" "2013-08-16" 1
    13 "2008-09-05" 40377 "2012-11-04" "2013-03-05" 1
    13 "2009-09-09" 39447 "2013-07-31" "2013-08-16" 1
    13 "2009-09-09" 88153 "2012-10-29" "2013-01-21" 1
    13 "2009-09-09" 40377 "2012-11-04" "2013-03-05" 1
    13 "2010-02-21" 40377 "2012-11-04" "2013-03-05" 1
    13 "2010-02-21" 39447 "2013-07-31" "2013-08-16" 1
    13 "2010-02-21" 88153 "2012-10-29" "2013-01-21" 1
    13 "2010-12-28" 88153 "2012-10-29" "2013-01-21" 1
    13 "2010-12-28" 40377 "2012-11-04" "2013-03-05" 1
    13 "2010-12-28" 39447 "2013-07-31" "2013-08-16" 1
    13 "2011-04-16" 88153 "2012-10-29" "2013-01-21" 1
    13 "2011-04-16" 40377 "2012-11-04" "2013-03-05" 1
    13 "2011-04-16" 39447 "2013-07-31" "2013-08-16" 1
    13 "2011-08-07" 39447 "2013-07-31" "2013-08-16" 1
    13 "2011-08-07" 88153 "2012-10-29" "2013-01-21" 1
    13 "2011-08-07" 40377 "2012-11-04" "2013-03-05" 1
    13 "2011-10-26" 39447 "2013-07-31" "2013-08-16" 1
    13 "2011-10-26" 40377 "2012-11-04" "2013-03-05" 1
    13 "2011-10-26" 88153 "2012-10-29" "2013-01-21" 1
    14 "2010-01-01" 56468 "2013-04-05" "2013-10-09" 1
    14 "2010-09-28" 56468 "2013-04-05" "2013-10-09" 1
    15 "2008-01-22" 96126 "2012-06-13" "2012-08-15" 0
    15 "2008-01-22" 19842 "2012-03-31" "2013-09-08" 1
    15 "2008-11-30" 19842 "2012-03-31" "2013-09-08" 1
    15 "2008-11-30" 96126 "2012-06-13" "2012-08-15" 0
    15 "2008-12-31" 19842 "2012-03-31" "2013-09-08" 1
    15 "2008-12-31" 96126 "2012-06-13" "2012-08-15" 0
    15 "2009-02-06" 19842 "2012-03-31" "2013-09-08" 1
    15 "2009-02-06" 96126 "2012-06-13" "2012-08-15" 0
    15 "2010-12-01" 96126 "2012-06-13" "2012-08-15" 0
    15 "2010-12-01" 19842 "2012-03-31" "2013-09-08" 1
    16 "2008-01-25" 95551 "2012-05-23" "2012-08-02" 1
    16 "2008-05-30" 95551 "2012-05-23" "2012-08-02" 1
    16 "2008-08-23" 95551 "2012-05-23" "2012-08-02" 1
    16 "2009-12-11" 95551 "2012-05-23" "2012-08-02" 1
    17 "2009-06-01" 97880 "2012-05-22" "2012-06-15" 1
    17 "2011-03-17" 97880 "2012-05-22" "2012-06-15" 1
    17 "2011-09-30" 97880 "2012-05-22" "2012-06-15" 1
    18 "2008-10-06" 26780 "2012-04-05" "2013-01-26" 1
    18 "2008-10-06" 37756 "2012-01-10" "2013-06-03" 1
    18 "2008-10-06" 57949 "2012-04-18" "2012-05-27" 0
    18 "2008-12-25" 26780 "2012-04-05" "2013-01-26" 1
    18 "2008-12-25" 37756 "2012-01-10" "2013-06-03" 1
    18 "2008-12-25" 57949 "2012-04-18" "2012-05-27" 0
    18 "2009-08-07" 37756 "2012-01-10" "2013-06-03" 1
    18 "2009-08-07" 26780 "2012-04-05" "2013-01-26" 1
    18 "2009-08-07" 57949 "2012-04-18" "2012-05-27" 0
    18 "2009-11-02" 37756 "2012-01-10" "2013-06-03" 1
    18 "2009-11-02" 26780 "2012-04-05" "2013-01-26" 1
    18 "2009-11-02" 57949 "2012-04-18" "2012-05-27" 0
    18 "2009-11-13" 26780 "2012-04-05" "2013-01-26" 1
    18 "2009-11-13" 37756 "2012-01-10" "2013-06-03" 1
    18 "2009-11-13" 57949 "2012-04-18" "2012-05-27" 0
    18 "2009-11-22" 57949 "2012-04-18" "2012-05-27" 0
    18 "2009-11-22" 26780 "2012-04-05" "2013-01-26" 1
    18 "2009-11-22" 37756 "2012-01-10" "2013-06-03" 1
    18 "2010-01-08" 57949 "2012-04-18" "2012-05-27" 0
    18 "2010-01-08" 37756 "2012-01-10" "2013-06-03" 1
    18 "2010-01-08" 26780 "2012-04-05" "2013-01-26" 1
    18 "2010-05-02" 37756 "2012-01-10" "2013-06-03" 1
    18 "2010-05-02" 26780 "2012-04-05" "2013-01-26" 1
    18 "2010-05-02" 57949 "2012-04-18" "2012-05-27" 0
    19 "2008-02-16" 39548 "2013-09-20" "2014-08-29" 0
    19 "2008-02-16" 62439 "2013-07-08" "2013-09-16" 1
    24 "2008-06-16" 5690 "2012-12-25" "2013-01-07" 1
    24 "2009-08-02" 5690 "2012-12-25" "2013-01-07" 1
    24 "2009-11-16" 5690 "2012-12-25" "2013-01-07" 1
    end
    [/CODE]

  • #2
    What is needed is a sample of your data from each of the two datasets you want to combine. For each dataset do
    Code:
    dataex if person_id<=10
    and post the results. Be sure to copy and paste everything in each example including the opening [CODE] and the closing [/CODE].

    Comment


    • #3
      The first line of code comes from the master data set which consists of arrests after a treatment affect was initiated. The second data set is previous arrests before a treatment effect was initiated. The common variable is person_ID, I try dropping duplicates before merging but I still can't use the merge command.

      CODE]
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long caseid int person_id str10(arrest_date dispos_date) byte treat str8 race str1 gender str10 bdate
      39970 1 "2012-07-11" "2012-10-20" 1 "HISPANIC" "F" "1985-07-03"
      57514 1 "2012-01-04" "2012-03-27" 0 "HISPANIC" "F" "1985-07-03"
      88413 1 "2013-04-04" "2013-06-22" 0 "HISPANIC" "F" "1985-07-03"
      40216 5 "2012-03-31" "2013-03-25" 0 "BLACK" "M" "1986-09-27"
      92255 6 "2012-12-09" "2013-11-09" 0 "BLACK" "M" "1991-06-07"
      26516 7 "2012-02-25" "2012-03-26" 0 "HISPANIC" "F" "1994-08-24"
      6304 8 "2013-04-06" "2013-07-07" 0 "BLACK" "M" "1978-04-04"
      2913 8 "2012-10-06" "2013-12-29" 1 "BLACK" "M" "1978-04-04"
      82277 9 "2012-01-12" "2012-11-08" 0 "ASIAN" "F" "1994-07-11"
      31881 9 "2013-09-25" "2013-12-29" 1 "ASIAN" "F" "1994-07-11"
      11354 10 "2013-07-03" "2014-12-13" 1 "HISPANIC" "M" "1984-12-30"
      end






      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int person_id str10 arrest_date
       1 "2008-06-14"
       1 "2010-04-09"
       5 "2009-06-07"
       5 "2010-05-20"
       6 "2008-03-16"
       6 "2009-01-20"
       6 "2009-09-18"
       8 "2009-08-05"
       8 "2010-06-18"
       8 "2011-05-06"
       8 "2011-06-17"
       9 "2011-02-06"
      10 "2009-06-03"
      10 "2009-08-15"
      10 "2010-10-29"
      end

      Comment


      • #4
        Thank you for the example data. That makes it clear that you want neither merge nor joinby but instead append. Below is some example code that I think does what you want, both for combining the datasets and for generating the indicator variable.

        But first, a piece of advice. Your dates were given as strings, rather than Stata Internal Format dates. That makes me think you are new to Stata, or at least, new to working with dates and times in Stata. Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

        All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

        With that said ...
        Code:
        clear
        // replace `before' and `after' with the names of your datasets
        append using `before' `after'
        
        // convert string dates to Stata Internal Format dates
        // see the output of help datetime
        rename (arrest_date dispos_date bdate) (ad dd bd)
        generate arrest_date = daily(ad,"YMD")
        generate dispos_date = daily(dd,"YMD")
        generate bdate       = daily(bd,"YMD")
        format %td arrest_date dispos_date bdate
        drop ad dd bd 
        order bdate arrest_date dispos_date, after(person_id)
        
        // indicator of late arrest
        generate late_arr = arrest_date > dispos_date if dispos_date!=.
        order late_arr, after(dispos_date)
        
        sort person_id arrest_date
        list, sepby(person_id) noobs
        Code:
        . list, sepby(person_id) noobs
        
          +----------------------------------------------------------------------------------------------+
          | person~d       bdate   arrest_~e   dispos_~e   late_arr   caseid   treat       race   gender |
          |----------------------------------------------------------------------------------------------|
          |        1           .   14jun2008           .          .        .       .                     |
          |        1           .   09apr2010           .          .        .       .                     |
          |        1   03jul1985   04jan2012   27mar2012          0    57514       0   HISPANIC        F |
          |        1   03jul1985   11jul2012   20oct2012          0    39970       1   HISPANIC        F |
          |        1   03jul1985   04apr2013   22jun2013          0    88413       0   HISPANIC        F |
          |----------------------------------------------------------------------------------------------|
          |        5           .   07jun2009           .          .        .       .                     |
          |        5           .   20may2010           .          .        .       .                     |
          |        5   27sep1986   31mar2012   25mar2013          0    40216       0      BLACK        M |
          |----------------------------------------------------------------------------------------------|
          |        6           .   16mar2008           .          .        .       .                     |
          |        6           .   20jan2009           .          .        .       .                     |
          |        6           .   18sep2009           .          .        .       .                     |
          |        6   07jun1991   09dec2012   09nov2013          0    92255       0      BLACK        M |
          |----------------------------------------------------------------------------------------------|
          |        7   24aug1994   25feb2012   26mar2012          0    26516       0   HISPANIC        F |
          |----------------------------------------------------------------------------------------------|
          |        8           .   05aug2009           .          .        .       .                     |
          |        8           .   18jun2010           .          .        .       .                     |
          |        8           .   06may2011           .          .        .       .                     |
          |        8           .   17jun2011           .          .        .       .                     |
          |        8   04apr1978   06oct2012   29dec2013          0     2913       1      BLACK        M |
          |        8   04apr1978   06apr2013   07jul2013          0     6304       0      BLACK        M |
          |----------------------------------------------------------------------------------------------|
          |        9           .   06feb2011           .          .        .       .                     |
          |        9   11jul1994   12jan2012   08nov2012          0    82277       0      ASIAN        F |
          |        9   11jul1994   25sep2013   29dec2013          0    31881       1      ASIAN        F |
          |----------------------------------------------------------------------------------------------|
          |       10           .   03jun2009           .          .        .       .                     |
          |       10           .   15aug2009           .          .        .       .                     |
          |       10           .   29oct2010           .          .        .       .                     |
          |       10   30dec1984   03jul2013   13dec2014          0    11354       1   HISPANIC        M |
          +----------------------------------------------------------------------------------------------+

        Comment

        Working...
        X