Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Copy value that results in smallest difference (dates)

    Hello,

    I am having an issue with what I believe is an easy fix, but I can not figure it out.
    • I have two variables: GSGiftDate and ExclDate.
    • I want to generate a new variable, called closest_date, which is just a copy of the GSGiftDate closest to the ExclDate. (There are multiple GSGiftDates per person).
    I tried using the command: by Pidm, sort: egen closest_date = min(abs(GSGiftDate-ExclDate)). However this did not return what I wanted.

    Below is an example of the data, but dates are appearing as integers (I apologize, I do not know how to show dataex output with the dates instead of the integers.)

    Thank you!

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long Pidm int(GSGiftDate ExclDate)
    70004818 19791 19834
    70005329     . 20933
    70005329 14943     .
    70006439     . 19705
    70006926     . 21343
    70007215     . 19019
    70007865     . 20727
    70007865     . 20810
    70017704 19744 19758
    70017704 20600     .
    70017704 20229     .
    70025931     . 19758
    70043515     . 19019
    70043515 20163     .
    70043515 16141     .
    70043515 15609     .
    70044093     . 19019
    70044093 17969     .
    70044093 15620     .
    70044093 17166     .
    end
    format %td GSGiftDate
    format %td ExclDate

  • #2
    I think this gets you what you want:

    Code:
    preserve
    keep Pidm GSGiftDate
    rename GSGiftDate wanted
    duplicates drop
    tempfile candidates
    save `candidates'
    
    restore
    joinby Pidm using `candidates'
    gen difference = abs(ExclDate - wanted)
    by Pidm GSGiftDate ExclDate (difference), sort: keep if _n == 1
    Note that there are some observations in your data where a firm has no non-missing values for GSGiftDate. It follows that wanted will also be missing for all observations for that firm.

    Comment


    • #3
      I took a different tack from Clyde but get the same results except where there are missing values. You should inspect both to determine which is appropriate for your needs.

      Code:
      bysort Pidm: egen newExcelDate=min(ExclDate)
      format %td newExcelDate
      gen no_of_days=abs(newExcelDate-GSGiftDate)
      bysort Pidm (no_of_days): gen expected= GSGiftDate[1]
      format %td expected
      list, sepby(Pidm)
      Stata/MP 14.1 (64-bit x86-64)
      Revision 19 May 2016
      Win 8.1

      Comment


      • #4
        Thank you both Clyde Schechter Carole J. Wilson , it seems that your method worked a little better for what I needed Carole, as it copied the closest date in the new variable every time the Pidm repeated. Appreciate all the help!

        Comment

        Working...
        X