Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging two datasets with different dateformat

    date Mkt-RF SMB HML RF
    20150701 0.61 -0.77 -0.03 0.000
    20150702 -0.11 -0.57 -0.10 0.000
    20150703 0.62 -0.37 -0.17 0.000
    DATES CGJAX CGJIX CFJAX
    07-01-15
    07-02-15 -0,00051 0 -0,00051
    07-06-15 -0,00253 -0,00304 -0,00663

    How do I exactly merge these two datasets with different date formats?

    Would appreciate the stata command code.

    Best,
    T.

  • #2
    Your data display does not provide enough information about your data to fully answer your question. It is impossible to discern several things:
    1. Is the variable date in the first example a string variable or is it a long integer numeric variable?
    2. Does the -15 appearing in dates in the second example mean year 2015?
    3. What is the order of month and day in both examples? It is impossible to know which is which?
    In any case, the solution is to convert everything to real Stata numeric date variables. To illustrate, I will assume that both date and dates are string variables. I will also assume that the 07's in both cases are the months, that -15 does mean 2015, and that the remaining digits are the day of the month. You will need to adapt the code to your actual situation if my guesses are wrong.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str8 date float(mktrf smb hml) byte rf
    "20150701"  .61 -.77 -.03 0
    "20150702" -.11 -.57  -.1 0
    "20150703"  .62 -.37 -.17 0
    end
    tempfile dataset1
    save `dataset1'
    
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str8 dates double(cgjax cgjix cfjax)
    "07-01-15"       .       .       .
    "07-02-15" -.00051       0 -.00051
    "07-06-15" -.00253 -.00304 -.00663
    end
    tempfile dataset2
    save `dataset2'
    
    use `dataset1'
    gen usable_date = daily(date, "YMD")
    assert missing(usable_date) == missing(date)
    drop date
    format usable_date %td
    tempfile holding
    save `holding'
    
    use `dataset2'
    gen usable_date = daily(dates, "MD20Y")
    assert missing(usable_date) == missing(dates)
    format usable_date %td
    drop dates
    merge 1:1 usable_date using `holding'
    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment

    Working...
    X