  • Date format question

    Hello, I'm trying to merge two datasets using a variable called sur_mon_yr that is month/year. E.g. 01/2010
    When merging by sur_mon_yr, there is no match.
    I suspect this is because one dataset has sur_mon_yr in %tmNN/CCYY format whereas the other dataset uses %tdNN/CCYY format.
    I tried changing the format of the variables to match but it's not working.
    For example, if %tdNN/CCYY is changed to %tmNN/CCYY, the observation changes from 01/2010 to 01/3497.
    On the other hand, if %tmNN/CCYY is changed to %tdNN/CCYY, the observation changes from 01/2010 to 08/1961.
    I'm not sure how to get the two sur_mon_yr variables to have the same format such that they can merge and match.

    Many thanks for your help,
    Harry

  • #2
    When you -merge- two data sets, the key variables are matched on their values, not on their appearance. Changing the format from %tm to %td (or any other format change) does nothing whatsoever to the actual values of the variables. It just tells Stata to display them a different way. So if they didn't match before the change, they won't match after it either.
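
    To see why the displayed dates jump around, it may help to look at the numbers behind them. The following is only an illustrative sketch, not taken from either data set: it assumes the usual Stata conventions that a monthly date counts months since January 1960 and a daily date counts days since 01jan1960, and the specific dates are chosen for illustration.

```stata
* Monthly and daily dates are just integers counted from 1960
display tm(2010m1)                          // 600 months since 1960m1
display td(01jan2010)                       // 18263 days since 01jan1960

* The same integer read under the "wrong" format gives a nonsense date
display %tdNN/CCYY tm(2010m1)               // 600 days after 01jan1960: 08/1961
display %tmNN/CCYY td(01jul2010)            // 18444 months after 1960m1: 01/3497

* mofd() converts the value itself, so the two can then match
display mofd(td(15jan2010)) == tm(2010m1)   // 1 (true)
```

    The format only decides how the stored integer is read on screen; mofd() is what actually changes the stored value from a day count to a month count.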

    So what you have to do is transform the variable in one of the data sets so that it matches the other. It sounds like one set has a genuine monthly date (the one where %tmNN/CCYY makes the variable look like a real monthly date), and the other has a genuine daily date. Since daily dates contain information that isn't present in monthly dates (unless the daily dates are redundant, such as always being the first day of the month or something like that), the only way to get them to agree is to transform the daily date to a monthly date. That is why StataCorp created the mofd() function. Since you may ultimately need the day information in the daily date, I suggest that you do this by creating a new variable. So something like this:

    Code:
    use data_set_containing_daily_date_variable
    rename sur_mon_yr sur_mon_yr_daily
    gen sur_mon_yr = mofd(sur_mon_yr_daily)
    format sur_mon_yr %tmNN/CCYY // THIS IS OPTIONAL AND ONLY FOR CONVENIENCE
    merge /*1:1 or m:1 or 1:m as the case may be*/ sur_mon_yr using other_data_set

    If that does not do what you are expecting, then I am somehow misunderstanding your description of the situation. That's not surprising really: even the most detailed descriptions in words of data sets are often inadequate. So if something different from this is needed, when you post back, please use the -dataex- command to show examples of both data sets.

    If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
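
    As a rough sketch of what such a -dataex- example could look like, assuming the placeholder data set names used in the code above and that the first ten observations are representative (both are assumptions, not details from the original post):

```stata
* Show the key variable from each data set in a form others can paste into Stata
use data_set_containing_daily_date_variable, clear
dataex sur_mon_yr in 1/10

use other_data_set, clear
dataex sur_mon_yr in 1/10
```

    The output of each -dataex- call can be pasted into a post as is; it records the storage type and display format of sur_mon_yr along with its values.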

    And, in the future, when seeking help with code, it is usually wise to show example data right from the start.


    • #3
      The code did work! Thank you for your explanation, I really appreciate your help.

      Sincerely,
      Harry


      • #4
        Changing the display format changes the display format. The underlying values are unchanged. More at https://www.stata-journal.com/articl...article=dm0067
