Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • date format

    Dear Stata users,

    I have 15 different spreadsheets that I would like to import in stata using a loop.
    I've constructed a loop in a way to import all spreadsheets at once. However, I noted that dates are imported as string and, particularly, these dates are in different format: DMY (eg 31/8/2012), MDY (eg 3/22/2001) and string (eg 25feb2014).

    Is there a way to import all dates in the format DMY by inserting the some codes in the loop?

    Thank you in advance for your help.

    Regards,
    F

  • #2
    In don't think you can do what you suggest. But don't be too sad. You just need to make clear to yourself what the actual format of each variable is, and in Stata transform it to a date variable, using the date() function with the appropriate mask, "DMY" for your first and third example, "MDY" for your second example.

    Comment


    • #3
      Dear Svend,

      thank you. I gave some thoughts to what you are suggesting.
      After importing data into Stata, I will check the date format and replace it using the appropriate mask.

      In case multiple formats (eg DMY and string such as '1Jan2014'), I think I can use the 'cond' command.
      To clarify: cond(strpos(`var', "/"), date(`var', "MDY"), date(`var', "DMY"))

      The only uncertainty could be caused by stata importing dates as DMY and MDY in the same variable.
      Is that possible?


      Thank you for your support. Really appreciated.
      F

      Comment


      • #4
        There are many code solutions here. One that is dirty but quick is to generate two variables.

        Code:
        gen date_dmy = daily(mystring, "DMY")
        gen date_mdy = daily(mystring, "MDY")
        and then use whichever produces fewer missing values, sheet by sheet. A less clumsy way is to inspect each set of dates and choose the smartest option depending on the contents, but that will boil down to the same thing, very likely.

        Note that daily() is an undocumented synonym for date(). I wish StataCorp would make it more prominent, as too many users misread date() as a general date function that will produce any kind of date (contrary to the documentation, but not that surprisingly, given the name).
        Last edited by Nick Cox; 11 Dec 2014, 05:15.

        Comment


        • #5
          The date() function is clever, and dates with the "%td" format, like 1jan2014, will also be read correctly with the "DMY" mask, so the cond(()) construct is not necessary.

          I think you are right that Excel can have different display formats in the same column. If you suspect that, it must be fixed in Excel; it can never take place in a Stata dataset. Try to impose a single date format in the suspect Excel column - and look carefully; one never knows with Excel.

          Comment


          • #6
            Note that checking for the occurrence of forward slashes is not a discriminator between MDY and MDY. On the evidence of post #1, you can have forward slashes in either case!

            Comment


            • #7
              However, if you can have different date formats in the same spreadsheet, then arghhh! You probably don't have a choice but I wouldn't work with people so messy!

              Comment

              Working...
              X