  • How to drop observations if one date variable is after another date variable

    Dear Statalist reader,

    My data file has two date variables, A (type: long, format: %d) and B (type: str9, format: %9s). A looks like 12jul2011 and B looks like 11-Nov-2011. I want to drop all observations where A is after B. I used the following command, but it does not work (it reports a type mismatch). I tried to change the format, but I am not sure how to do that. Can anyone help me figure out the problem?

    Command I used: drop if A>B

    Thank you!

  • #2
    To understand dates, see the help on dates (help datetime).

    Use list first to check:

    Code:
    list if A > daily(B, "DMY")


    • #3
      Adding to Nick's spot-on advice, unless you plan to ignore date B entirely from that point on, you are best advised to convert it to a Stata internal numeric date. Dates as strings are almost never useful in Stata. You can't do calculations with them, and you can't even order them chronologically. Whenever I encounter a data set with string dates, I either -drop- those variables (if I know I won't need them), or I convert them to numeric dates right away.
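
      A minimal sketch of the conversion described above, assuming the variables are named A and B as in #1 and that B holds four-digit years. The name B_num is hypothetical, chosen only for illustration.

      ```stata
      * Convert the string date B to a numeric daily date.
      * The "DMY" mask assumes day, month, four-digit year.
      generate B_num = daily(B, "DMY")
      format B_num %td

      * Count values that failed to convert (string present, result missing).
      count if missing(B_num) & !missing(B)

      * Stata treats numeric missing as larger than any number,
      * so exclude missing values of A explicitly.
      list A B B_num if A > B_num & !missing(A)
      drop if A > B_num & !missing(A)
      ```

      Running -list- before -drop- follows the advice in #2: check which observations qualify before removing them.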


      • #4
        Thank you Nick and Clyde! The types are indeed different. I did try translating the string dates to Stata internal form, but all the observations in the new variable are missing. I have no idea what is wrong with my commands. Here they are:
        generate testdate = date( date_of_test , "DMY")
        format testdate %td


        • #5
          In #1 you said your string dates were like 11-Nov-2011

          If that's so, then your syntax should work.

          Code:
          . di date("11-Nov-2011", "DMY")
          18942
          But you also said that your dates were str9. That cannot be so, as 11 characters are needed for that kind of date.

          Even if you omit the hyphens, it should work too.

          Code:
          . di date("11Nov2011", "DMY")
          18942
          So, whatever you told us about your data appears wrong.

          Please read http://www.statalist.org/forums/help#stata and give us accurate information on your data.
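
          The length argument above can be checked directly. This is a small sketch using strlen() on the two literal layouts discussed in this thread; the last line assumes the variable is called date_of_test, as in #4, and that the dataset has at least five observations.

          ```stata
          * A date written with a four-digit year needs 11 characters.
          display strlen("11-Nov-2011")

          * A date written with a two-digit year needs only 9, so it fits in str9.
          display strlen("14-Nov-11")

          * On real data, list a few raw values rather than retyping them.
          list date_of_test in 1/5
          ```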


          • #6
            Really sorry for the confusion, Nick. I checked the data, and the information I posted was indeed inaccurate. The dates are actually like 14-Nov-11, not 14-Nov-2011. I used the following commands and they work!

            gen testdate = date( date_of_test , "DM20Y")
            format testdate %td


            Thank you Nick!
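
            For reference, a short sketch of how date() treats two-digit years, using the literal value from this post. The top year of 2050 in the last line is an arbitrary choice for illustration, not something taken from the thread.

            ```stata
            * A plain "DMY" mask cannot resolve a two-digit year: the result is missing.
            display date("14-Nov-11", "DMY")

            * Option 1: state the century in the mask, as in the commands above.
            display %td date("14-Nov-11", "DM20Y")

            * Option 2: pass a top year as a third argument; date() picks the
            * latest matching year that does not exceed it.
            display %td date("14-Nov-11", "DMY", 2050)
            ```

            Both options should display 14nov2011. The top-year form may be preferable when the two-digit years span more than one century.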


            • #7
              The moral of the story is to always use copy/paste when reporting things (or -dataex- when showing example data). Re-typing can introduce subtle differences that the human eye and brain gloss over, but subtle differences are often all-important for Stata.
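
              A minimal sketch of a -dataex- call, assuming the variables are named A and B as in #1 and that the dataset has at least ten observations:

              ```stata
              * dataex is part of official Stata in recent releases;
              * on older installations, run: ssc install dataex
              dataex A B in 1/10
              ```

              The output can be pasted into a post as is, and it preserves storage types and display formats, which is exactly the information that was misreported here.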


              • #8
                I will keep this in mind. Thank you Clyde!
