Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Date values are different

    Hello, I have values for the dates given in LB21B.
    I used the code egen LB21B_12=convertCMC(LB21B)
    when I apply %td on both values, none of its showing the correct date and both of them are different form each other as well. The right years would be somewhere beteen 2000 and 2005, but here some times it goes back to 1900, or 1959 or forward even 3033.
    How do I correct this?
    Im using Stata 14 MP.

    Code:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(LB21B LB21B_12)
    180605       .
    180605       .
    140103       .
    140103       .
    140103       .
    140103       .
    140103       .
    140103       .
    140103       .
     10100  285474
     10100  285481
    300903       .
    300903       .
         0       .
         0       .
    150802       .
    150802       .
    150802       .
    150802       .
    150802       .
         0       .
         0       .
         0       .
         0       .
    280604       .
         0       .
         0       .
         0       .
         0       .
         0       .
     70600 2126902
     70600 2126905
    260604       .
    260604       .
    260604       .
    260604       .
    260604       .
    261002       .
    120604       .
    120604       .
    120604       .
    131101       .
    290203       .
    290203       .
    270802       .
    270802       .
    270802       .
         0       .
         0       .
    170604       .
    170604       .
     10301  291601
     10301  291602
     10301  291595
     90903 2744870
     90903 2744865
     90903 2744869
    121001       .
    250505       .
    250505       .
    250505       .
    280103       .
    280103       .
    220302       .
    240604       .
    240604       .
    240604       .
     51102 1533454
     51102 1533445
     50103 1503054
     10702  303815
     10702  303803
     10702  303806
     10702  303806
     10702  303810
     10702  303805
     10702  303805
     10702  303806
    301103       .
    301103       .
     50404 1512210
    200700       .
     51002 1530403
    120304       .
    120504       .
    120504       .
    130503       .
    130304       .
    130304       .
    110603       .
    110603       .
    171203       .
    171203       .
    180902       .
    180902       .
     91102 2750919
     91102 2750933
    250603       .
    250603       .
     21203  623438
    end
    format %td LB21B
    format %td LB21B_12

  • #2
    I don't know what convertCMC() is but it appears to be producing useless output. You're asked to explain community-contributed code that you use.


    I guess that the 0s are really missings. Then I would do this.

    Some experiments indicate that daily() struggles with some of the input here, obliging more machinery. I didn't try numdate from SSC, which is a step more versatile. Stata's pointing out that there wasn't a 29 February 2003 seems reasonable enough.


    Code:
    clear
    input long(LB21B LB21B_12)
    180605       .
    180605       .
    140103       .
    140103       .
    140103       .
    140103       .
    140103       .
    140103       .
    140103       .
     10100  285474
     10100  285481
    300903       .
    300903       .
         0       .
         0       .
    150802       .
    150802       .
    150802       .
    150802       .
    150802       .
         0       .
         0       .
         0       .
         0       .
    280604       .
         0       .
         0       .
         0       .
         0       .
         0       .
     70600 2126902
     70600 2126905
    260604       .
    260604       .
    260604       .
    260604       .
    260604       .
    261002       .
    120604       .
    120604       .
    120604       .
    131101       .
    290203       .
    290203       .
    270802       .
    270802       .
    270802       .
         0       .
         0       .
    170604       .
    170604       .
     10301  291601
     10301  291602
     10301  291595
     90903 2744870
     90903 2744865
     90903 2744869
    121001       .
    250505       .
    250505       .
    250505       .
    280103       .
    280103       .
    220302       .
    240604       .
    240604       .
    240604       .
     51102 1533454
     51102 1533445
     50103 1503054
     10702  303815
     10702  303803
     10702  303806
     10702  303806
     10702  303810
     10702  303805
     10702  303805
     10702  303806
    301103       .
    301103       .
     50404 1512210
    200700       .
     51002 1530403
    120304       .
    120504       .
    120504       .
    130503       .
    130304       .
    130304       .
    110603       .
    110603       .
    171203       .
    171203       .
    180902       .
    180902       .
     91102 2750919
     91102 2750933
    250603       .
    250603       .
     21203  623438
    end 
    
     drop if LB21B == 0 
     drop LB21B_12 
     
     gen year = 2000 + mod(LB21B, 100)
     gen month = floor(mod(LB21B, 10000)/100)
     gen day = floor(LB21B/10000)
     
     gen wanted = mdy(month, day, year)
     format wanted %td 
     
     sort wanted 
     
     list 
     
     
    
    
    
         +-----------------------------------------+
         |  LB21B   year   month   day      wanted |
         |-----------------------------------------|
      1. |  10100   2000       1     1   01jan2000 |
      2. |  10100   2000       1     1   01jan2000 |
      3. |  70600   2000       6     7   07jun2000 |
      4. |  70600   2000       6     7   07jun2000 |
      5. | 200700   2000       7    20   20jul2000 |
         |-----------------------------------------|
      6. |  10301   2001       3     1   01mar2001 |
      7. |  10301   2001       3     1   01mar2001 |
      8. |  10301   2001       3     1   01mar2001 |
      9. | 121001   2001      10    12   12oct2001 |
     10. | 131101   2001      11    13   13nov2001 |
         |-----------------------------------------|
     11. | 220302   2002       3    22   22mar2002 |
     12. |  10702   2002       7     1   01jul2002 |
     13. |  10702   2002       7     1   01jul2002 |
     14. |  10702   2002       7     1   01jul2002 |
     15. |  10702   2002       7     1   01jul2002 |
         |-----------------------------------------|
     16. |  10702   2002       7     1   01jul2002 |
     17. |  10702   2002       7     1   01jul2002 |
     18. |  10702   2002       7     1   01jul2002 |
     19. |  10702   2002       7     1   01jul2002 |
     20. | 150802   2002       8    15   15aug2002 |
         |-----------------------------------------|
     21. | 150802   2002       8    15   15aug2002 |
     22. | 150802   2002       8    15   15aug2002 |
     23. | 150802   2002       8    15   15aug2002 |
     24. | 150802   2002       8    15   15aug2002 |
     25. | 270802   2002       8    27   27aug2002 |
         |-----------------------------------------|
     26. | 270802   2002       8    27   27aug2002 |
     27. | 270802   2002       8    27   27aug2002 |
     28. | 180902   2002       9    18   18sep2002 |
     29. | 180902   2002       9    18   18sep2002 |
     30. |  51002   2002      10     5   05oct2002 |
         |-----------------------------------------|
     31. | 261002   2002      10    26   26oct2002 |
     32. |  51102   2002      11     5   05nov2002 |
     33. |  51102   2002      11     5   05nov2002 |
     34. |  91102   2002      11     9   09nov2002 |
     35. |  91102   2002      11     9   09nov2002 |
         |-----------------------------------------|
     36. |  50103   2003       1     5   05jan2003 |
     37. | 140103   2003       1    14   14jan2003 |
     38. | 140103   2003       1    14   14jan2003 |
     39. | 140103   2003       1    14   14jan2003 |
     40. | 140103   2003       1    14   14jan2003 |
         |-----------------------------------------|
     41. | 140103   2003       1    14   14jan2003 |
     42. | 140103   2003       1    14   14jan2003 |
     43. | 140103   2003       1    14   14jan2003 |
     44. | 280103   2003       1    28   28jan2003 |
     45. | 280103   2003       1    28   28jan2003 |
         |-----------------------------------------|
     46. | 130503   2003       5    13   13may2003 |
     47. | 110603   2003       6    11   11jun2003 |
     48. | 110603   2003       6    11   11jun2003 |
     49. | 250603   2003       6    25   25jun2003 |
     50. | 250603   2003       6    25   25jun2003 |
         |-----------------------------------------|
     51. |  90903   2003       9     9   09sep2003 |
     52. |  90903   2003       9     9   09sep2003 |
     53. |  90903   2003       9     9   09sep2003 |
     54. | 300903   2003       9    30   30sep2003 |
     55. | 300903   2003       9    30   30sep2003 |
         |-----------------------------------------|
     56. | 301103   2003      11    30   30nov2003 |
     57. | 301103   2003      11    30   30nov2003 |
     58. |  21203   2003      12     2   02dec2003 |
     59. | 171203   2003      12    17   17dec2003 |
     60. | 171203   2003      12    17   17dec2003 |
         |-----------------------------------------|
     61. | 120304   2004       3    12   12mar2004 |
     62. | 130304   2004       3    13   13mar2004 |
     63. | 130304   2004       3    13   13mar2004 |
     64. |  50404   2004       4     5   05apr2004 |
     65. | 120504   2004       5    12   12may2004 |
         |-----------------------------------------|
     66. | 120504   2004       5    12   12may2004 |
     67. | 120604   2004       6    12   12jun2004 |
     68. | 120604   2004       6    12   12jun2004 |
     69. | 120604   2004       6    12   12jun2004 |
     70. | 170604   2004       6    17   17jun2004 |
         |-----------------------------------------|
     71. | 170604   2004       6    17   17jun2004 |
     72. | 240604   2004       6    24   24jun2004 |
     73. | 240604   2004       6    24   24jun2004 |
     74. | 240604   2004       6    24   24jun2004 |
     75. | 260604   2004       6    26   26jun2004 |
         |-----------------------------------------|
     76. | 260604   2004       6    26   26jun2004 |
     77. | 260604   2004       6    26   26jun2004 |
     78. | 260604   2004       6    26   26jun2004 |
     79. | 260604   2004       6    26   26jun2004 |
     80. | 280604   2004       6    28   28jun2004 |
         |-----------------------------------------|
     81. | 250505   2005       5    25   25may2005 |
     82. | 250505   2005       5    25   25may2005 |
     83. | 250505   2005       5    25   25may2005 |
     84. | 180605   2005       6    18   18jun2005 |
     85. | 180605   2005       6    18   18jun2005 |
         |-----------------------------------------|
     86. | 290203   2003       2    29           . |
     87. | 290203   2003       2    29           . |
         +-----------------------------------------+

    Comment


    • #3
      what is convertCMC and where did you get it since it does not appear to be part of official Stata?

      Comment


      • #4
        I found

        https://github.com/BugBunny/convertCMC

        where it gives a definition

        CMCs (i.e. dates coded as months since the start of the year 1900)
        As that date representation is not what you have, the code is not what you need. Starting with the CMC definition and saying what different to do seems pointless, as you can just start with the dates you have.

        Comment


        • #5
          Sara George Several things:

          The convertCMC function for egen is not inbuilt in Stata. It is a user-contributed feature. I had to google it up, others may not bother to do so. So first, for the sake of others reading this thread, the required step for installation is:
          Code:
          net install convertCMC, from(https://raw.githubusercontent.com/bugbunny/convertCMC/master)
          Second, this function is used to convert CMC dates to Stata dates. In CMC format, 1 is January 1900, 2 is February 1900, and so on. So two things:
          • You are feeding in LB21B into this function, and it takes values like 0, 10702, even 300903. 0 is not a legal CMC date, I think. Further, 10702 would be 10702/12 = more than 891 years after 1900, and 300903 would be more than 25,000 years after the start of 1900. Bottomline: your LB21B variable is not in CMC format to start with.
          • also, you used format %td to format this variable LB21B. This is never going to give you the right date because the %td format is a daily format (not a monthly format like CMC), and it counts dates from January 1, 1960. So a 1 in CMC is January 1900 while a 1 in Stata's daily format is 02 January 1960.
          You need to figure out your data correctly before you use either's Stata's internal date time functions, or the convertCMC function. See also
          Code:
          help datetime
          Last edited by Hemanshu Kumar; 31 Oct 2022, 05:04.

          Comment


          • #6
            I understand. I appreciate all of your help.

            Comment


            • #7
              I can get the same results with numdate from SSC if I pad to the same length: see https://www.statalist.org/forums/for...date-variables

              Code:
              gen LB21Bs = cond(LB21B < 1e5, "0", "") + strofreal(LB21B)
               numdate daily WANTED = LB21Bs, pattern(DMY) topyear(2010)
              See

              Comment


              • #8
                Assuming the numbers in the original dataset were auto-generated somewhere, I'm still mystified by the presence of 290203.

                Comment

                Working...
                X