Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Extracting year from a non-uniform date variable

    Hello, I am working with a date variable (type is double) in a large dataset. It is of uneven length with some values YYYYMMDD, some YYYYMM and some just YYYY. I want to generate a new variable and extract just the YYYY from this variable. How can I do this? I came across the substr command for string variables, so was thinking of converting this numeric double variable to string but still unable to figure it out. I basically want to save the first four digits of each value. Will appreciate the guidance. Thank you!
    2004
    20150623
    20190802
    2017
    20140826
    200402
    20090519
    20090415
    20011011
    200807
    20171021

  • #2
    I figured it out using the substr command. Thank you!

    Comment


    • #3
      It would be a contribution to the Statalist community if you were to post the solution you found. Others may encounter a similar problem in the future and might benefit from seeing how you handled it.

      Comment


      • #4
        Here are two methods. One uses the substr() function (not command) and may be close to what Sohaib did. The other treats the problem numerically.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input double date
            2004
        20150623
        20190802
            2017
        20140826
          200402
        20090519
        20090415
        20011011
          200807
        20171021
        end
        
        gen year = real(substr(strofreal(date, "%8.0f"), 1, 4))
        
        gen YEAR = cond(date > 1e7, floor(date/1e4), cond(date > 1e5, floor(date/1e2), date))
        
        sort date 
        
        list, sep(0)
        
             +------------------------+
             |     date   year   YEAR |
             |------------------------|
          1. |     2004   2004   2004 |
          2. |     2017   2017   2017 |
          3. |   200402   2004   2004 |
          4. |   200807   2008   2008 |
          5. | 20011011   2001   2001 |
          6. | 20090415   2009   2009 |
          7. | 20090519   2009   2009 |
          8. | 20140826   2014   2014 |
          9. | 20150623   2015   2015 |
         10. | 20171021   2017   2017 |
         11. | 20190802   2019   2019 |
             +------------------------+

        Comment

        Working...
        X