Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Extracting years

    Hello I wanted to extract the year and month as separate variables from these dates. The year should be either 2007 or 2008. I suppose the last 2 digits denote the year. How do I separate it to say year==2007 or 2008.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long B2_q2i
    140608
    130608
    120608
    130608
    130608
    110608
    130608
    110608
    140608
    140608
    190608
    170608
     90708
    140608
    180608
    150608
    200608
    160608
    180608
    170608
     40608
     50608
     80608
     80608
     50608
     80608
     50608
     60608
     40608
     60608
    130608
    140608
    120608
    160608
    150608
    160608
    160608
    150608
    140608
    130608
     70508
     40508
     60508
     60508
     80508
     50508
     80508
     90508
     70508
     50508
    270408
    220408
    240408
    240408
    250408
    230408
    230408
    240408
    240408
    230408
    310508
    310508
    290508
    300508
    310508
    260508
    310508
    260508
    310508
    310508
    140408
    110408
    110408
    110408
    120408
    120408
    130408
    120408
    140408
    130408
    200408
    230408
    230408
    140408
    220408
    220408
    230408
    250408
    200408
    240408
    160608
     70608
    140608
    140608
    150608
    150608
    160608
     70608
    170608
    170608
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 125578 observations
    Use the count() option to list more

  • #2
    This all yields to some fooling around with standard functions.

    There is no road to understanding Stata dates and times that doesn't lead through

    Code:
    help datetime
    where some basic functions are discussed.

    In addition, the functions highlighted at https://journals.sagepub.com/doi/pdf...867X1101100308 are useful too and allow a direct approach. Evidently in this data example the year is 2000 plus the modulus on dividing the variable shown by 100, and so forth.

    But watch out: in your full dataset you may have some years of the form 19xx, in which case you need something more like

    Code:
    gen twodigityear = mod(B2_q2i, 100)
    gen YEAR = cond(twodigityear > 23, 1900 + twodigityear, 2000 + twodigityear)
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long B2_q2i
    140608
    130608
    120608
    130608
    130608
    110608
    130608
    110608
    140608
    140608
    190608
    170608
     90708
    140608
    180608
    150608
    200608
    160608
    180608
    170608
     40608
     50608
     80608
     80608
     50608
     80608
     50608
     60608
     40608
     60608
    130608
    140608
    120608
    160608
    150608
    160608
    160608
    150608
    140608
    130608
     70508
     40508
     60508
     60508
     80508
     50508
     80508
     90508
     70508
     50508
    270408
    220408
    240408
    240408
    250408
    230408
    230408
    240408
    240408
    230408
    310508
    310508
    290508
    300508
    310508
    260508
    310508
    260508
    310508
    310508
    140408
    110408
    110408
    110408
    120408
    120408
    130408
    120408
    140408
    130408
    200408
    230408
    230408
    140408
    220408
    220408
    230408
    250408
    200408
    240408
    160608
     70608
    140608
    140608
    150608
    150608
    160608
     70608
    170608
    170608
    end
    
    duplicates drop 
    
    * daily dates 
    gen ddate = daily(strofreal(B2_q2i, "%06.0f"), "DM20Y")
    format ddate %td 
    
    * monthly dates 
    gen mdate = mofd(ddate)
    format mdate %tm 
    
    * year and month 
    gen year = year(ddate)
    gen month = month(ddate)
    
    * a direct route
    gen YEAR = 2000 + mod(B2_q2i, 100)
    
    * check it gives the same answer 
    assert year == YEAR 
    
    * also direct 
    gen MONTH = floor(mod(B2_q2i, 1000)/100)
    assert month == MONTH 
    
    sort ddate
    
    list , sepby(month) 
    
        +-----------------------------------------------------------+
         | B2_q2i       ddate    mdate   year   month   YEAR   MONTH |
         |-----------------------------------------------------------|
      1. | 110408   11apr2008   2008m4   2008       4   2008       4 |
      2. | 120408   12apr2008   2008m4   2008       4   2008       4 |
      3. | 130408   13apr2008   2008m4   2008       4   2008       4 |
      4. | 140408   14apr2008   2008m4   2008       4   2008       4 |
      5. | 200408   20apr2008   2008m4   2008       4   2008       4 |
      6. | 220408   22apr2008   2008m4   2008       4   2008       4 |
      7. | 230408   23apr2008   2008m4   2008       4   2008       4 |
      8. | 240408   24apr2008   2008m4   2008       4   2008       4 |
      9. | 250408   25apr2008   2008m4   2008       4   2008       4 |
     10. | 270408   27apr2008   2008m4   2008       4   2008       4 |
         |-----------------------------------------------------------|
     11. |  40508   04may2008   2008m5   2008       5   2008       5 |
     12. |  50508   05may2008   2008m5   2008       5   2008       5 |
     13. |  60508   06may2008   2008m5   2008       5   2008       5 |
     14. |  70508   07may2008   2008m5   2008       5   2008       5 |
     15. |  80508   08may2008   2008m5   2008       5   2008       5 |
     16. |  90508   09may2008   2008m5   2008       5   2008       5 |
     17. | 260508   26may2008   2008m5   2008       5   2008       5 |
     18. | 290508   29may2008   2008m5   2008       5   2008       5 |
     19. | 300508   30may2008   2008m5   2008       5   2008       5 |
     20. | 310508   31may2008   2008m5   2008       5   2008       5 |
         |-----------------------------------------------------------|
     21. |  40608   04jun2008   2008m6   2008       6   2008       6 |
     22. |  50608   05jun2008   2008m6   2008       6   2008       6 |
     23. |  60608   06jun2008   2008m6   2008       6   2008       6 |
     24. |  70608   07jun2008   2008m6   2008       6   2008       6 |
     25. |  80608   08jun2008   2008m6   2008       6   2008       6 |
     26. | 110608   11jun2008   2008m6   2008       6   2008       6 |
     27. | 120608   12jun2008   2008m6   2008       6   2008       6 |
     28. | 130608   13jun2008   2008m6   2008       6   2008       6 |
     29. | 140608   14jun2008   2008m6   2008       6   2008       6 |
     30. | 150608   15jun2008   2008m6   2008       6   2008       6 |
     31. | 160608   16jun2008   2008m6   2008       6   2008       6 |
     32. | 170608   17jun2008   2008m6   2008       6   2008       6 |
     33. | 180608   18jun2008   2008m6   2008       6   2008       6 |
     34. | 190608   19jun2008   2008m6   2008       6   2008       6 |
     35. | 200608   20jun2008   2008m6   2008       6   2008       6 |
         |-----------------------------------------------------------|
     36. |  90708   09jul2008   2008m7   2008       7   2008       7 |
         +-----------------------------------------------------------+

    Comment

    Working...
    X