  • Extract Year from a string variable

    Hi everyone, I would like to ask for some help extracting the year from a string variable (year of incorporation) that is structured like this: 12/9/1998.

    I tried to encode the variable (year of incorporation), change its format to %td, and then use year(year of incorporation), but Stata returns a year that does not match the one in the date. For example, for 12/9/1998 it returns 1970.

    How can I solve the problem?

    Thanks!

  • #2
    encode cannot produce sensible results here: it simply maps the distinct strings to the integers 1, 2, 3, ... in alphabetical order. For example, as a string "12/9/1990" sorts before "23/1/1952", which is nonsense from your point of view. How would encode know to ignore the day and month information?
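
    To illustrate the point, here is a minimal sketch on made-up data (the variable names incorp, code and wrongyear are hypothetical, not from the original question). It shows that encode only numbers the strings in alphabetical order, so treating the result as a daily date gives years that have nothing to do with the text.

    ```stata
    * Hypothetical data: incorp stands in for the string date variable
    clear
    input str10 incorp
    "12/9/1998"
    "23/1/1952"
    "1/1/2001"
    end

    * encode numbers the distinct strings 1, 2, 3 in alphabetical order
    encode incorp, generate(code)
    list incorp code, nolabel

    * Read as daily dates, those integers count days from 01jan1960,
    * so every "year" here is 1960 whatever the string says
    generate wrongyear = year(code)
    list incorp wrongyear
    ```

    With several thousand distinct dates the codes run into the thousands, which is presumably how a year such as 1970 can appear.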


    This is explained in various places, e.g. https://www.stata-journal.com/articl...article=dm0098

    Code:
    gen year = real(substr(whatever, -4, 4))
    is a better way forward.


    Code:
    gen year = real(substr(trim(whatever), -4, 4))


    is an even better way forward if you had trailing spaces. whatever should be replaced by the variable name in question.
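
    As a small worked example of why trim() matters, the sketch below uses made-up data (incorp is a hypothetical variable name) and deliberately appends a trailing space to one value. Without trim() the last four characters of that value would be "952 " rather than "1952".

    ```stata
    * Hypothetical data: incorp stands in for "whatever"
    clear
    input str12 incorp
    "12/9/1998"
    "23/1/1952"
    end

    * Add a trailing space to the second value to mimic messy input
    replace incorp = incorp + " " in 2

    * Trim first, then take the last four characters and convert to numeric
    generate year = real(substr(trim(incorp), -4, 4))

    * Check the result and flag anything that did not convert
    assert year == 1998 in 1
    assert year == 1952 in 2
    count if missing(year)
    ```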



    • #3
      encode is not the right tool for this; assuming that the year is always the last 4 characters, try this:
      Code:
      gen year=real(substr(date,-4,4))
      replace "date" with the name of your variable (which you did not give us); I have made year numeric but that is just a guess about what you want; if you want it as string, just remove "real" from the command

      also, you might want to use the trim function; see
      Code:
      help strtrim()



      • #4
        Or you can first convert this into a Stata date and then use date functions, like this (I assumed that what you are showing is Month/Day/Year):

        Code:
        . clear
        
        . set obs 1
        number of observations (_N) was 0, now 1
        
        . gen str date = "12/9/1998"
        
        . gen statadate = date(date,"MDY")
        
        . format statadate %td
        
        . gen year = year(statadate)
        
        . list
        
             +------------------------------+
             |      date   statadate   year |
             |------------------------------|
          1. | 12/9/1998   09dec1998   1998 |
             +------------------------------+
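
        The Month/Day/Year assumption above matters. As a hedged sketch on made-up data (incorp, d_mdy and d_dmy are hypothetical names), the code below applies both masks to the two example strings that appear in this thread. A value such as "23/1/1952" has no valid reading under "MDY", so date() returns missing for it, which is a quick way to check which order your data use.

        ```stata
        * Hypothetical data using the two example strings from this thread
        clear
        input str10 incorp
        "12/9/1998"
        "23/1/1952"
        end

        * Try both masks; date() returns missing when the mask does not fit
        generate d_mdy = date(incorp, "MDY")
        generate d_dmy = date(incorp, "DMY")
        format d_mdy d_dmy %td
        list

        * 23 is not a valid month, so the MDY version is missing in row 2
        assert missing(d_mdy) in 2
        assert d_dmy == mdy(1, 23, 1952) in 2

        * Extract the year from whichever version has no unexpected missings
        generate year = year(d_dmy)
        ```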



        • #5
          Thanks a lot, I was able to solve it. Sorry for the error in using encode, I am still a newbie with Stata.



          • #6
            Maybe you can use the numdate and extrdate commands to deal with dates.
            Code:
            * To install: ssc install numdate (the numdate package also provides extrdate)
            clear
            input str9 year
            "12/9/1998"
            "12/9/1998"
            "12/9/1998"
            "12/9/1998"
            "12/9/1998"
            "12/9/1998"
            end
            
            numdate d year1=year,p(MDY) 
            extrdate y wanted=year1
            
                 +--------------------+
                 |     year1   wanted |
                 |--------------------|
              1. | 09dec1998     1998 |
              2. | 09dec1998     1998 |
              3. | 09dec1998     1998 |
              4. | 09dec1998     1998 |
              5. | 09dec1998     1998 |
                 +--------------------+
            Best regards.

            Raymond Zhang
            Stata 17.0,MP



            • #7
              As a new user of Stata, you have now seen that Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

              All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

              And let me also offer you the following advice that I offer to anyone who identifies as a new user of Stata.

              I'm sympathetic to you as a new user of Stata - there is quite a lot to absorb.

              When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

              The objective in doing the reading was not so much to master Stata - I'm still far from that goal - as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

              Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.
