Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating a new date variable common to each ID

    Hello

    I have a database of visitors to a clinic. Each visitor has an id and I have made the variable dup which numbers each visitor's visits in chronological order.

    I want to create the variables datefirst, datesecond etc. which give the date of the first visit, second visit etc. so that each case would contain all of the information on dates of visitors' visits.

    I tried the below but it is not working. Any help much appreciated.


    bysort id (dup) : gen datefirst = dateofvisit[1]
    bysort id (dup) : gen datesecond = dateofvisit[2]
    bysort id (dup) : gen datethird = dateofvisit[3]
    bysort id (dup) : gen datefourth = dateofvisit[4]

  • #2
    What does "not working" mean? FAQ Advice #12 :

    Never say just that something "doesn't work" or "didn't work", but explain precisely in what sense you didn't get what you wanted.

    Comment


    • #3
      Sorry it produces the variable datefirst which is a date but 90% of the cases have the same value which doesn't correspond to the date of the visitor's first visit

      It also gives a value for all cases. I would like those cases who did not have a second visit (no event of dup=2) to stay blank

      Comment


      • #4
        Nick is correct. Telling Statalist your code "didn't work" is like telling a doctor you don't feel well. Not much to work with there.

        Given the four lines of code shown, my guess about what "didn't work" is that failure to apply a suitable date format to the new variables meant that they do not display as a date when they are browsed or listed, but rather as a number.
        Code:
        . list, clean noobs
        
               mydate  
            17jul2018  
        
        . describe mydate
        
                      storage   display    value
        variable name   type    format     label      variable label
        ------------------------------------------------------------
        mydate          float   %td                   
        
        . generate newdate = mydate
        
        . list, clean noobs
        
               mydate   newdate  
            17jul2018     21382  
        
        . describe mydate newdate
        
                      storage   display    value
        variable name   type    format     label      variable label
        ------------------------------------------------------------
        mydate          float   %td                   
        newdate         float   %9.0g
        Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

        All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

        Comment


        • #5
          OK, my guess in post #4 was a waste of my time.

          Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post, looking especially at sections 9-12 on how to best pose your question. In particular, please read FAQ #12 and help those whose help you seek by posting example data using the dataex command. If you are running Stata 15.1 or later, it is already installed. For earlier versions of Stata, install dataex by typing ssc install dataex. Type help dataex to read the simple instructions for using it. Using dataex will enable those who want to help you to quickly and easily create a 100% faithful replica of your situation to test their ideas and code on.

          The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

          Comment


          • #6
            Thanks for your input William apologies for the unclear post.
            I have already reformatted it into %d

            Hopefully the code below will clarify things. I want the variable datefirst to have the same value as dateofvisit when dup=1

            As you can see the var datefirst does not correspond to any of the dateofvisit values. 90% of all my cases return this datefirst value, regardless of id / dup/ dateofvisit values

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            
            bysort id (dup) : gen datefirst = dateofvisit[1]
            
            clear
            input str16 ID float(dateofvisit dup datefirst)
            "001" 20030 1 21286
            "001" 20103 2 21286
            "001" 20244 3 21286
            "001" 20255 4 21286
            "001" 20305 5 21286
            "001" 20529 6 21286
            "001" 20888 7 21286
            "001" 21165 8 21286
            "001" 21311 9 21286
            end

            Comment


            • #7
              It works for me.
              Code:
              clear
              input str16 ID float(dateofvisit dup datefirst)
              "001" 20030 1 21286
              "001" 20103 2 21286
              "001" 20244 3 21286
              "001" 20255 4 21286
              "001" 20305 5 21286
              "001" 20529 6 21286
              "001" 20888 7 21286
              "001" 21165 8 21286
              "001" 21311 9 21286
              end
              format %td dateofvisit datefirst
              bysort ID (dup) : gen new = dateofvisit[1]
              format %td new
              list, clean noobs
              Code:
                   ID   dateofv~t   dup   datefirst         new  
                  001   03nov2014     1   12apr2018   03nov2014  
                  001   15jan2015     2   12apr2018   03nov2014  
                  001   05jun2015     3   12apr2018   03nov2014  
                  001   16jun2015     4   12apr2018   03nov2014  
                  001   05aug2015     5   12apr2018   03nov2014  
                  001   16mar2016     6   12apr2018   03nov2014  
                  001   10mar2017     7   12apr2018   03nov2014  
                  001   12dec2017     8   12apr2018   03nov2014  
                  001   07may2018     9   12apr2018   03nov2014

              Comment


              • #8

                Thanks William When I extract data for one ID and run the code it works but I am still having issues when I run the code on the whole dataset. For 90% of the cases it generates the same value for datefirst

                Code:
                bysort ID (dup) : gen datefirst = dateofvisit[1]
                
                format %td datefirst

                Code:
                clear
                input str56 ID float(dateofvisit dup datefirst)
                "001" 19682 1 19851
                "001" 20833 2 19851
                "001" 21224 3 19851
                "002" 19851 0 19851
                "003" 21157 2 19851
                "003" 19835 1 19851
                "004" 20968 3 19851
                "004" 19519 1 19851
                "004" 20912 2 19851
                end
                format %d dateofvisit2
                format %d datefirst

                Comment


                • #9
                  I can't follow what you're doing. Small and large problems:

                  1. dataex would not produce the output in #8. In the example there is no variable dateofvisit2, so formatting it makes no sense. Don't mess with dataex output unless you really know what you're doing!

                  2. You're showing datefirst as a result but not the code used to create it. So your code doesn't allow the problem to be reproduced.

                  I get this, which looks exactly as intended.

                  Code:
                  clear
                  input str56 ID float(dateofvisit dup datefirst)
                  "001" 19682 1 19851
                  "001" 20833 2 19851
                  "001" 21224 3 19851
                  "002" 19851 0 19851
                  "003" 21157 2 19851
                  "003" 19835 1 19851
                  "004" 20968 3 19851
                  "004" 19519 1 19851
                  "004" 20912 2 19851
                  end
                  format %d dateofvisit
                  format %d datefirst
                  
                  bysort ID (dup) : gen DateFirst = dateofvisit[1]
                  
                  format DateFirst %td
                  
                  l, sepby(ID)
                  
                       +-----------------------------------------------+
                       |  ID   dateofv~t   dup   datefirst   DateFirst |
                       |-----------------------------------------------|
                    1. | 001   20nov2013     1   08may2014   20nov2013 |
                    2. | 001   14jan2017     2   08may2014   20nov2013 |
                    3. | 001   09feb2018     3   08may2014   20nov2013 |
                       |-----------------------------------------------|
                    4. | 002   08may2014     0   08may2014   08may2014 |
                       |-----------------------------------------------|
                    5. | 003   22apr2014     1   08may2014   22apr2014 |
                    6. | 003   04dec2017     2   08may2014   22apr2014 |
                       |-----------------------------------------------|
                    7. | 004   10jun2013     1   08may2014   10jun2013 |
                    8. | 004   03apr2017     2   08may2014   10jun2013 |
                    9. | 004   29may2017     3   08may2014   10jun2013 |
                       +-----------------------------------------------+
                  The display format %d still works but it's better to use the documented format %td.


                  Comment


                  • #10
                    I note that in post #1 you show code like
                    Code:
                    bysort id (dup) : gen datefirst = dateofvisit[1]
                    while in post #8 your sample data shows
                    Code:
                    input str56 ID float(dateofvisit dup datefirst)
                    So do you have two different variables, "id" and "ID"?

                    Comment


                    • #11
                      I edited the dataex output in my last post - won't do that again
                      Many thanks for the help the output is now as you would expect with the code

                      Comment

                      Working...
                      X