Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating a variable from max value of another variable

    Hi all,
    I have a dataset in long format (multiple measurements per patient) with a continuous variable (creat) for serum creatinine values and a date variable (dt_creat) representing the date in which the creatinine values have been measured. I would like to:

    1) create a variable representing the peak value for creat for a given patient
    2) create a variable representing the date of the peak value

    I can do the first using egen max, but can't figure out how to do the second. Using Stata version 13.1.

    Any help would be appreciated.

    Thanks,
    Manuel Ferraro

  • #2
    what date do you want to use if the max occurs more than once?

    Comment


    • #3
      Hi Rich. I would use the date of the first occurrence.

      Comment


      • #4
        Manuel:
        welcome to the list.
        You may be interested in something along the following lines:
        Code:
        . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        
        . bysort idcode: egen max_tenure=max(tenure)
        (12 missing values generated)
        
        . g date=year if tenure==max_tenure
        (23,628 missing values generated)
        Kind regards,
        Carlo
        (Stata 19.0)

        Comment


        • #5
          Hi Carlo, thank you for reply. Following your sample code I get multiple dates if the maximum value recurs more than once within the same patient. But I guess I can then use egen min to select the date of the first occurrence, correct?

          Manuel

          Comment


          • #6
            Carlo's helpful code doesn't quite get at the question of ties raised by Rich. Very likely, their posts crossed.

            Code:
            use "http://www.stata-press.com/data/r14/nlswork.dta", clear
            bysort idcode: egen max_tenure = max(tenure)
            by idcode: egen date_of_max = min(cond(tenure == max_tenure, year, .))


            should get you closer.

            Comment


            • #7
              Hi Nick, thank you for your help, the code you sent solves it perfectly (I had introduced a further line with egen min but yours is more efficient). Would it be possible to do the same for the last value? I tried using egen rowlast but it can't be combined with by.

              Thanks,
              Manuel

              Comment


              • #8
                Nick is right.
                I replied without paying attention at previous contributions.
                Kind regards,
                Carlo
                (Stata 19.0)

                Comment


                • #9
                  Would it be possible to do the same for the last value?
                  What's the problem here? The last value occurs at the last date of measurement; that's my guess. Is your definition different?

                  Comment


                  • #10
                    Sorry, I didn't express myself well. The original question was about isolating the date of the max value for the creat variable, the subsequent question was about creating a variable reporting the date of the last creat measurement (so that, for example, I can calculate the time elapsed from the max to the last creat). I intend to limit the final dataset to one record per patient.

                    Comment


                    • #11
                      Something like

                      Code:
                       
                       by idcode: egen date_of_last = max(cond(tenure < . , year, .))
                      may help.

                      Comment


                      • #12
                        Thank you. I thought there was a way to directly select the last observation within a group using egen or some other function. This is how my final code would look like:

                        Code:
                        use "http://www.stata-press.com/data/r14/nlswork.dta", clear
                        
                        sort idcode year
                        
                        by idcode: egen max_tenure = max(tenure)
                        by idcode: egen date_of_max = min(cond(tenure == max_tenure, year, .))
                        
                        by idcode: egen min_tenure = min(tenure)
                        by idcode: egen date_of_min = min(cond(tenure == min_tenure, year, .))
                        
                        by idcode: gen last = 1 if _n == _N
                        
                        by idcode: egen last_tenure = min(cond(last == 1, tenure, .))
                        by idcode: egen date_of_last = min(cond(last == 1, year, .))
                        
                        drop last
                        
                        by idcode: keep if _n == 1

                        Comment


                        • #13
                          Note that you could rewrite

                          Code:
                          by idcode: gen last = 1 if _n == _N  
                          by idcode: egen last_tenure = min(cond(last == 1, tenure, .))
                          by idcode: egen date_of_last = min(cond(last == 1, year, .))  
                          drop last
                          as

                          Code:
                          bysort idcode (year): gen last_tenure = tenure[_N]  
                          by idcode: gen date_of_last = year[_N]

                          Comment

                          Working...
                          X