Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identify date belonging to highest and lowest value

    Hi everyone,

    I am new to Stata, using version SE 15.1.

    In my dataset i'd like to 1) identify the time variable belonging to the highest value from another variable, and 2) identify the time variable belonging to the lowest value from another variable (when it's after the highest level and before it rises again). And then use both as cutoff points.
    Because my dataset is confidential, I provided a fake example instead.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(concentration date)
     20 22646
     30 22647
     50 22648
     55 22649
    100 22650
     90 22651
     75 22652
     70 22653
     20 22654
     10 22655
     20 22656
     30 22657
      5 22658
    end
    format %d date
    My data contains one variable (concentration) with a certrain value belonging to hormone level concentration. The other variable (time) belongs to the date that the blood sample was collected.
    The data above represents one person.
    So in the fake example the answer to my first question would be: 05jan2022. And the answer to my second question would be 10jan2022.

    Up to know I can't seem to figure out how to do this.

    With the following code I am able to create a new variable containing the highest value of the concentration variable. I then made a new variable containing the date belonging to the highest value. I can substract this from the variable date and create a new variable, time.
    Code:
    egen max = max(concentration) /*identify highest concentration*/
    generate startdate=date if concentration==max
    format startdate %td
    
    egen date_atmax=max(startdate) /* identify date belonging to highest concentration */
    format date_atmax %td
    drop startdate
    
    generate time=(date - date_atmax) /* then t=0 belongs to the highest point */
    Of course I can now drop the observations below 0. However, in my dataset it is wrong to say that the highest point is the starting point. The real starting point could be 06jan2022, depending on the different persons that are included. This time difference will not be the same for every person in my study.
    If I want to make a scatter plot or a regression between the highest and lowest value I'd like to use a local. The syntax below is of course not working, because the local is not from the variable time.
    Code:
    local start=date_atmax 
    scatter concentration time if inrange(time,`start',5)
    I initially wanted to create a local like this
    Code:
    local date_atmax=date if concentration==max
    But I cannot use if in a local.

    Is there any way I can identify the date belonging to the highest concentration more easily?
    I have not found a solution for the date belonging to the lowest concentration (where the date must be after the highest concentration, and before it start rising again).

    I hope my post is clear and if I did something in contrast to what the FAQ list displays please tell me.

    Thanks in advance.

  • #2
    Also asked (and answered) at https://www.statalist.org/forums/for...d-use-in-local

    Comment

    Working...
    X