
  • Creating daily lagged variables with hourly data

    Hi,
    I have an hourly dataset with weather information from different weather stations.
    For each day I created a maximum temperature variable, Tmax, and I want a lagged version of it.
    The dataset looks something like this:
    ID  datetime            Tmax
    1   02jan2000 01:00:00  4.1
    1   02jan2000 02:00:00  4.1
    1   03jan2000 05:00:00  5.8
    1   03jan2000 06:00:00  5.8
    1   04jan2000 01:00:00  3.1
    2   02jan2000 03:00:00  2.2
    2   02jan2000 05:00:00  2.2
    2   03jan2000 01:00:00  3.3
    2   03jan2000 02:00:00  3.3
    Thus, station 1 has a maximum temperature of 4.1 on January 2nd, 5.8 on January 3rd, and 3.1 on January 4th.
    Station 2 has a maximum temperature of 2.2 on January 2nd and 3.3 on January 3rd.

    Now I'd like to create a new variable, lagged_tmax, that holds the previous day's maximum temperature: for station 1 it is missing on January 2nd, takes the value 4.1 on January 3rd, and so on.
    Note also that the number of observations per day varies: some days have 23 hourly observations, others maybe only 20.

    I hope someone can help me out with this problem. Thank you very much in advance,
    Philipp

  • #2
    Something like this might work.
    Code:
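    generate lagTmax = .
    sort ID datetime
    * first observation of a new day: take Tmax from the last observation of the preceding day in the data
    by ID (datetime) : replace lagTmax =    Tmax[_n-1] if _n>1 & dofc(datetime) >  dofc(datetime[_n-1])
    * later observations on the same day: carry that value forward
    by ID (datetime) : replace lagTmax = lagTmax[_n-1] if _n>1 & dofc(datetime) == dofc(datetime[_n-1])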
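
    One caveat: the code above uses the previous day that is present in the data for each station. If a whole day can be missing for a station, that is not necessarily the previous calendar day. Should a strict one-day lag be needed, something along these lines might do it. This is only a sketch: it assumes datetime is a Stata %tc clock value stored as a double, and the names day, prevday, prevTmax and lagTmax_strict are made up for illustration.

    ```stata
    * calendar day of each hourly observation
    generate day = dofc(datetime)
    format day %td
    sort ID datetime

    * on the first observation of a new day, remember the preceding day and its Tmax
    by ID (datetime) : generate prevday  = day[_n-1]  if _n>1 & day > day[_n-1]
    by ID (datetime) : generate prevTmax = Tmax[_n-1] if _n>1 & day > day[_n-1]

    * carry both forward through the remaining observations of the same day
    by ID (datetime) : replace prevday  = prevday[_n-1]  if _n>1 & day == day[_n-1]
    by ID (datetime) : replace prevTmax = prevTmax[_n-1] if _n>1 & day == day[_n-1]

    * keep the lag only when the preceding day in the data is exactly one day earlier
    generate lagTmax_strict = prevTmax if day - prevday == 1
    ```

    With the example data, lagTmax_strict matches lagTmax, because no station skips a day. The two would differ only where a station has a gap of one or more full days.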

    • #3
      Perfect, thank you very much!
