  • I need help to create a lagged variable

    I am currently working on a dataset (monthly flows of tourists between each pair of regions (O-D) during the period 2011-2015).

    Regarding the variables of interest:

    TEMP_D_jt: monthly average temperature (°C) in the capital of destination region j during month t
    ORIGIN  DESTINATION  MONTH  YEAR  TEMPERATURE DESTINATION  LAGGED TEMPERATURE DESTINATION
    1       1            1      2011  5                        .
    1       1            2      2011  10                       5
    1       2            1      2011  15                       .
    1       2            2      2011  20                       15
    I want to create the last column as a lagged (previous month in the same year) temperature at destination for each region.

    I tried to enter these commands:

    gen int date = ym( year, month)
    gen tempd_L1= temp_d[_n-1]

    The message I get is "Repeated time values within panel". I do not know how to create the lagged variable when the time values repeat for each region.

    Kind regards, Cesar Muñoz

  • #2
    Welcome to Statalist.

    "Repeated time values within panel" typically is the result of an xtset command, not either of the commands you have shown.

    The commands you have shown are incorrect in any event because the first observation of the second destination will receive the temperature from the final observation of the previous destination.

    Given the commands you show, something like the following change might produce what you want.
    Code:
    generate int date = ym( year, month)
    by origin destination (date), sort: generate tempd_L1= temp_d[_n-1]
    To improve the helpfulness of your future posts, please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.
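
To see what the by-group version does, here is a small sketch that types in the four example rows from post #1 by hand. The lowercase variable names are assumed to match those used in the commands above; the real dataset may differ.

```stata
* Toy data: the four example rows from post #1, entered by hand
clear
input origin destination month year temp_d
1 1 1 2011 5
1 1 2 2011 10
1 2 1 2011 15
1 2 2 2011 20
end

* Monthly date built from the separate year and month variables
generate int date = ym(year, month)
format date %tm

* Within each origin-destination pair, sorted by date,
* take the value from the previous observation
by origin destination (date), sort: generate tempd_L1 = temp_d[_n-1]

list origin destination date temp_d tempd_L1, sepby(origin destination)
```

Because _n restarts at 1 within each by-group, the first observation of every origin-destination pair gets a missing lag instead of a value carried over from the preceding pair.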




    • #3

      Thank you William. That worked! This has saved me lots of time. Thanks again.



      • #4
        I should have pointed out that my code in post #2 was a minimal modification to your code in post #1. It will suffer, as your code would also, from the problem that if there is no observation for a given month in your data, the "lagged value" computed for the month following the missing month will be the value from the previous observation, which may be 2 or more months earlier.

        More thorough code would take into account the year and month of the previous observation, which would be stored as a Stata Internal Format monthly datetime variable rather than as two separate variables. Because your month and year are in separate variables, I'm concerned that you're new to working with dates and times in Stata. If so, the following advice may be useful.

        Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. It is an investment of time that will be amply repaid.

        All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.
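
As a minimal sketch of the gap-aware approach described earlier in this post, one possibility is to declare the data as a panel and use the time-series lag operator, which returns missing when the previous month is absent. This assumes the variable names from posts #1 and #2 and at most one observation per origin-destination pair per month. The name pair_id is hypothetical, and the toy rows below (including the month 4 row, which leaves a gap at month 3) are made up for illustration.

```stata
* Toy data with a gap: pair (1,1) has no observation for month 3
clear
input origin destination month year temp_d
1 1 1 2011 5
1 1 2 2011 10
1 1 4 2011 12
1 2 1 2011 15
1 2 2 2011 20
end

* Store year and month as a single Stata monthly date
generate int date = ym(year, month)
format date %tm

* One panel identifier per origin-destination pair (pair_id is a made-up name)
egen long pair_id = group(origin destination)
xtset pair_id date

* L. looks up the value exactly one month earlier within the panel;
* it is missing if that month is not in the data
generate tempd_L1 = L.temp_d

list origin destination date temp_d tempd_L1, sepby(pair_id)
```

Here the month 4 observation gets a missing lag rather than the month 2 value. If the lag should not cross from December into January of the next year, as the wording in post #1 may suggest, something like replace tempd_L1 = . if month == 1 could be added afterwards.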

