  • Subtract consecutive variables

    Hi Nick Cox and Clyde Schechter,

    I have a very simple question, but I do not know how to solve it. My problem is as follows: I have nearly 600 variables, named from _02Jan2020 to _31Dec2021 (all of them are integers). They are ordered, so I want to obtain multiple outcome variables, each defined as the difference between a column and the previous one. For instance, _03Jan2020_new must be (_03Jan2020 - _02Jan2020), _04Jan2020_new must be (_04Jan2020 - _03Jan2020), and so on. I tried to do this with the following code, but it did not work. Do you have any thoughts about how to solve it?

    Thanks in advance!

    Code:
        foreach x of varlist _02Jan2020-_31Dec2021 {
            local prev = `x'-1
            gen `x'_new = `x' - `prev'
        }

  • #2
    Your approach cannot work because your iterator x takes on values like _03Jan2020, which is the name of a variable. While, in your mind, this corresponds to a date and `x'-1 corresponds to the preceding date, to Stata this is not even remotely true. To Stata, `x'-1 is the expression _03Jan2020-1, so local prev = `x'-1 stores the value of _03Jan2020 in the first observation, minus 1. When you then create `x'_new, you subtract that single number from `x', not the preceding variable.
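
    A minimal sketch, not from the original post, of what the loop in #1 actually computes. The two-observation dataset is made up for illustration; only the variable-name pattern comes from the question.

    ```stata
    clear
    input _02Jan2020 _03Jan2020
    10 15
    20 27
    end

    foreach x of varlist _02Jan2020-_03Jan2020 {
        // x holds a variable name, so this evaluates (value in observation 1) - 1
        local prev = `x' - 1
        display "`x': prev = `prev'"
    }
    ```

    On this toy data the loop displays 9 and 14, that is, the first-observation values minus 1. Nothing in it refers to the neighbouring variable.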

    That said, in Stata, there is almost nothing useful that can be done without contortions in a data set having 600 variables named after dates. You will find your data is much easier to work with if you convert it to the long layout, where date is itself a variable (and is a numeric Stata internal format date variable). The following code implements this:
    Code:
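    rename (_02Jan2020-_31Dec2021) outcome=
    // observation identifier, stored in a type large enough for _N
    gen `c(obs_t)' obs_no = _n
    reshape long outcome, i(obs_no) j(_date) string
    // strip the leading underscore, then convert to a numeric daily date
    replace _date = subinstr(_date, "_", "", 1)
    gen date = daily(_date, "DMY"), after(_date)
    assert !missing(date)
    format date %td
    drop _date
    xtset obs_no date
    // first difference: each day minus the preceding calendar day
    gen wanted = D1.outcome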
    While I do not know what you plan to do next with your data, given how unwieldy the original data layout was, I can almost guarantee that it will be much easier with this data arrangement. So I recommend you do it this way.

    Note: As no example data was provided, this code is untested and may contain errors. The gist of it is correct; if you are unable to adapt it to your data set, do post back with example data, using the -dataex- command to do so, and I will try to troubleshoot.
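
    For anyone wanting to try the long-layout approach before adapting it, here is a small self-contained sketch on made-up data (three date variables, two observations). It is a variant rather than the code above: it assumes the date variables are the only ones whose names start with an underscore, so that a wildcard rename can move the underscore into the stub and no string cleaning is needed. The names datestr and wanted are illustrative.

    ```stata
    clear
    input _02Jan2020 _03Jan2020 _04Jan2020
    10 15 21
    5 5 9
    end

    gen `c(obs_t)' obs_no = _n
    // assumption: only the date variables start with an underscore
    rename _* outcome_*
    reshape long outcome_, i(obs_no) j(datestr) string
    gen date = daily(datestr, "DMY")
    assert !missing(date)
    format date %td
    drop datestr
    rename outcome_ outcome

    xtset obs_no date
    gen wanted = D.outcome

    // checks against the hand-computed differences for this toy data
    assert missing(wanted) if date == td(02jan2020)
    assert wanted == 5 if obs_no == 1 & date == td(03jan2020)
    assert wanted == 4 if obs_no == 2 & date == td(04jan2020)
    list, sepby(obs_no)
    ```

    Because D. works on calendar time, wanted is missing wherever the preceding calendar day is absent from the data, which is worth keeping in mind if the original variables skip dates.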

    As an aside, advice for the future: it is not in your interest to address your post to me, or Nick, or anybody else. There are many Forum members who can answer this question. One or more of them may have seen it before I got to it, and passed it by. You gained nothing, and you may have missed an earlier response. I think it is best to address a post to an individual person only if:
    1. It is a niche topic in which they, and few others, have shown an interest in their previous posts here.
    2. You are questioning or responding to something that they have previously posted, either in the current thread or others, and you wish to discuss, clarify, disagree, endorse, illustrate, challenge, elaborate upon, express mystification about...
    3. Or, you have received responses to a post from several people, and you wish to distinguish to which one you are now answering back.
    Last edited by Clyde Schechter; 11 Jun 2023, 17:52.



    • #3
      I completely agree with Clyde that reshaping the data to long form is likely the most helpful way to go for you, if you have subsequent operations that use the temporal aspects of the data.

      If your subsequent code does not need the date aspect however, here is a little bit of code that narrowly accomplishes what you wanted:

      Code:
      unab date_vars: _02Jan2020 - _31Dec2021
      local numvars: list sizeof date_vars
      
      forval i = 2/`numvars' {
          local x: word `i' of `date_vars'
          local prev: word `=`i'-1' of `date_vars'
          
          gen `x'_new = `x' - `prev'
      }
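
      A variant sketch, not from the original post, that stays closer to the loop in #1: it carries the previous variable's name along in a local macro instead of indexing a word list. The toy data are made up for illustration.

      ```stata
      clear
      input _02Jan2020 _03Jan2020 _04Jan2020
      10 15 21
      5 5 9
      end

      local prev
      foreach x of varlist _02Jan2020-_04Jan2020 {
          // skip the first variable, which has no predecessor
          if "`prev'" != "" {
              gen `x'_new = `x' - `prev'
          }
          local prev `x'
      }

      // each new variable equals its column minus the previous column
      assert _03Jan2020_new == _03Jan2020 - _02Jan2020
      assert _04Jan2020_new == _04Jan2020 - _03Jan2020
      // no difference is created for the first date
      capture confirm variable _02Jan2020_new
      assert _rc != 0
      ```

      The varlist is expanded once when the loop starts, so the _new variables created along the way are not picked up as loop items.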



      • #4
        It did work! Thanks Hemanshu Kumar!!!



        • #5
          I can only endorse Clyde Schechter's comments in #2 about pings.

          At its bluntest, we are not servants or staff to be summoned by a bell! And even if the intent is merely "You may be able to answer this" the ping still shows up in my inbox and makes me feel I should answer.

          More generally, even people very active here do not necessarily have the time, inclination or ability to answer something just because you asked us directly. So let people decide to answer what they want to answer.
          Last edited by Nick Cox; 13 Jun 2023, 01:41.
