
  • looping over numlist of time series operators

    Dear users,

    I have a panel of yearly life expectancy (e0) time series. I'm exploring alternative ways to categorize years affected by short-run deviations from trend, with an eye to controlling for them in cross-country analyses. Setting filters and interpolations aside, I decided first of all to see what happens with the primitive criterion of excluding those years altogether. My problem concerns their identification. I define "shock years" as the set of years of declining life expectancy plus the years of recovery, the latter defined as the x consecutive years following x consecutive years of decline. So, for example, if a country's time series displays 5 consecutive years of negative change in e0, I want to flag those years and the following 5 consecutive years as "crisis" and "recovery", respectively.

    I tried this:

    Code:
    gen e0diff = f.e0-e0
    
    gen crisis = 1 if e0diff<0
    
    gen rcvy = 1 if !mi(e0diff) & e0diff>0 & l.crisis==1
    
    forvalues n = 1/100 {
    replace rcvy = 1 if !mi(e0diff) & e0diff>0 & L(1/`n').rcvy==1 & ((2*`n')+1)L.crisis==1
    }
    Stata will not allow me to loop over the numlist of the time series operator because it understands L() as a function - one that indeed does not exist. I could avoid looping, given that the longest sequence of years of decline is not that long. However, I would like to understand my mistake properly and learn ways out - I'm new to time series operators and I might have missed something in the relevant TS guide entries.

    Many thanks in advance
    Last edited by Matteo Pinna Pintor; 03 Sep 2018, 10:41.
    I'm using StataNow/MP 18.5

  • #2
    There is no reason you cannot loop over a numlist of lags in Stata. But you are going about it the wrong way.

    Code:
    L(1/`n').rcvy==1
    is a problem not because of the L(1/`n').rcvy construction, but because that construction expands to a series of things: L1.rcvy, L2.rcvy, etc. and then you place it on the opposite side of the == operator from a single number. The == operator accepts only a single expression on each side. What you wrote there is as illegal syntactically as -var1 var2 var3 == 1- would be.
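
To see the distinction concretely, here is a minimal sketch on made-up data (the variable x and its values are assumptions for illustration only). A range of lags is accepted where Stata expects a varlist, but the same construction is rejected inside an expression:

```stata
* Toy yearly series; x and its values are invented for illustration
clear
input year x
2001 1
2002 1
2003 1
2004 0
2005 1
2006 1
end
tsset year

* Varlist context: L(1/2).x expands to L1.x L2.x, which is fine here
summarize L(1/2).x

* Expression context: the same construction is rejected
capture noisily generate z = L(1/2).x == 1
display "return code: " _rc
```

The nonzero return code confirms that the generate command failed, so z is never created.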

    That said, I don't actually grasp what you are trying to do here, so I'm not going to try to fix your code. I'll just say that if you want to create a variable that indicates whether or not all of the first `n' lags of a variable x are 1, it would be this:

    Code:
    gen nlags1 = L1.x == 1
    forvalues i = 2/`n' {
        replace nlags1 = nlags1 & (L`i'.x == 1)
    }
    The above code would presumably be nested inside some wider loop on the value of `n'.
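
As a rough check of how that loop behaves, here is a self-contained sketch on invented data, with `n' fixed at 2 (the series x, the years, and the choice of 2 lags are all assumptions for illustration):

```stata
* Toy yearly series; x and its values are invented for illustration
clear
input year x
2001 1
2002 1
2003 1
2004 0
2005 1
2006 1
end
tsset year

local n 2

* Start from the first lag, then AND in each further lag
gen nlags1 = L1.x == 1
forvalues i = 2/`n' {
    replace nlags1 = nlags1 & (L`i'.x == 1)
}

list
```

Only 2003 and 2004 end up with nlags1 equal to 1. In 2001 and 2002 at least one lag is missing, and a missing lag compared with 1 evaluates to false, so those years get 0 rather than missing.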

    Though not asked, I will also comment on
    Code:
    gen crisis = 1 if e0diff<0

    gen rcvy = 1 if !mi(e0diff) & e0diff>0 & l.crisis==1
    This is legal syntax, but it is bad programming practice, and it is just a matter of time before it gets you into trouble. It creates crisis and rcvy as 1/missing variables. Those don't work well with Stata's logical operators, which treat missing as true and are really designed to work with 1/0 variables. So you should replace those with:

    Code:
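    // 1 if e0diff is negative, 0 otherwise (a missing e0diff also yields 0)
    gen crisis = e0diff<0

    // 1 if the change is positive and the previous year was a crisis year, else 0
    gen rcvy = !mi(e0diff) & e0diff>0 & l.crisis==1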
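
A small sketch of the kind of trouble meant here, on invented data (the variable names bad and good are hypothetical). Because a missing value counts as true in a logical test, a 1/missing indicator selects every observation:

```stata
* Toy data; x and its values are invented for illustration
clear
input x
-1
2
-3
4
end

gen bad  = 1 if x < 0     // 1 / missing indicator
gen good = x < 0          // 1 / 0 indicator

* Missing is nonzero, so "if bad" is true for all four observations
count if bad
display "count if bad:  " r(N)

* The 1/0 version selects only the two negative values
count if good
display "count if good: " r(N)
```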

    • #3
      Originally posted by Clyde Schechter View Post
      that construction expands to a series of things: L1.rcvy, L2.rcvy, etc. and then you place it on the opposite side of the == operator
      A-ha! - indeed. No way to expand into a multiplication I guess (l1.rcvy*l2.rcvy*...) - that would evaluate to [0,1] wouldn't it.

      Thanks for this and for reminding me about the gen best practice. For some reason I learned the wrong way and always end up replacing missing values with zeroes instead of coding it right to begin with. I thought about a nested loop but convinced myself that it couldn't do what I want. In light of the above, it probably can. Thanks for this too; I will post again here if needed.

      • #4
        No way to expand into a multiplication I guess (l1.rcvy*l2.rcvy*...) - that would evaluate to [0,1] wouldn't it.
        Yes, a running product, or for that matter a running maximum, would work for the purpose at hand. But each of those would also require a nested loop. I can't think of any reason to prefer either of these to my approach, nor mine to either of those.
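
For comparison, a hedged sketch of both variants on invented data (the names allprod, anyfail, and allmax are hypothetical, and `n' is fixed at 2 for illustration). The product is taken over 0/1 comparisons rather than over the raw lags, since multiplying the raw lags would give missing wherever a lag is missing:

```stata
* Toy yearly series; x and its values are invented for illustration
clear
input year x
2001 1
2002 1
2003 1
2004 0
2005 1
2006 1
end
tsset year

local n 2

* Running product of 0/1 comparisons: stays 1 only if every lag equals 1
gen allprod = 1
* Running maximum of "lag is not 1": becomes 1 as soon as one lag fails
gen anyfail = 0
forvalues i = 1/`n' {
    replace allprod = allprod * (L`i'.x == 1)
    replace anyfail = max(anyfail, L`i'.x != 1)
}
gen allmax = !anyfail

list
```

On this toy series the two indicators agree in every year.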

        • #5
          Originally posted by Clyde Schechter View Post
          But each of those would also require a nested loop
          Well yeah, I was trying to think of a one-line solution.
          Code:
          Anymatch(L(1/`n')), value(1)
          would evaluate to one iff the lags are equal to one, right? So that could be required to ==1. Is the fact that ereplace does not allow time series operators the only problem here?
          Last edited by Matteo Pinna Pintor; 04 Sep 2018, 04:45.

          • #6
            I'd look at tsegen (SSC) and rangestat (SSC) to see if they help. For example with the latter you can count negatives in moving windows and so identify windows with all values negative.
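
The moving-window count can also be sketched with built-in commands only. This is not rangestat or tsegen (both are community-contributed and are not used below), just a loop over lags on an invented panel. The names id, d, nneg, and allneg and the window width of 2 are assumptions for illustration:

```stata
* Toy panel; id, year and d (yearly change) are invented for illustration
clear
input id year d
1 2001 -1
1 2002 -1
1 2003 -1
1 2004 2
2 2001 -1
2 2002 3
2 2003 -1
2 2004 -1
end
xtset id year

local w 2                 // window width in years (assumed)
local last = `w' - 1

* Count negative changes in the window ending at the current year
gen nneg = d < 0
forvalues i = 1/`last' {
    replace nneg = nneg + (L`i'.d < 0)
}

* Flag windows in which every value is negative
gen allneg = nneg == `w'

list, sepby(id)
```

Lags do not cross panel boundaries after xtset, so the first year of each panel can never complete a 2-year window. A missing change counts as non-negative in this sketch.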

            • #7
              Originally posted by Nick Cox View Post
              I'd look at tsegen (SSC) and rangestat (SSC) to see if they help. For example with the latter you can count negatives in moving windows and so identify windows with all values negative.
              Thanks. It is unfortunate that one cannot put expressions in the numlists of anymatch() and of time series operators, as that would allow looping over tsegen with conditions that are functions of `n'. In my case, the variable part of the identifying condition for the `n'th year of recovery (the invariant part being that e0diff is positive) is that lags 1 through `n'-1 are "recovery" and lag 2`n'-1 is "crisis". No way to say that much in a single line?
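
Probably not in a single line, but one possible route avoids lag ranges altogether by tracking spell lengths. The sketch below runs on an invented panel and rests on several assumptions: crisis is already a 1/0 variable, there are no gaps in the years, a new crisis interrupts a recovery, and recovery years are flagged regardless of the sign of the change (the extra requirement that e0diff be positive could be added to the last line). The names run, lastlen, and gap are hypothetical:

```stata
* Toy panel; id, year and crisis are invented for illustration
clear
input id year crisis
1 2001 0
1 2002 1
1 2003 1
1 2004 0
1 2005 0
1 2006 0
1 2007 1
1 2008 0
1 2009 0
1 2010 0
2 2001 1
2 2002 0
2 2003 0
2 2004 0
end

* run: number of consecutive crisis years up to and including this one
gen run = 0
bysort id (year): replace run = cond(_n == 1, 1, run[_n-1] + 1) if crisis

* lastlen: length of the most recent crisis spell (0 if none so far)
gen lastlen = run
bysort id (year): replace lastlen = lastlen[_n-1] if !crisis & _n > 1

* gap: years elapsed since the last crisis year
gen gap = 0
bysort id (year): replace gap = cond(_n == 1, 1, gap[_n-1] + 1) if !crisis

* recovery: a non-crisis year within lastlen years of the end of a spell
gen rcvy = !crisis & gap <= lastlen

list, sepby(id)
```

The replace commands rely on Stata updating observations in order, so each year sees the already-updated value of the year before it.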

              • #8
                Well yeah, I was trying to think of a one-line solution.
                Code:
                Anymatch(L(1/`n')), value(1)

                would evaluate to one iff the lags are equal to one, right? So that could be required to ==1. Is the fact that ereplace does not allow time series operators the only problem here?
                Well, actually, there is a semantics issue here. I would expect that a function named Anymatch(L(1/`n')), value(1) would return "true" if any of the preceding `n' lags had a value of 1, not just when all of them do. In fact, although it does not work with ranges of time-series operators (it requires a list of variables as its argument), the existing -egen, anymatch()- function does just that: it returns 1 if any of the variables in its varlist matches any of the values in the -values()- option.

                One could imagine an -allmatch()- function, which would be simple enough to program for a varlist. Its non-existence after all these years suggests that demand for such a function may be infrequent. Perusing the list of existing -egen- functions, most take single variables or varlists as an argument, some take a single expression, but I don't see any that take multiple expressions, and nothing taking a variable number of expressions (which is what L(1/`n') would require). I think that, in general, it is complicated to program functions with a variable number of arguments, and that is probably the obstacle. But the fact that L(1/`n').x is acceptable in the independent variables of an estimation command shows that something like this can be done.
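
One workaround, if creating the lags as ordinary variables is acceptable, is to hand those variables to -egen, anymatch()-. A minimal sketch on invented data (x, lag1, lag2, and any1 are hypothetical names, and 2 lags is an arbitrary choice):

```stata
* Toy yearly series; x and its values are invented for illustration
clear
input year x
2001 0
2002 0
2003 1
2004 0
2005 0
2006 0
end
tsset year

* Materialize the lags as ordinary variables so that egen accepts them
gen lag1 = L1.x
gen lag2 = L2.x

* 1 if either lag equals 1, 0 otherwise (missing lags never match)
egen any1 = anymatch(lag1 lag2), values(1)

list
```

Here any1 is 1 in 2004 and 2005, the two years that have the 2003 value among their first two lags.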

                • #9
                  Sorry Clyde, I rushed here hoping not to read that, as I have just realized I was not using anymatch() properly. I understand your general point. The reason I'm pursuing this is indeed that L(1/`n').x is legit, hence perhaps the expectation that something can be done with it : ). Will work on it.

                  • #10
                    (Also, logically an "allmatch" over a binary variable can be mimicked by imposing the reverse condition with anymatch(), so probably programming it would not produce great benefits).
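
A hedged sketch of that reversal on invented data, again with the lags created as ordinary variables (x, lag1, lag2, any0, and all1 are hypothetical names). One caveat: a missing lag matches neither 0 nor 1, so missing lags have to be excluded explicitly or the first years would wrongly pass:

```stata
* Toy yearly series; x is binary and its values are invented for illustration
clear
input year x
2001 1
2002 1
2003 1
2004 0
2005 1
2006 1
end
tsset year

gen lag1 = L1.x
gen lag2 = L2.x

* "Any lag equals 0" is the reverse condition for a binary variable
egen any0 = anymatch(lag1 lag2), values(0)

* "All lags equal 1" = no lag equals 0 and no lag is missing
gen all1 = !any0 & !missing(lag1, lag2)

list
```

Only 2003 and 2004 are flagged, since those are the only years whose first two lags are both 1.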