Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • how to write condition for calculating spells?

    Hi. I have a dataset with the revenues of firms through time, monthly. For example:

    Code:
    date    id       revenue
    2018m7 27712 3786.29
    2018m8 27712 4236.72
    2018m9 27712 4169.7
    2018m10 27712 4539.59
    2018m11 27712 5454.84
    2018m12 27712 6897.49
    2019m1 27712 6770.24
    2019m2 27712 7516.784
    2017m7 39566 30.38
    2017m6 44155 336.15
    2017m7 44155 623.25
    I would like to calculate how many months go (on average) before a firm "dissapears" from the records for at least three months (irrespective of whether it then appears again).
    This is because the firm I work for earns a commision on the revenue of those other firms, hence we want to calculate some indicator of the "duration" of that stream of income before it stops (if you have other ideas regarding that calculation, suggestions are very welcome).

    I have come across the user-written command tsspell, and it has proven to be useful, but I am having problems defining the spells with the condition of "at least three months". The command counts a spell as soon as there is a one-month gap between observations, for example:


    Code:
    tsset id fecha, monthly
    
    ssc install tsspell
    
    tsspell id
    and I get:

    Code:
    date       id       _spell   _seq   _end
    2018m3 1101931       1       1        0
    2018m4 1101931       1       2        0
    2018m5 1101931       1       3        1
    2018m7 1101931       2       1        1
    2018m10 1101931     3       1        1
    2019m2 1101931       4       1        1
    What I would like to see is:

    Code:
    date       id       _spell   _seq   _end
    2018m3 1101931       1       1        0
    2018m4 1101931       1       2        0
    2018m5 1101931       1       3        0
    2018m7 1101931       1       4        0
    2018m10 1101931     1       5        1
    2019m2 1101931       2       1        0
    Since it is only in 2019m2 for 1101931 that the three-month condition is met. I can then calculate the max for each _seq by firm, and calculate the average.

    I realize that the calculating of the max includes the spells when the firms "reappear", but that is fine by me (if you have suggestions for calculating the standard deviation of those spells, taking into account that some spells are from the same firms that disappear and then reappear, that would be a very nice extra...)

    Thank you very much.
    Last edited by Juan Gambetta; 21 May 2019, 14:36.

  • #2
    It's a little difficult to get to grips with this, as I don't understand what your variable fecha is or does. But my understanding is in essence that the default definition of spell is not what you want from tsspell. So, say what you do want.

    Here I've recast one of your data examples using dataex for reasons spelled out in the FAQ Advice: dates as just listed are harder work to input.

    One way to work is to set up your own time variable ignoring gaps but look for gaps of at least 3 months since the previous record.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id date)
    1101931 698
    1101931 699
    1101931 700
    1101931 702
    1101931 705
    1101931 709
    end
    format %tm date
    
    bysort id (date) : gen time = _n 
    tsset id time 
    tsspell, fcond(D.date > 3) 
    
    list, sepby(id _spell) 
    
         +-------------------------------------------------+
         |      id      date   time   _spell   _seq   _end |
         |-------------------------------------------------|
      1. | 1101931    2018m3      1        1      1      0 |
      2. | 1101931    2018m4      2        1      2      0 |
      3. | 1101931    2018m5      3        1      3      0 |
      4. | 1101931    2018m7      4        1      4      0 |
      5. | 1101931   2018m10      5        1      5      1 |
         |-------------------------------------------------|
      6. | 1101931    2019m2      6        2      1      1 |
         +-------------------------------------------------+
    Here's another way to do it, with a more orthodox tsset. When gaps are present, the lapse since the previous observation must be calculated with subscripts, not time series operators.
    . .
    Code:
     
    . drop _*
    
    . tsset id date
           panel variable:  id (strongly balanced)
            time variable:  date, 2018m3 to 2019m2, but with gaps
                    delta:  1 month
    
    . bysort id (date) : gen gap = date - date[_n-1]
    (1 missing value generated)
    
    . tsspell , fcond(gap > 3)
    warning: data contain gaps; see help on tsspell
    
    . list , sepby(id _spell)
    
         +-------------------------------------------------------+
         |      id      date   time   gap   _spell   _seq   _end |
         |-------------------------------------------------------|
      1. | 1101931    2018m3      1     .        1      1      0 |
      2. | 1101931    2018m4      2     1        1      2      0 |
      3. | 1101931    2018m5      3     1        1      3      0 |
      4. | 1101931    2018m7      4     2        1      4      0 |
      5. | 1101931   2018m10      5     3        1      5      1 |
         |-------------------------------------------------------|
      6. | 1101931    2019m2      6     4        2      1      1 |
         +-------------------------------------------------------+
    .

    tsspell warns of the gaps, but you know about them. Same answer in terms of spell definitions.

    Comment


    • #3
      Hi Nick. Sorry, I corrected the code in all places but in tsset. "fecha" means date in spanish. It is the monthly time variable. Thank you very much, you have solved my problem. I used your second suggestion.

      I will also take into account the dataex command.

      Thanks again,
      Juan

      Comment

      Working...
      X