  • Counting the number of firms with consecutive observations

    Dear Members,

    I am dealing with an unbalanced panel in which the firm identifier is called "permid" and the time variable is "wave". There are five waves.

    I would like to count how many firms ("permid") have at least two consecutive observations. That would let me identify how many firms have two, three, four or five consecutive observations.

    This is to understand the size of the panel and evaluate the possibility to perform a dynamic analysis.

    I was thinking of starting with this to see how many firms appear more than once:

    Code:
    bysort permid:  gen dup1 = cond(_N==1,0,_n)
    tab dup1
    keep if dup1>0
    but there might be something easier I should use.

    I would be very thankful for any suggestion.
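
    One built-in option that may be simpler for a first look is -xtdescribe-, which lists the participation patterns of a panel. The sketch below runs on made-up data (the values are hypothetical, not from the actual dataset) and assumes "permid" and "wave" are numeric with at most one observation per firm and wave:

    ```stata
    * Hypothetical toy data: three firms observed in some of waves 11 to 15
    clear
    input float(permid wave)
    1 11
    1 12
    1 13
    2 11
    2 13
    3 14
    3 15
    end

    * Declare the panel structure (fails if a firm appears twice in one wave)
    xtset permid wave
    * List participation patterns; five waves allow at most 31 distinct patterns
    xtdescribe, patterns(31)
    ```

    Each row of the pattern table shows which waves are present (1) or absent (.), together with the number of firms sharing that pattern, so consecutive runs can be read off directly.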


  • #2
    Marco, could you provide a sample of your data using dataex (SSC)?
    Knowing how your variable "wave" is encoded would help in writing suitable code.
    The code would differ depending on whether it holds dates or plain integers.

    Assuming the waves are simple integers from 1 to 5:
    Code:
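    * Flag observations whose wave immediately follows the firm's previous wave
    bysort permid (wave) : gen conseq=wave[_n]==wave[_n-1]+1
    * Also flag observations that are immediately followed by the next wave
    bysort permid (wave) : replace conseq=wave[_n]==wave[_n+1]-1 if conseq==0
    
    * Per firm, count the observations that have an adjacent wave
    bysort permid : egen nb_conseq=total(conseq)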
    See the example below:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(permid wave)
    1 1
    1 2
    1 3
    1 4
    1 5
    2 2
    2 3
    2 4
    3 1
    3 3
    3 4
    3 5
    4 1
    4 3
    4 5
    end
    
    bysort permid (wave) : gen conseq=wave[_n]==wave[_n-1]+1
    bysort permid (wave) : replace conseq=wave[_n]==wave[_n+1]-1 if conseq==0
    
    bysort permid : egen nb_conseq=total(conseq)
    tab conseq
    Hope this helps,
    Charlie



    • #3
      Dear Charlie,
      Thank you very much for your kind and prompt reply.

      I installed dataex and the result of the command is provided below:

      . dataex wave, count(10)

      ----------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte wave
      12
      13
      14
      15
      11
      12
      13
      15
      11
      15
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 10 out of 44513 observations


      The waves I am interested in run from 11 to 15.

      I tried the code you kindly provided at the bottom of your post.

      If I use the following commands:

      Code:
      bysort permid (wave) : gen conseq=wave[_n]==wave[_n-1]+1
      bysort permid (wave) : replace conseq=wave[_n]==wave[_n+1]-1 if conseq==0
      bysort permid : egen nb_conseq=total(conseq)
      tab conseq
      I get this result

      Code:
      . tab conseq
      
           conseq |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |     18,592       41.77       41.77
                1 |     25,921       58.23      100.00
      ------------+-----------------------------------
            Total |     44,513      100.00
      whereas if I run

      Code:
      tab nb_conseq
      I get this result

      Code:
      . tab nb_conseq
      
        nb_conseq |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |     16,471       37.00       37.00
                2 |     13,187       29.63       66.63
                3 |      6,443       14.47       81.10
                4 |      4,752       10.68       91.78
                5 |      3,660        8.22      100.00
      ------------+-----------------------------------
            Total |     44,513      100.00
      I believe these tables count observations rather than firms ("permid" is the unique firm identifier). The data editor shows 44,513 observations, and the number of firms should be much smaller.



      • #4
        So first you need to create a variable that picks out a single observation from each group of observations for a permid. That's what -egen, tag()- does.

        Code:
        egen flag = tag(permid)
        tab nb_conseq if flag
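
        To illustrate the difference between counting observations and counting firms, here is a small sketch on made-up data (the values and the variable name one_per_firm are hypothetical, not from the thread):

        ```stata
        * Hypothetical toy data: two firms, five observations in total
        clear
        input float(permid wave)
        1 11
        1 12
        2 11
        2 13
        2 15
        end

        * tag() marks exactly one observation per permid with 1, all others with 0
        egen byte one_per_firm = tag(permid)

        * Number of observations
        count
        * Number of firms: only the tagged observations are counted
        count if one_per_firm
        ```

        Any firm-level variable, such as nb_conseq, can then be tabulated with "if one_per_firm" so that each firm contributes once.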



        • #5
          Dear Clyde,

          Thank you very much for your kind help.

          I ran the code kindly provided by Charlie, after first tagging one observation per permid.

          Here is the code I used:

          Code:
          egen flag = tag(permid)
          bysort permid (wave) : gen conseq=wave[_n]==wave[_n-1]+1
          bysort permid (wave) : replace conseq=wave[_n]==wave[_n+1]-1 if conseq==0
          bysort permid : egen nb_conseq=total(conseq)
          tab nb_conseq if flag
          The results are as follows:

          Code:
            nb_conseq |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    0 |      7,797       44.62       44.62
                    2 |      5,768       33.01       77.62
                    3 |      1,991       11.39       89.01
                    4 |      1,188        6.80       95.81
                    5 |        732        4.19      100.00
          ------------+-----------------------------------
                Total |     17,476      100.00
          Thank you both for your valuable help. Many, many thanks.
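
          One caveat may be worth checking: nb_conseq, as built above, counts a firm's observations that have an adjacent wave, which is not always the length of its longest unbroken run. A firm observed in waves 11, 12, 14 and 15 gets nb_conseq = 4, although its longest run is 2. Below is a minimal sketch of one way to get the longest run per firm, assuming "wave" is an integer with at most one observation per firm and wave. The toy data and the names newspell, spell, runlen, maxrun and onefirm are hypothetical.

          ```stata
          * Hypothetical toy data: firm 1 has a gap at wave 13, firm 2 is complete,
          * firm 3 has no two adjacent waves
          clear
          input float(permid wave)
          1 11
          1 12
          1 14
          1 15
          2 11
          2 12
          2 13
          2 14
          2 15
          3 11
          3 13
          3 15
          end

          * A new spell starts at a firm's first observation or after any gap in wave
          bysort permid (wave): gen byte newspell = (_n == 1) | (wave != wave[_n-1] + 1)
          * The running sum of spell starts numbers the spells within each firm
          by permid: gen spell = sum(newspell)
          * Length of each spell, then the longest spell per firm
          bysort permid spell: gen runlen = _N
          bysort permid: egen maxrun = max(runlen)
          * Tabulate firms rather than observations
          egen byte onefirm = tag(permid)
          tab maxrun if onefirm
          ```

          Here maxrun is 2 for firm 1, 5 for firm 2 and 1 for firm 3. Firms with maxrun of at least 2 are those with at least two consecutive observations.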
