Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help needed for conditional counting

    Hi everybody
    It is my very first post on Statalist, so please forgive me for my elementary problems in building formulas.
    I'm working on a dataset that includes information about stock ownership for several firms over time.
    I have several identifiers for firms, owners, quarters, years and the percentage of ownership. I just need to count the number of quarters in which the ownership of a specific person in a specific firm is stable or increasing. The dataset is widely unbalanced so the period under examination changes for individuals and firms. I tried to concatenate "firms" and "owner" and I built a counter for the quarters to have the xtset identifiers, but then my trials with lagged values and so on were uneffective.
    Thanks in advance for any suggestion.

  • #2
    Welcome to Statalist.

    I'm afraid your description of your problem isn't enough to make either your problem or your data clear, at least to me. That's no surprise: the common languages on Statalist are Stata and Data, and it's more effective to see your data than to read a description of it in non-programming language.

    Perhaps someone else will understand better. If not, please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post, looking especially at sections 9-12 on how to best pose your question. It would be particularly helpful to post a small hand-made example, showing the data and what you expect your process to result in.

    In particular, please read FAQ #12 and use dataex to post your sample data to Statalist. That way members interested in helping can easily use your sample data to test and demonstrate solutions to your problem.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    Comment


    • #3
      Thanks for your reply, William... and sorry for the fault start...
      First of all, I'm using Stata 11.2 for Mac.
      I try to include a simplified sample. Columns are firm, owner, period and percentage of ownership respectively.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str2 firm str8 owner int period float perc
      "a" "Me"  2 2.5
      "a" "Me"  3 3.5
      "b" "Me"  1 1.5
      "b" "Me"  2 2.5
      "b" "Me"  3 2.5
      "c" "Me"  2 5.5
      "c" "Me"  3 4.5
      "b" "You" 1 4.5
      "b" "You" 2 2.5
      "b" "You" 3 3.5
      "c" "You" 2 5.5
      "c" "You" 3 6.5
      end
      I'm trying to obtain the number of quarters in which perc is not decreasing for each firm-owner combination across the time period under investigation.
      In the example, I expect to obtain 2 for firm "a" and owner "Me", 3 for firm "b" and owner "Me", 1 for "c"-"Me" (after the first quarter perc immediately decreases), 2 for "b"-"You" (this is the trickiest one, since perc decreases and then increases, leaving two consecutive periods of non-decreasing ownership) and 2 for "c"-"You"...
      Last edited by Simone Rossi; 12 Jan 2018, 13:51.

      Comment


      • #4
        Thank you for the excellent presentation of your example. Here is some sample code whose results agree with what you requested.
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str2 firm str8 owner int period float perc
        "a" "Me"  2 2.5
        "a" "Me"  3 3.5
        "b" "Me"  1 1.5
        "b" "Me"  2 2.5
        "b" "Me"  3 2.5
        "c" "Me"  2 5.5
        "c" "Me"  3 4.5
        "b" "You" 1 3.5
        "b" "You" 2 4.5
        "b" "You" 3 2.5
        "c" "You" 2 5.5
        "c" "You" 3 6.5
        end
        sort owner firm period
        // confirm there aren't any gaps in the period
        by owner firm (period): assert period==(period[_n-1]+1) if _n>1
        // up?
        by owner firm (period): generate up = perc>=perc[_n-1] | _n==1
        // count
        by owner firm (period): egen notdown = total(up)
        list, sepby(owner firm) noobs
        Code:
        . list, sepby(owner firm) noobs
        
          +---------------------------------------------+
          | firm   owner   period   perc   up   notdown |
          |---------------------------------------------|
          |    a      Me        2    2.5    1         2 |
          |    a      Me        3    3.5    1         2 |
          |---------------------------------------------|
          |    b      Me        1    1.5    1         3 |
          |    b      Me        2    2.5    1         3 |
          |    b      Me        3    2.5    1         3 |
          |---------------------------------------------|
          |    c      Me        2    5.5    1         1 |
          |    c      Me        3    4.5    0         1 |
          |---------------------------------------------|
          |    b     You        1    3.5    1         2 |
          |    b     You        2    4.5    1         2 |
          |    b     You        3    2.5    0         2 |
          |---------------------------------------------|
          |    c     You        2    5.5    1         2 |
          |    c     You        3    6.5    1         2 |
          +---------------------------------------------+
        I included code to check that there weren't any gaps in the period because I wasn't quite sure what to make of, say, perc==4 in period 2 and perc==2 in period 4 - did it go down to 1 and up to 2, or did it go down to 3 and then down to 2, or ???. So if a gap in the period is possible, you'll have to think about what you want. The idea to the assert command is that if the condition fails, the do-file will stop. I use that a lot to confirm my assumptions.
        Last edited by William Lisowski; 12 Jan 2018, 14:03.

        Comment


        • #5
          This is the typical case in which somebody can solve in a minute a problem that severely affects your night dreams... Thank you very much William (and all the others that provide a valuable service to the Stata-rookie population, that naturally includes me :-)

          Comment


          • #6
            I'm writing a separate post to call attention to the fact that I corrected the code in post #4 after initially posting it, because I remembered something that usually has to be pointed out to me: help egen tells us

            Explicit subscripting (using _N and _n), which is commonly used with generate, should not be used with egen.
            and my initial version of the code had generate combined into the egen , which gave the correct answer anyhow. Sigh.

            Comment


            • #7
              Very useful (and easy to undestand-replicate)... i tried it on my 760,143 observation dataset and it works perfectly. Thanks a lot also for introducing me to assert command; it is really useful in "checking" huge datasets like the one I'm using just now.

              Comment


              • #8
                I was thinking about missing data issue… since I want to expand my dataset, it is expected that some missing values will occur in my future panel.
                In that case, it would be useful to have the maximum value of consecutive non-decreasing ownership for each investor-firm combination. For example in the following panel:
                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str2 firm str8 owner int period float perc
                "a" "Me"  1 2.5
                "a" "Me"  2 3.5
                "a" "Me"  4 1.5
                "a" "Me"  6 2.5
                "a" "Me"  7 2.5
                "a" "Me"  8 5.5
                "b" "You" 1 4.5
                "b" "You" 2 4.5
                "b" "You" 3 5.5
                "b" "You" 5 3.5
                "b" "You" 7 5.5
                "b" "You" 8 6.5
                end
                i would like to receive 3 as a result for both a-Me and b-You combination.
                As a further check i tried to introduce some other variables: the first isolates data/missing data presence and the second counts consecutive available periods...

                * Building variable for checking missing values presence
                bysort firm owner (period): gen miss=period[_n]!=period[_n-1]+1
                bysort firm owner (period): replace miss=period[_n]!=period[_n+1]-1 if miss==1
                egen max_miss=max(miss), by(firm owner)
                bysort firm owner (period): egen tot_miss = total(miss)

                * Building variable for counting consecutive values
                bysort firm owner (period) : gen conseq=period[_n]==period[_n-1]+1
                bysort firm owner (period): replace conseq=period[_n]==period[_n+1]-1 if conseq==0

                The last step is finding the variable that traces the maximum value of non-decreasing consecutive ownership…
                Thank you again

                Comment


                • #9
                  I chose a slightly different approach that built off the following insight.
                  • In post #3 you want to count the number of periods of non-decreasing ownership within each owner/firm combination
                  • In post #8 you want to
                    • divide each firm/owner combination into spells of consecutive periods
                    • count the number of periods of non-decreasing ownership within each owner/firm/spell combination
                    • determine the maximum value of of that count for each firm/owner combination
                  So I built the code below off my previous code, but subdividing owner/firm combinations into spells.

                  Now, I may have misunderstood you. You wrote in post #8 that you wanted the "maximum value of consecutive non-decreasing ownership".

                  I have not taken this to mean you want the "maximum, across the spells of consecutive ownership for each owner/firm combination, of the number of non-decresing periods within each spell". Thus if periods 1-5 have up-down-up-up-up and there is no period 6, I count 4 from that spell, even through the up periods are not consecutive. That seems the logical extension of what you were looking for in post #3. To demonstrate this, I added a new owner/firm to your data.

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input str2 firm str8 owner int period float perc
                  "a" "Me"  1 2.5
                  "a" "Me"  2 3.5
                  "a" "Me"  4 1.5
                  "a" "Me"  6 2.5
                  "a" "Me"  7 2.5
                  "a" "Me"  8 5.5
                  "b" "You" 1 4.5
                  "b" "You" 2 4.5
                  "b" "You" 3 5.5
                  "b" "You" 5 3.5
                  "b" "You" 7 5.5
                  "b" "You" 8 6.5
                  "z" "Zem" 1 3.5
                  "z" "Zem" 2 2.5
                  "z" "Zem" 3 4.5
                  "z" "Zem" 4 5.5
                  "z" "Zem" 5 6.0
                  "z" "Zem" 7 4.2
                  end
                  // divide owner/firm into consecutive spells
                  bysort owner (firm period): generate newspell = period!=(period[_n-1]+1)
                  bysort owner: generate spell = sum(newspell)
                  // spell length
                  bysort owner firm spell (period): generate conseq = _N
                  // up?
                  bysort owner firm spell (period): generate up = perc>=perc[_n-1] | _n==1
                  // count
                  bysort owner firm spell (period): egen ups = total(up)
                  // max by owener/firm
                  bysort owner firm: egen notdown = max(ups)
                  list, sepby(owner firm spell) noobs
                  Code:
                  . list, sepby(owner firm spell) noobs
                  
                    +-------------------------------------------------------------------------------+
                    | firm   owner   period   perc   newspell   spell   conseq   up   ups   notdown |
                    |-------------------------------------------------------------------------------|
                    |    a      Me        1    2.5          1       1        2    1     2         3 |
                    |    a      Me        2    3.5          0       1        2    1     2         3 |
                    |-------------------------------------------------------------------------------|
                    |    a      Me        4    1.5          1       2        1    1     1         3 |
                    |-------------------------------------------------------------------------------|
                    |    a      Me        6    2.5          1       3        3    1     3         3 |
                    |    a      Me        7    2.5          0       3        3    1     3         3 |
                    |    a      Me        8    5.5          0       3        3    1     3         3 |
                    |-------------------------------------------------------------------------------|
                    |    b     You        1    4.5          1       1        3    1     3         3 |
                    |    b     You        2    4.5          0       1        3    1     3         3 |
                    |    b     You        3    5.5          0       1        3    1     3         3 |
                    |-------------------------------------------------------------------------------|
                    |    b     You        5    3.5          1       2        1    1     1         3 |
                    |-------------------------------------------------------------------------------|
                    |    b     You        7    5.5          1       3        2    1     2         3 |
                    |    b     You        8    6.5          0       3        2    1     2         3 |
                    |-------------------------------------------------------------------------------|
                    |    z     Zem        1    3.5          1       1        5    1     4         4 |
                    |    z     Zem        2    2.5          0       1        5    0     4         4 |
                    |    z     Zem        3    4.5          0       1        5    1     4         4 |
                    |    z     Zem        4    5.5          0       1        5    1     4         4 |
                    |    z     Zem        5      6          0       1        5    1     4         4 |
                    |-------------------------------------------------------------------------------|
                    |    z     Zem        7    4.2          1       2        1    1     1         4 |
                    +-------------------------------------------------------------------------------+

                  Comment


                  • #10
                    Thanks again William. In the computation is true the assumption that i want to find the "maximum, across the spells of consecutive ownership for each owner/firm combination, of the number of consecutive non-decresing periods within each spell". This mean that in z-zem i see 3 periods of that kind (from period 3 to 5). The reason is that i trying to implement a measure of ownership stability known in literature of banking that associate the concept of "stable ownership" to the attitude of an investors toward increasing or at least maintaining its share holdings...

                    Comment


                    • #11
                      If per post #10 z-Zem is 3 for periods 3-5 then it seems to me that in post #8 you should say b-You is 2 not 3. Why do you treat b-You periods 1-3 differently than z-Zem periods 3-5?

                      Comment


                      • #12
                        You are absolutely right!
                        Z-zem is 4; the issue is with 1-0 associated with "up" in period 1-2 (that, following the rules that we are discussing, should be inverted…). For z-zem I expect 0-1-1-1-1 for the 5 period in column Up. Probably it should be easier (and correct) following a different approach in which the first period is considered useless for the counting function: in effect when i have a equal or growing pattern of data, between two observation I have only 1 period of time. In this case I don't need to use the replace command that is confusing the results in this example. However i have the same issue in a long sequence of up-up-up-down-down-up-up-up where in a certain sense, each down interrupt the series of data like a missing value.
                        Do you think that it can work fine?
                        Code:
                        bysort owner firm (period): generate up = perc>=perc[_n-1]
                        bysort owner (firm period): generate newspell = period!=(period[_n-1]+1) | up==0
                        bysort owner: generate spell = sum(newspell)
                        bysort owner firm spell (period): generate conseq = _N
                        bysort owner firm spell (period): egen ups = total(up)
                        bysort owner firm: egen notdown = max(ups)
                        list, sepby(owner firm spell) noobs
                        Last edited by Simone Rossi; 15 Jan 2018, 13:47.

                        Comment


                        • #13
                          This code corrects at least one error in my previous code - I did a "by owner (firm period)" which should have been "by owner firm (period)".

                          The approach is the same as yours in post #12 - to start a spell whenever either there's a skipped period or a drop in ownership. But since each spell is nondecreasing, we don't have to count - just take 1 less than its length.

                          Code:
                          cls
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input str2 firm str8 owner int period float perc
                          "a" "Me"  1 2.5
                          "a" "Me"  2 3.5
                          "a" "Me"  4 1.5
                          "a" "Me"  6 2.5
                          "a" "Me"  7 2.5
                          "a" "Me"  8 5.5
                          "b" "You" 1 4.5
                          "b" "You" 2 4.5
                          "b" "You" 3 5.5
                          "b" "You" 5 3.5
                          "b" "You" 7 5.5
                          "b" "You" 8 6.5
                          "z" "Zem" 1 3.5
                          "z" "Zem" 2 2.5
                          "z" "Zem" 3 4.5
                          "z" "Zem" 4 5.5
                          "z" "Zem" 5 6.0
                          "z" "Zem" 7 4.2
                          end
                          // divide owner/firm into spells of consecutive periods of non-decreasing holdings
                          generate newspell = 0
                          // not consecutive - start a new spell
                          bysort owner firm (period): replace newspell = 1 if period!=(period[_n-1]+1)
                          // holdings decrease - start a new spell
                          bysort owner firm (period): replace newspell = 1 if perc<perc[_n-1]
                          // count spells by owner/firm
                          bysort owner firm (period): generate spell = sum(newspell)
                          // length of spell minus 1 gives number of period-to-period increases
                          bysort owner firm spell (period): generate ups = _N-1
                          // max by owner/firm
                          bysort owner firm: egen notdown = max(ups)
                          list, sepby(owner firm spell) noobs
                          Code:
                          . list, sepby(owner firm spell) noobs
                          
                            +-----------------------------------------------------------------+
                            | firm   owner   period   perc   newspell   spell   ups   notdown |
                            |-----------------------------------------------------------------|
                            |    a      Me        1    2.5          1       1     1         2 |
                            |    a      Me        2    3.5          0       1     1         2 |
                            |-----------------------------------------------------------------|
                            |    a      Me        4    1.5          1       2     0         2 |
                            |-----------------------------------------------------------------|
                            |    a      Me        6    2.5          1       3     2         2 |
                            |    a      Me        7    2.5          0       3     2         2 |
                            |    a      Me        8    5.5          0       3     2         2 |
                            |-----------------------------------------------------------------|
                            |    b     You        1    4.5          1       1     2         2 |
                            |    b     You        2    4.5          0       1     2         2 |
                            |    b     You        3    5.5          0       1     2         2 |
                            |-----------------------------------------------------------------|
                            |    b     You        5    3.5          1       2     0         2 |
                            |-----------------------------------------------------------------|
                            |    b     You        7    5.5          1       3     1         2 |
                            |    b     You        8    6.5          0       3     1         2 |
                            |-----------------------------------------------------------------|
                            |    z     Zem        1    3.5          1       1     0         3 |
                            |-----------------------------------------------------------------|
                            |    z     Zem        2    2.5          1       2     3         3 |
                            |    z     Zem        3    4.5          0       2     3         3 |
                            |    z     Zem        4    5.5          0       2     3         3 |
                            |    z     Zem        5      6          0       2     3         3 |
                            |-----------------------------------------------------------------|
                            |    z     Zem        7    4.2          1       3     0         3 |
                            +-----------------------------------------------------------------+

                          Comment


                          • #14
                            … and it solves also the problem of missing data and interrupted growth series… thanks again!

                            Comment

                            Working...
                            X