Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Counting string appearing in different rows

    Dear all,

    For example each row represents a unique licensing deal, I would like to know how to create a variable (i.e. stock_deal) that counts how many licensing deals a particular Licensee has with a particular Licensor in a 5-year time window. If you look at the table below, you see that
    • Licensee A has 2 unique licensing deals with licensor B in a 5-year time window (1999 & 2003 )
    • Licensee E has 2 unique licensing deals with Licensor C in a 5-year time window ( 1995 & 1997)
    Licensor Licensee year of licensing deal
    A B 1999
    C B 2000
    A B 2003
    C E 1995
    G E 1998
    C E 1997
    This is the result I want to become
    Licensor Licensee Year of licensing deal stock_deal
    A B 1999 2
    C B 2000 1
    A B 2003 2
    C E 1995 2
    G E 1998 1
    C E 1997 2
    I've tried "stock_deal = wordcount(Licensee !==" "), by Licensor" but this does not work and I don't find a syntax for a moving time window of 5 years. Does anyone know what I'm doing wrong?

    Thank you in advance.

    Yours sincerely,
    An

  • #2
    I'm not sure what you mean by a "five year window" here. Based on your example, it's not a forward window from the year in the observation, because then for A B 2003 you would not have 2, the only other deal being before 2003. Similarly it isn't a backward window from the year, because if it were, then A B 1999 would not be 2. So I'm going to guess you just mean how many deals between 5 years before and 5 years after (which, to my way of thinking as an 11-year centered window.)

    Anyway, when you hear yourself say the word "window" think of the -rangestat- command, written by Robert Picard, Roberto Ferrer, and Nick Cox. You can get it from SSC.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1(licensor licensee) int year
    "A" "B" 1999
    "C" "B" 2000
    "A" "B" 2003
    "C" "E" 1995
    "G" "E" 1998
    "C" "E" 1997
    end
    
    rangestat (count) stockdeal = year, interval(year -5 5) by(licensor licensee)
    While you're installing -rangestat- from SSC, also install -dataex-, and in the future, use it whenever you post example data. The kind of table you showed in your post can be difficult to import to Stata when it is necessary to test out some code on the data. Even when easily imported, as your was, the result is not necessarily faithful to what you have in terms of details like storage types, etc., that are sometimes quite important. Using -dataex- (-help dataex- for instructions) makes it possible for those who want to help you to create a completely faithful replica of your example in Stata using just a simple copy/paste operation.

    Comment


    • #3
      Dear Clyde,

      Thank you for your reply!

      What I mean with 5-year time window is, for instance, the number of licensing deals from 2000-2005 but you can also have a period ranging from 19995-2000 and so on. So within a time frame of 5 years I would like to count the number of licensing deals for a given licensee with a certain licensor.

      I hope I was clearly enough. Feel free to let me know if I'm still unclear.

      Comment


      • #4
        But that's not consistent with the example you showed. In the 1995-2000 window A and B only have one deal, and they also only have one deal between 2000 and 2005, but you show 2 for both entries involving them.

        Try my code in your data. Even though we're having some difficulty communicating the requirements in words, when I ran that command on your example, it produces exactly the results you said you wanted there.

        Comment


        • #5
          Dear Clyde,

          It works perfectly! Thank you so much and I'm so sorry for my vagueness!

          Have a nice day!!

          Comment

          Working...
          X