Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • The percentage of market overlapping between two firms over the past 5 years (reshape based on rolling windows)

    Dear all,


    I am trying to figure out how to calculate the proportion of market overlap between firm I and firm J over the past 5 years (excluding the current year). Ultimately, I'd like to use the resulting value as a weighting factor that gives more weight to a firm that is more similar to the focal firm.


    My original data looks like the below (please see the top one):

    no indicates the number of markets a focal firm I entered at year T (e.g., 2 means that the firm entered two different markets at T).
    market indicates the ID of each market. Max number of this ID is 15, implying that there are 15 types of markets.


    In order to perform one-to-one matching, I changed the data structure using 'joinby' command. There should be a more efficient, clever way to perform this task, but this is what I came up with (Please see the second dataset).

    Here are the codes I used.

    u t300.dta, clear // the first dateset
    ren * *2
    ren year2 year
    joinby year using "t300.dta"
    drop if firm==firm2
    ren no no1
    order firm firm2 year market market2 no1 no2
    sort firm firm2 year market market2

    bys firm firm2 year: egen intersection = count(market) if market==market2
    sort firm firm2 year intersection
    by firm firm2 year: replace intersection = intersection[_n-1] if missing(intersection)
    replace intersection=0 if missing(intersection)
    gen overlap = intersection / (no1+no2-intersection)


    As a result, I restructured the dataset like the one on the bottom of this page.

    firm2: firm J (other firms)
    market2: market ID of firm J
    no2: number of markets firm J entered at T
    intersection: number of overlapped markets between firm I and firm J at T
    overlap: % of overlapped markets between firm I and firm J at T


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(firm year no market)
     237 2005 1  3
     237 2006 1  3
     237 2008 1  6
     237 2009 1  3
     237 2010 1  6
     237 2011 1  1
     237 2012 1  1
     237 2013 1  6
     237 2014 2  3
     237 2014 2  6
    1694 2001 1  9
    1694 2002 3  5
    1694 2002 3  6
    1694 2002 3  9
    1694 2004 1  9
    1694 2005 2  5
    1694 2005 2  6
    1694 2006 1  9
    1694 2007 1  9
    1694 2010 1  9
    2263 2009 2  1
    2263 2009 2  9
    2263 2010 1  9
    2263 2011 2  9
    2263 2011 2 15
    2263 2012 2  9
    2263 2012 2 15
    2263 2013 1 15
    2263 2015 2  9
    2263 2015 2 15
    2814 2001 2  6
    2814 2001 2  9
    2814 2004 1  6
    2814 2005 2  6
    2814 2005 2  9
    2814 2008 1  9
    2814 2012 1  9
    2814 2013 1  7
    2814 2014 1  9
    2814 2015 1  9
    3741 2009 1  9
    3741 2010 1  9
    3741 2011 1  9
    3741 2012 1  9
    3741 2013 1  9
    3741 2014 3  6
    3741 2014 3  7
    3741 2014 3  9
    3741 2015 2  7
    3741 2015 2  9
    4027 2007 1  6
    4027 2008 1  5
    4027 2009 2  7
    4027 2009 2  9
    4027 2010 2  1
    4027 2010 2  9
    4027 2011 1  7
    4027 2012 1  9
    4027 2013 1  6
    4027 2014 1  7
    5425 2002 1  6
    5425 2003 1  9
    5425 2004 1  9
    5425 2006 1  9
    5425 2008 1  9
    5425 2009 1  6
    5425 2010 1  9
    5425 2012 1  9
    5425 2013 1  9
    5425 2015 1  6
    5651 2003 1  1
    5651 2004 1  1
    5651 2005 4  1
    5651 2005 4  5
    5651 2005 4  7
    5651 2005 4  9
    5651 2006 2  6
    5651 2006 2  9
    5651 2008 2  6
    5651 2008 2  7
    5885 2000 1  7
    5885 2001 1  9
    5885 2002 2  6
    5885 2002 2  9
    5885 2003 1  1
    5885 2006 1  3
    5885 2008 1  1
    5885 2010 1  6
    5885 2013 1  6
    5885 2014 1  9
    6594 2004 1  6
    6594 2007 1  1
    6594 2010 1  1
    6594 2011 2  1
    6594 2011 2  3
    6594 2013 2  1
    6594 2013 2  3
    6594 2014 2  6
    6594 2014 2 15
    6594 2015 1  6
    end



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(firm firm2 year market market2 no1 no2 intersection overlap)
     237 1694 2005 3  5 1 2 0        0
     237 1694 2005 3  6 1 2 0        0
     237 1694 2006 3  9 1 1 0        0
     237 1694 2010 6  9 1 1 0        0
     237 2263 2009 3  9 1 2 0        0
     237 2263 2009 3  1 1 2 0        0
     237 2263 2010 6  9 1 1 0        0
     237 2263 2011 1 15 1 2 0        0
     237 2263 2011 1  9 1 2 0        0
     237 2263 2012 1  9 1 2 0        0
     237 2263 2012 1 15 1 2 0        0
     237 2263 2013 6 15 1 1 0        0
     237 2814 2005 3  9 1 2 0        0
     237 2814 2005 3  6 1 2 0        0
     237 2814 2008 6  9 1 1 0        0
     237 2814 2012 1  9 1 1 0        0
     237 2814 2013 6  7 1 1 0        0
     237 2814 2014 3  9 2 1 0        0
     237 2814 2014 6  9 2 1 0        0
     237 3741 2009 3  9 1 1 0        0
     237 3741 2010 6  9 1 1 0        0
     237 3741 2011 1  9 1 1 0        0
     237 3741 2012 1  9 1 1 0        0
     237 3741 2013 6  9 1 1 0        0
     237 3741 2014 6  6 2 3 1      .25
     237 3741 2014 3  6 2 3 1      .25
     237 3741 2014 3  7 2 3 1      .25
     237 3741 2014 3  9 2 3 1      .25
     237 3741 2014 6  7 2 3 1      .25
     237 3741 2014 6  9 2 3 1      .25
     237 4027 2008 6  5 1 1 0        0
     237 4027 2009 3  9 1 2 0        0
     237 4027 2009 3  7 1 2 0        0
     237 4027 2010 6  9 1 2 0        0
     237 4027 2010 6  1 1 2 0        0
     237 4027 2011 1  7 1 1 0        0
     237 4027 2012 1  9 1 1 0        0
     237 4027 2013 6  6 1 1 1        1
     237 4027 2014 6  7 2 1 0        0
     237 4027 2014 3  7 2 1 0        0
     237 5425 2006 3  9 1 1 0        0
     237 5425 2008 6  9 1 1 0        0
     237 5425 2009 3  6 1 1 0        0
     237 5425 2010 6  9 1 1 0        0
     237 5425 2012 1  9 1 1 0        0
     237 5425 2013 6  9 1 1 0        0
     237 5651 2005 3  1 1 4 0        0
     237 5651 2005 3  7 1 4 0        0
     237 5651 2005 3  9 1 4 0        0
     237 5651 2005 3  5 1 4 0        0
     237 5651 2006 3  6 1 2 0        0
     237 5651 2006 3  9 1 2 0        0
     237 5651 2008 6  6 1 2 1       .5
     237 5651 2008 6  7 1 2 1       .5
     237 5885 2006 3  3 1 1 1        1
     237 5885 2008 6  1 1 1 0        0
     237 5885 2010 6  6 1 1 1        1
     237 5885 2013 6  6 1 1 1        1
     237 5885 2014 3  9 2 1 0        0
     237 5885 2014 6  9 2 1 0        0
     237 6594 2010 6  1 1 1 0        0
     237 6594 2011 1  1 1 2 1       .5
     237 6594 2011 1  3 1 2 1       .5
     237 6594 2013 6  1 1 2 0        0
     237 6594 2013 6  3 1 2 0        0
     237 6594 2014 6  6 2 2 1 .3333333
     237 6594 2014 3  6 2 2 1 .3333333
     237 6594 2014 6 15 2 2 1 .3333333
     237 6594 2014 3 15 2 2 1 .3333333
    1694  237 2005 5  3 2 1 0        0
    1694  237 2005 6  3 2 1 0        0
    1694  237 2006 9  3 1 1 0        0
    1694  237 2010 9  6 1 1 0        0
    1694 2263 2010 9  9 1 1 1        1
    1694 2814 2001 9  9 1 2 1       .5
    1694 2814 2001 9  6 1 2 1       .5
    1694 2814 2004 9  6 1 1 0        0
    1694 2814 2005 6  6 2 2 1 .3333333
    1694 2814 2005 5  9 2 2 1 .3333333
    1694 2814 2005 5  6 2 2 1 .3333333
    1694 2814 2005 6  9 2 2 1 .3333333
    1694 3741 2010 9  9 1 1 1        1
    1694 4027 2007 9  6 1 1 0        0
    1694 4027 2010 9  9 1 2 1       .5
    1694 4027 2010 9  1 1 2 1       .5
    1694 5425 2002 6  6 3 1 1 .3333333
    1694 5425 2002 5  6 3 1 1 .3333333
    1694 5425 2002 9  6 3 1 1 .3333333
    1694 5425 2004 9  9 1 1 1        1
    1694 5425 2006 9  9 1 1 1        1
    1694 5425 2010 9  9 1 1 1        1
    1694 5651 2004 9  1 1 1 0        0
    1694 5651 2005 5  5 2 4 1       .2
    1694 5651 2005 5  1 2 4 1       .2
    1694 5651 2005 5  9 2 4 1       .2
    1694 5651 2005 6  7 2 4 1       .2
    1694 5651 2005 6  5 2 4 1       .2
    1694 5651 2005 5  7 2 4 1       .2
    1694 5651 2005 6  9 2 4 1       .2
    1694 5651 2005 6  1 2 4 1       .2
    end

    This allowed me to calculate the % of market overlap between firm I and firm j at T, but I'd like to know how I can calculate the % of market overlap between firm I and firm J over the past 5 years. Since I have more than 100,000 observations, using joinby command to restructure the dataset like this takes tremendous time.

    I can still use joinby command as long as I can figure out how to reshape the dataset based on rolling windows.
    Would it be possible to reshape the dataset based on rolling windows? For example, firm 237 entered into market 3 in 2005, market 3 in 2006, and market 6 in 2008.

    firm year no market
    237 2005 1 3
    237 2006 1 3
    237 2008 1 6

    237 2009 1 3

    I would like to reshape this data something like this:

    firm year market1 market2 market3 (the variable market3 here does not necessarily mean market ID 3 and just indicates the third market the firm entered over the past 5 years)
    237 2009 3 6 .

    Since the firm entered market ID 3 and 6 over the past 5 years, I want to have a wide form of dataset showing 'which markets' the firm entered over the past 5 years.
    I would greatly appreciate it if you give me any suggestions and advice. I definitely think that there is a more efficient way to perform this task.


    I am desperately looking forward to your response.
    Thank you in advance for your help.



    Best regards,


    Anna

  • #2
    Because you are looking at all pairwise combinations of firms, I cannot think of a more efficient approach not using joinby. Efficiency concerns aside, you can create dummies for the rolling windows and place each window into a distinct frame. Thereafter, contract firm and market within each window and then calculate your overlap variable. Finally, link the frames with your dataset (each window represents a year) and get the overlap variables. Because you are implementing the same code within each frame, you can write a loop for this. Start with

    Code:
    qui sum year
    forval i= `r(min)'/`r(max)'{
       gen window`i'=inrange(year, `i'-5, `i'-1)
    }

    EDIT: If you create a group variable identifying windows, only 1 frame is needed. You still contract firm and market within group and then implement your code where you use group in place of year in your original code. This can be achieved by appending the windows and numbering them.
    Last edited by Andrew Musau; 24 May 2020, 02:47.

    Comment


    • #3
      Dear Andrew,


      Thank you so much for your insightful advice!

      Comment

      Working...
      X