Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reconstructing data to identify firms patent portfolios

    Hello,

    Before I proceed, I will mention that I am using Stata version 16.1.

    I am working with patent data. As shown in my supplied code (below), I have the following variables:

    firm_id - unique firm id
    patent_id - unique id for each patent
    seller_name - sellers name
    seller_id - seller firms unique id
    trans_year - year of patent transaction
    pat_transacted - dummy variable that equals 1 if that patent was purchased from a seller and 0 if the patent was produced in-house.
    appln_year - year patent was applied for


    The data spans from the year 2000-2018. My goal is to reconstruct the data so that each unique firm (firm_id) has a corresponding year variable from 2000-2018 which identifies how many patents are within that firms portfolio in each year from 2000-2018.

    However, there are certain stipulations. If a patent was produced in-house (i.e. pat_transacted=0) then the patent entered the firms portfolio in the given appln_year. However, if the patent was purchased (i.e. pat_transacted=1) then the patent entered the firms portfolio in the given trans_year, not the given appln_year.

    For example,

    The firm with id= A80470 has 4 patents in total. 3 of these patents became part of the firms portfolio in 2007 (i.e patent id A2466Y, A9719Y and A5511Y)
    and 1 patent was purchased and entered the firms portfolio in 2015 (i.e. patent id A8007Y).

    Accordingly, my goal would be to have a corresponding year variable for each unique firm id for years 2000-2018. Hence, using the previous example for firm with id= A80470. The corresponding year variable would = 0 for years 2000-2006, 3 for years 2007-2014 and 4 for 2015-2018.


    I have tried using the following code

    Code:
    forvalues x = 2000/2018{
    gen y`x' = (appln_year <= `x') 
    }
    and then reshaping the data from wide to long. It would appear that this approach is on the right tracks. However, it does not take into account the stipulation regarding trans_year and it also fails to additionally carry forward several patents into the following year.


    In brief summary, my goal is to have a corresponding year variable from 2000-2018 for each unique firm that identifies the number of patents within a firms portfolio accounting for the two stipulations previously mentioned (i.e. a patent with pat_transacted=0 enters a portfolio in a given appln_year while a patent with pat_transacted=1 enters a portfolio within a given trans_year).

    I appreciate any insight or advice you may have to resolve this issue.

    Thanks for your time.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6(firm_id patent_id) str16 seller_name str4 seller_bvdid int trans_year byte pat_transacted int appln_year
    "A62271" "A6862A" ""                 ""        . 0 2008
    "A62271" "A3880A" ""                 ""        . 0 2008
    "A15496" "A9134Y" ""                 ""        . 0 2000
    "A00077" "A1923Y" ""                 ""        . 0 2005
    "A00077" "A2120Y" ""                 ""        . 0 2006
    "A00366" "A9173Y" ""                 ""        . 0 2003
    "A05326" "A3691Y" ""                 ""        . 0 2002
    "A06980" "A2419A" ""                 ""        . 0 2011
    "A09698" "A4999C" ""                 ""        . 0 2005
    "A10862" "A5925Y" ""                 ""        . 0 2002
    "A10862" "A7848Y" ""                 ""        . 0 2001
    "A10862" "A5950Y" ""                 ""        . 0 2002
    "A10862" "A0610Y" ""                 ""        . 0 2001
    "A11584" "A4661Y" ""                 ""        . 0 2003
    "A11584" "A8262Y" ""                 ""        . 0 2003
    "A13270" "A2588Y" ""                 ""        . 0 2006
    "A15342" "A3600Y" ""                 ""        . 0 2006
    "A23041" "A6835Y" ""                 ""        . 0 2002
    "A23041" "A8213Y" ""                 ""        . 0 2000
    "A40753" "A8674Y" ""                 ""        . 0 2004
    "A42975" "A0503A" ""                 ""        . 0 2017
    "A43209" "A2069Y" ""                 ""        . 0 2006
    "A46788" "A8851A" ""                 ""        . 0 2015
    "A46788" "A9393A" ""                 ""        . 0 2017
    "A46788" "A2378A" ""                 ""        . 0 2015
    "A47224" "A9217T" ""                 ""        . 0 2003
    "A47224" "A287A1" "Tool Company"     "A818" 2009 1 2005
    "A47224" "A287B1" "Tool Company"     ""     2009 1 2005
    "A47224" "A081B1" "Tool Company"     "A818" 2009 1 2004
    "A47224" "A081A1" "Tool Company"     "A818" 2009 1 2004
    "A47224" "A292A1" "Tool Company"     "A818" 2009 1 2003
    "A47224" "A766B1" "Tool Company"     ""     2009 1 2003
    "A47224" "A766A1" "Tool Company"     "A818" 2009 1 2003
    "A57348" "A4732Y" ""                 ""        . 0 2005
    "A58005" "A8268Y" ""                 ""        . 0 2006
    "A67226" "A3975A" ""                 ""        . 0 2009
    "A68081" "A6427A" ""                 ""        . 0 2004
    "A68081" "A9893A" ""                 ""        . 0 2003
    "A71234" "A9992B" ""                 ""        . 0 2010
    "A71234" "A9992A" ""                 ""        . 0 2010
    "A71973" "A6002A" ""                 ""        . 0 2000
    "A71973" "A1277Y" ""                 ""        . 0 2002
    "A71973" "A5035Y" ""                 ""        . 0 2000
    "A72446" "A0998A" "Investment corp"  "A388" 2008 1 2007
    "A72446" "A2833A" "Investment corp"  "A388" 2008 1 2007
    "A72446" "A2839C" "Investment corp"  "A5PC" 2008 1 2007
    "A72446" "A1872C" "Investment corp"  "A371" 2008 1 2006
    "A72446" "A6334A" "Investment corp"  "A371" 2008 1 2006
    "A72446" "A0054B" "Investment corp"  "A395" 2008 1 2006
    "A72446" "A7416A" "Investment corp"  "A5PC" 2008 1 2007
    "A72446" "A0998B" "Investment corp"  "A388" 2008 1 2007
    "A72446" "A2832A" "Investment corp"  "A388" 2008 1 2007
    "A72446" "A0054A" "Investment corp"  ""     2008 1 2006
    "A80439" "A6272C" ""                 ""        . 0 2004
    "A80470" "A9719Y" ""                 ""        . 0 2007
    "A80470" "A8007Y" "John Smith"       ""     2015 1 2009
    "A80470" "A5511Y" ""                 ""        . 0 2007
    "A80470" "A2466Y" ""                 ""        . 0 2007
    "A80872" "A6535A" ""                 ""        . 0 2017
    "A83620" "A4430Y" ""                 ""        . 0 2007
    "A84637" "A4736Y" ""                 ""        . 0 2001
    "A88984" "A3525A" ""                 ""        . 0 2003
    "A03537" "A4511U" ""                 ""        . 0 2015
    "A03537" "A1572U" ""                 ""        . 0 2014
    "A04324" "A5008Y" ""                 ""        . 0 2006
    "A04324" "A2552Y" ""                 ""        . 0 2008
    "A05154" "A6500Y" ""                 ""        . 0 2001
    "A12538" "A2835Y" ""                 ""        . 0 2005
    "A20306" "A5254A" ""                 ""        . 0 2012
    "A20306" "A5254B" ""                 ""        . 0 2012
    "A22056" "A0856A" ""                 ""        . 0 2007
    "A22056" "A1147A" ""                 ""        . 0 2007
    "A22908" "A5703Y" ""                 ""        . 0 2007
    "A22908" "A7055Y" ""                 ""        . 0 2008
    "A22908" "A7054Y" ""                 ""        . 0 2008
    "A23740" "A2262U" ""                 ""        . 0 2011
    "A37631" "A0023A" "Electric company" "A096" 2014 1 2012
    "A37631" "A6633A" "Electric company" "A096" 2012 1 2011
    "A37631" "A6623A" "Electric company" "A096" 2014 1 2013
    "A37631" "A9452B" "Electric company" "A096" 2015 1 2012
    "A37631" "A2640U" "Electric company" "A096" 2015 1 2013
    "A37631" "A7589U" "Electric company" "A096" 2015 1 2012
    "A37631" "A0452C" "Electric company" "A678" 2012 1 2006
    "A37631" "A1958A" "Electric company" "A096" 2015 1 2012
    "A37631" "A8010C" "Electric company" "A678" 2012 1 2005
    "A37631" "A8120A" "Electric company" "A678" 2012 1 2006
    "A37631" "A6623B" "Electric company" ""     2014 1 2013
    "A37631" "A9452A" "Electric company" "A096" 2015 1 2012
    "A37631" "A1958B" "Electric company" "A096" 2015 1 2012
    "A37631" "A0023B" "Electric company" ""     2014 1 2012
    "A37631" "A3661A" "Electric company" "A678" 2012 1 2005
    "A37631" "A7227A" "Electric company" "A096" 2012 1 2011
    end

  • #2
    Code:
    isid firm_id patent_id, sort
    gen acquisition_year = cond(pat_transacted, trans_year, appln_year)
    gen byte patents_held = 1
    collapse (sum) patents_held , by(firm_id acquisition_year)
    fillin firm_id acquisition_year
    by firm_id (acquisition_year), sort: replace patents_held = sum(patents_held)
    drop _fillin
    Note: The resulting data set does not contain any observations for year 2018 because the original example data doesn't either. If the real data set contains some 2018 observations, then there will be 2018 observations (for all firms) in the resulting data set. If not, the 2018 observations, were they to be created, would just be repetitions of the 2017 ones, so it seems rather pointless, but if there is a point to it that I'm missing, you can just -expand- the 2017 observations by a factor of 2 and change the year to 2018 in the duplicated one.

    Comment


    • #3
      Thank you very much Clyde, The code works perfectly and my issue has now been resolved.
      With regards to your note, you are correct. The example data I provided did not contain any observations for the year 2018. However, as you said, when I ran this code through the real data set which contains 2018 observations, then there was 2018 observations included in the resulting data set.

      Greatly appreciated.

      Comment


      • #4
        Hello Clyde,

        Just as a further development upon this query. I was wondering if you would know how to take into account the expiration year of a patent. As indicated in the sample code below... if we introduce the year a patent expires.. would you know how to reflect this in the "patents held" variable.

        As a specific example, the firm with id A62271 would have 0 patents for years 2000-2007.. and then 2 patents held for 2008-2015 but then these 2 patents expire in 2015 and therefore this firm would have 0 patents held in 2016 and 2017.



        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str6(firm_id patent_id) str16 seller_name str4 seller_bvdid int trans_year byte pat_transacted int appln_year float expiry_year
        "A62271" "A6862A" ""                 ""        . 0 2008 2015
        "A62271" "A3880A" ""                 ""        . 0 2008 2015
        "A15496" "A9134Y" ""                 ""        . 0 2000 2007
        "A00077" "A1923Y" ""                 ""        . 0 2005 2012
        "A00077" "A2120Y" ""                 ""        . 0 2006 2013
        "A00366" "A9173Y" ""                 ""        . 0 2003 2010
        "A05326" "A3691Y" ""                 ""        . 0 2002 2009
        "A06980" "A2419A" ""                 ""        . 0 2011 2018
        "A09698" "A4999C" ""                 ""        . 0 2005 2012
        "A10862" "A5925Y" ""                 ""        . 0 2002 2009
        "A10862" "A7848Y" ""                 ""        . 0 2001 2008
        "A10862" "A5950Y" ""                 ""        . 0 2002 2009
        "A10862" "A0610Y" ""                 ""        . 0 2001 2008
        "A11584" "A4661Y" ""                 ""        . 0 2003 2010
        "A11584" "A8262Y" ""                 ""        . 0 2003 2010
        "A13270" "A2588Y" ""                 ""        . 0 2006 2013
        "A15342" "A3600Y" ""                 ""        . 0 2006 2013
        "A23041" "A6835Y" ""                 ""        . 0 2002 2009
        "A23041" "A8213Y" ""                 ""        . 0 2000 2007
        "A40753" "A8674Y" ""                 ""        . 0 2004 2011
        "A42975" "A0503A" ""                 ""        . 0 2017 2024
        "A43209" "A2069Y" ""                 ""        . 0 2006 2013
        "A46788" "A8851A" ""                 ""        . 0 2015 2022
        "A46788" "A9393A" ""                 ""        . 0 2017 2024
        "A46788" "A2378A" ""                 ""        . 0 2015 2022
        "A47224" "A9217T" ""                 ""        . 0 2003 2010
        "A47224" "A287A1" "Tool Company"     "A818" 2009 1 2005 2012
        "A47224" "A287B1" "Tool Company"     ""     2009 1 2005 2012
        "A47224" "A081B1" "Tool Company"     "A818" 2009 1 2004 2011
        "A47224" "A081A1" "Tool Company"     "A818" 2009 1 2004 2011
        "A47224" "A292A1" "Tool Company"     "A818" 2009 1 2003 2010
        "A47224" "A766B1" "Tool Company"     ""     2009 1 2003 2010
        "A47224" "A766A1" "Tool Company"     "A818" 2009 1 2003 2010
        "A57348" "A4732Y" ""                 ""        . 0 2005 2012
        "A58005" "A8268Y" ""                 ""        . 0 2006 2013
        "A67226" "A3975A" ""                 ""        . 0 2009 2016
        "A68081" "A6427A" ""                 ""        . 0 2004 2011
        "A68081" "A9893A" ""                 ""        . 0 2003 2010
        "A71234" "A9992B" ""                 ""        . 0 2010 2017
        "A71234" "A9992A" ""                 ""        . 0 2010 2017
        "A71973" "A6002A" ""                 ""        . 0 2000 2007
        "A71973" "A1277Y" ""                 ""        . 0 2002 2009
        "A71973" "A5035Y" ""                 ""        . 0 2000 2007
        "A72446" "A0998A" "Investment corp"  "A388" 2008 1 2007 2014
        "A72446" "A2833A" "Investment corp"  "A388" 2008 1 2007 2014
        "A72446" "A2839C" "Investment corp"  "A5PC" 2008 1 2007 2014
        "A72446" "A1872C" "Investment corp"  "A371" 2008 1 2006 2013
        "A72446" "A6334A" "Investment corp"  "A371" 2008 1 2006 2013
        "A72446" "A0054B" "Investment corp"  "A395" 2008 1 2006 2013
        "A72446" "A7416A" "Investment corp"  "A5PC" 2008 1 2007 2014
        "A72446" "A0998B" "Investment corp"  "A388" 2008 1 2007 2014
        "A72446" "A2832A" "Investment corp"  "A388" 2008 1 2007 2014
        "A72446" "A0054A" "Investment corp"  ""     2008 1 2006 2013
        "A80439" "A6272C" ""                 ""        . 0 2004 2011
        "A80470" "A9719Y" ""                 ""        . 0 2007 2014
        "A80470" "A8007Y" "John Smith"       ""     2015 1 2009 2016
        "A80470" "A5511Y" ""                 ""        . 0 2007 2014
        "A80470" "A2466Y" ""                 ""        . 0 2007 2014
        "A80872" "A6535A" ""                 ""        . 0 2017 2024
        "A83620" "A4430Y" ""                 ""        . 0 2007 2014
        "A84637" "A4736Y" ""                 ""        . 0 2001 2008
        "A88984" "A3525A" ""                 ""        . 0 2003 2010
        "A03537" "A4511U" ""                 ""        . 0 2015 2022
        "A03537" "A1572U" ""                 ""        . 0 2014 2021
        "A04324" "A5008Y" ""                 ""        . 0 2006 2013
        "A04324" "A2552Y" ""                 ""        . 0 2008 2015
        "A05154" "A6500Y" ""                 ""        . 0 2001 2008
        "A12538" "A2835Y" ""                 ""        . 0 2005 2012
        "A20306" "A5254A" ""                 ""        . 0 2012 2019
        "A20306" "A5254B" ""                 ""        . 0 2012 2019
        "A22056" "A0856A" ""                 ""        . 0 2007 2014
        "A22056" "A1147A" ""                 ""        . 0 2007 2014
        "A22908" "A5703Y" ""                 ""        . 0 2007 2014
        "A22908" "A7055Y" ""                 ""        . 0 2008 2015
        "A22908" "A7054Y" ""                 ""        . 0 2008 2015
        "A23740" "A2262U" ""                 ""        . 0 2011 2018
        "A37631" "A0023A" "Electric company" "A096" 2014 1 2012 2019
        "A37631" "A6633A" "Electric company" "A096" 2012 1 2011 2018
        "A37631" "A6623A" "Electric company" "A096" 2014 1 2013 2020
        "A37631" "A9452B" "Electric company" "A096" 2015 1 2012 2019
        "A37631" "A2640U" "Electric company" "A096" 2015 1 2013 2020
        "A37631" "A7589U" "Electric company" "A096" 2015 1 2012 2019
        "A37631" "A0452C" "Electric company" "A678" 2012 1 2006 2013
        "A37631" "A1958A" "Electric company" "A096" 2015 1 2012 2019
        "A37631" "A8010C" "Electric company" "A678" 2012 1 2005 2012
        "A37631" "A8120A" "Electric company" "A678" 2012 1 2006 2013
        "A37631" "A6623B" "Electric company" ""     2014 1 2013 2020
        "A37631" "A9452A" "Electric company" "A096" 2015 1 2012 2019
        "A37631" "A1958B" "Electric company" "A096" 2015 1 2012 2019
        "A37631" "A0023B" "Electric company" ""     2014 1 2012 2019
        "A37631" "A3661A" "Electric company" "A678" 2012 1 2005 2012
        "A37631" "A7227A" "Electric company" "A096" 2012 1 2011 2018
        end

        Comment


        • #5
          From a Stata code perspective, that's actually rather a different matter.

          Code:
          isid firm_id patent_id, sort
          gen acquisition_year = cond(pat_transacted, trans_year, appln_year)
          replace expiry_year = expiry_year + 1
          keep firm_id patent_id acquisition_year expiry_year
          reshape long @year, i(firm_id patent_id) j(event) string
          fillin firm_id year
          drop _fillin
          by firm_id (year event), sort: gen patents_hold ///
              = sum((event == "acquisition_") - (event == "expiry_"))
          by firm_id year (event), sort: keep if _n == _N
          drop event

          Comment

          Working...
          X