
  • #16
    Dear Clyde Schechter, no worries, I totally understand that there are too many conditions behind the calculation. May I ask how you were planning to count obs 1 and 2 as one patent in the table (without double counting)? Maybe that would be a good starting point for me to see whether I can accommodate all these conditions for citing and cited firms.
    Thanks anyway for your patience and time!



    • #17
      This is what I was thinking about:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long citing_appln_id int cited_appln_id long(citing_firm_id1 cited_firm_id1) int citing_year byte ccc
      332431744 1226  697362 140411 2012 19
      332431744 1226 1904196 140411 2012 20
      338133576 1226    3657 140411 2012 21
      338133577 1226    3657 140411 2012 22
      341003673 1226 1325123 140411 2012 23
      end
      format cit* %20.0f
      
      
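      capture program drop one_cited_appln_id_year
      program define one_cited_appln_id_year
          * -runby- calls this once per obs_no, so only one observation is in memory here
          * locals defined outside the program are not visible here, so the
          * path of the comparison file is carried in the variable source
          local comparators = source[1]
          * pair this observation with every citation of the same cited application in the same year
          joinby cited_appln_id citing_year using `comparators', unmatched(master)
          list obs_no cit* obs_no_U, noobs clean abbrev(12)
          by obs_no citing_appln_id_U, sort: keep if _n == 1 // REMOVE DUPLICATE CITING APPLNS
          list obs_no cit* obs_no_U, noobs clean abbrev(12)
          by obs_no citing_firm_id1_U, sort: keep if _n == 1 // REMOVE DUPLICATED CITING FIRMS
          list obs_no cit* obs_no_U, noobs clean abbrev(12)
          by obs_no: egen wanted = total(obs_no_U != obs_no) // COUNT EVERYTHING BUT SELF_MATCH
          * reduce back to one row per original observation, keeping the count in wanted
          drop *_U
          by obs_no: keep if _n == 1
          drop obs_no _merge
          exit
      end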
      
      gen long obs_no = _n
      preserve
      keep obs_no cited_appln_id citing_year citing_appln_id citing_firm_id1
      sort cited_appln_id citing_year
      rename (citing_appln_id citing_firm_id1 obs_no) =_U
      tempfile comparators
      save `comparators'
      restore
      
      gen source = "`comparators'"
      runby one_cited_appln_id_year, by(obs_no) verbose
      drop source
      list, noobs clean abbrev(20)
      Note: -runby- is written by Robert Picard and me, and is available from SSC. It is basically like being able to put a -by- prefix around a block of commands instead of looping over levels of a variable. Looping over levels of a variable is very time consuming because of all the -if-s that have to be evaluated. -runby- eliminates the -if-s because it replaces the loop with a different iterative structure wherein only one -by- group at a time is in active memory.
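
      To make the contrast concrete, here is a minimal sketch of the same per-group calculation done both ways. It is not part of the solution above: the auto dataset, the variable names mean_price_loop and mean_price_runby, and the program name mean_by_group are all assumptions chosen for illustration, and -runby- must already be installed.
      Code:
      * Illustration only; assumes -runby- is installed (ssc install runby)
      sysuse auto, clear

      * Loop version: every pass evaluates the -if- condition over the whole dataset
      gen double mean_price_loop = .
      levelsof foreign, local(levels)
      foreach f of local levels {
          summarize price if foreign == `f', meanonly
          replace mean_price_loop = r(mean) if foreign == `f'
      }

      * -runby- version: the program sees one by-group at a time, so no -if- is needed
      capture program drop mean_by_group
      program define mean_by_group
          summarize price, meanonly
          gen double mean_price_runby = r(mean)
      end
      runby mean_by_group, by(foreign)

      * The two variables should agree within each group
      list foreign mean_price_loop mean_price_runby in 1/5, noobs clean
      The loop rescans all observations once per level of foreign, whereas the program called by -runby- only ever works on the observations of a single group, and whatever it leaves in memory is collected as the result.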

      I'll be glad if this at least points you in a productive direction. If you do manage to get something working on a test subset of your data, when it comes time for production runs, remove the -verbose- option from the -runby- command and replace it with -status-. That way you won't get all the intermediate output on the screen, and you will get a periodic update of how far along the process has gone and an estimate of the time remaining to completion. Good luck!
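
      For the production run, the final calls from the code above would look roughly like this. It is only a sketch: it assumes the program one_cited_appln_id_year is already defined and the tempfile `comparators' has been saved exactly as in the code above.
      Code:
      * Same calls as above, with -verbose- swapped for -status-
      gen source = "`comparators'"
      runby one_cited_appln_id_year, by(obs_no) status
      drop source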
      Last edited by Clyde Schechter; 22 Dec 2022, 20:21.



      • #18
        Dear Clyde Schechter, thanks a lot! You are very kind. In the end, I think it was easier for me to have a single line for each patent, but I agree that such a structure complicates the analysis. I will take this as the starting point and start looking at what these lines of code do. There is a lot of stuff in Stata that I still don't know, hehehe. Thanks again!!!
        Last edited by Doris Rivera; 23 Dec 2022, 02:12.
