  • Merging US PatentsView data with IPC codes: multiple IPC codes for one patent_id

    I want to calculate the radicalness of patents, so I am working with backward citations. To do this I am using the OECD radicalness index, which is shown in the figure below. The radicalness of a patent is calculated by dividing the number of IPC codes by the number of all citations.
    [Attached image: radicalness.jpg]

    I am working with the IPC classification data from PatentsView, and I want to merge it with the PatentsView US patent citation data to calculate the radicalness. The problem is that the IPC dataset contains multiple IPC codes for one patent_id, so I cannot merge it with the US patent citation dataset. I need one IPC code for every patent. How can I achieve this? The picture below shows part of my IPC dataset and illustrates the problem; I have already dropped the variables that are not needed in the analysis. Once I can merge this dataset with the US patent citation data, I can calculate the radicalness.

    Thanks in advance.

    [Attached image: WhatsApp Image 2021-05-14 at 17.51.40.jpeg]

  • #2
    Could you expand on this question by telling us (i) what the other data set looks like, and (ii) what the resulting data would look like if you were able to merge them successfully?

    • #3
      The other dataset is the PatentsView US patent citation data, so it records which patent is cited by which patent. If I am able to merge them, I want to calculate the radicalness by counting the number of times a patent (citation_id) is cited by other patents (patent_id). The IPC codes are used to count how many different IPC codes those citing patents come from. For example, if the patent (citation_id) is cited by 20 patents from 10 different IPC codes, its radicalness is 0.5.
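      The calculation described here could be sketched roughly as follows. This assumes the citation and IPC data have already been combined so that each row is one (citation_id, patent_id, IPC code) combination; ipc_code is a hypothetical variable name, not necessarily the one in the PatentsView file.

      ```stata
      * Assumes one row per (citation_id, patent_id, ipc_code) combination.
      * ipc_code is a hypothetical name for the IPC code variable.

      * Flag one row per distinct citing patent within each cited patent
      egen tag_pat = tag(citation_id patent_id)

      * Flag one row per distinct IPC code within each cited patent
      egen tag_ipc = tag(citation_id ipc_code)

      * Count distinct citing patents and distinct IPC codes per cited patent
      bysort citation_id: egen n_citing = total(tag_pat)
      bysort citation_id: egen n_ipc = total(tag_ipc)

      * Ratio as described above (20 citing patents, 10 codes gives 0.5)
      generate radicalness = n_ipc / n_citing
      ```

      Citing patents that have no IPC code in the combined data would not be counted in n_citing, so the denominator may be smaller than the raw number of citations.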
      [Attached image: stata us citation data.PNG]

      • #4
        For the second data set, shown in #3, if you run:

        Code:
        duplicates report patent_id
        do you see any duplicate observations? If not, I believe an m:1 merge may work.
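        A minimal sketch of that check and the merge, assuming the two data sets are saved under the hypothetical names ipc.dta and citation.dta. If patent_id turns out to repeat in the citation data as well, m:1 will not apply; joinby, which forms all pairwise combinations within patent_id, is one possible alternative.

        ```stata
        * ipc.dta and citation.dta are hypothetical file names.
        use citation.dta, clear
        duplicates report patent_id

        * isid exits with an error if patent_id does not uniquely identify rows
        capture isid patent_id

        if _rc == 0 {
            * patent_id is unique in the citation data: many IPC rows to one
            use ipc.dta, clear
            merge m:1 patent_id using citation.dta
        }
        else {
            * patent_id repeats in both files: pair every IPC row with every
            * citation row of the same patent_id (unmatched rows are dropped)
            use ipc.dta, clear
            joinby patent_id using citation.dta
        }
        ```

        Note that joinby can produce a much larger data set than either input, since each patent contributes (number of IPC codes) x (number of citation rows) observations.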
