Hello,
I am working with the NBER Patents Database and am trying to construct the so-called Citation Index (CI), following Hirschey, M., Richardson, J.V. (2004), "Are scientific indicators of patent quality useful to investors?". CI is defined as "the number of citations generated in the current year by patents granted to the company during the most recent 3-year period", where the period includes the current year. I have a balanced panel; the data for one firm look like this:
id | patent_no | grant_year | year | citations |
10300 | 4146300 | 1985 | 1985 | 0 |
10300 | 4146300 | 1985 | 1986 | 2 |
10300 | 4146300 | 1985 | 1987 | 2 |
10300 | 4146300 | 1985 | 1988 | 7 |
10300 | 4146300 | 1985 | 1989 | 5 |
10300 | 4146301 | 1986 | 1985 | 0 |
10300 | 4146301 | 1986 | 1986 | 0 |
10300 | 4146301 | 1986 | 1987 | 4 |
10300 | 4146301 | 1986 | 1988 | 1 |
10300 | 4146301 | 1986 | 1989 | 3 |
where id is a unique identifier of the firm, patent_no is the patent number, grant_year is the year in which the Patent Office approved the firm's application for the patent, and citations is the number of citations received by the patent in the given year.
So I need to generate a variable CI, which in this particular case would look like this (assuming that firm 10300 obtained only 2 patents in its lifetime):
id | year | CI |
10300 | 1985 | 0 |
10300 | 1986 | 2 |
10300 | 1987 | 6 |
10300 | 1988 | 1 |
10300 | 1989 | 0 |
E.g., for year 1987, CI = 2 (because patent no. 4146300 was granted in 1985, which falls within the 3-year window 1985-1987) + 4 (because patent no. 4146301, granted in 1986, also falls within that window) = 6.
I can't figure out how to do this, so any help would be much appreciated. The actual dataset is large: year runs from 1975 to 2006, and there are almost 100 million patent_no-year observations.
Thank you!
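One possible reading of the definition, sketched in Python purely for illustration: for each firm-year, sum citations over the patents whose grant_year lies in the window [year - 2, year]. The function name citation_index and the window parameter are hypothetical, and the window boundaries are an assumption about what "most recent 3-year period, including current year" means. Under that assumption a patent granted in 1986 counts through 1988 and drops out in 1989. The rows are the sample panel from the post.

```python
from collections import defaultdict

def citation_index(rows, window=3):
    """Sum citations per (id, year) over patents granted in [year - window + 1, year].

    `rows` is an iterable of (id, patent_no, grant_year, year, citations) tuples.
    The window definition is an assumption, not taken from the cited paper.
    """
    ci = defaultdict(int)
    for firm_id, patent_no, grant_year, year, citations in rows:
        # Touch every firm-year so years with no eligible patent still get CI = 0.
        ci[(firm_id, year)] += 0
        # Count the row only if the patent is already granted and still in the window.
        if 0 <= year - grant_year < window:
            ci[(firm_id, year)] += citations
    return dict(ci)

# Sample panel from the post: (id, patent_no, grant_year, year, citations)
rows = [
    (10300, 4146300, 1985, 1985, 0),
    (10300, 4146300, 1985, 1986, 2),
    (10300, 4146300, 1985, 1987, 2),
    (10300, 4146300, 1985, 1988, 7),
    (10300, 4146300, 1985, 1989, 5),
    (10300, 4146301, 1986, 1985, 0),
    (10300, 4146301, 1986, 1986, 0),
    (10300, 4146301, 1986, 1987, 4),
    (10300, 4146301, 1986, 1988, 1),
    (10300, 4146301, 1986, 1989, 3),
]

ci = citation_index(rows)
for (firm_id, year), value in sorted(ci.items()):
    print(firm_id, year, value)
```

The loop makes a single pass and keeps only one running total per firm-year, so the rows could be streamed from disk rather than loaded at once, which may matter with close to 100 million observations. The same "keep rows inside the window, then sum by id and year" logic should carry over to whatever statistical package holds the data.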