Hello,
I am trying to match Compustat data with patent data by following this guide: http://users.nber.org/~jbessen/matchdoc.pdf
Has anyone done this before?
Currently I have a file with financial data sorted by gvkey and year.
I have also downloaded the pdpcohdr and dynass files from this site: https://sites.google.com/site/patent...Home/downloads
I downloaded the patent data file from http://elsa.berkeley.edu/pub/users/bhhall/NBER06.html.
The matchdoc.pdf file contains Stata code to merge the two data sets.
Two questions:
1) Looking at the patent data file above, I have multiple pdpasses and IPCs for each patent. For example:
year patent icl pdpass
1974 3930732 G01B 1500 10030734
1974 3930732 G01B 900 10030734
I want to create the variable npat, which contains the number of patents for each pdpass-year. In the example, it should count as 1 patent for pdpass-year 10030734-1974. How do I create this in Stata?
2) The Stata example code says:
* now find the appropriate gvkey to assign the patents
gen gvkey=.
forvalue i=1/5 {
replace gvkey = gvkey`i' if gvkey`i'~=. & year>=begyr`i' &
year<=endyr`i'
}
When I run this code, I get the error:
invalid syntax
r(198);
I have the variables gvkey1 to gvkey5, begyr1 to begyr5, and endyr1 to endyr5 (all from the dynass file), and I have renamed appyear to year.
What could be wrong here?
Thanks in advance!
Best,
MCG