Creating a new variables that counts number of patents per firm per year

Jasper van Houten

Join Date: Jun 2022

Posts: 1
#1

Creating a new variables that counts number of patents per firm per year

08 Jun 2022, 04:38

Hi all,

I am currently working on a dataset with different firms (name_std), different years (years_id), their granted patents (patent_id), and with a dummy variable (green_d) that equals 1 when the patent is a green patent.
What I now want is to generate a new variable that counts how many patents that a firm is granted in a given year is a green patent ( green_d = 1). Hence, it needs to count the number of green patents per year per firm.

Anyone that can help me with this, it would be much appreciated!
Tags: None
Nick Cox

Join Date: Mar 2014

Posts: 35751
#2

08 Jun 2022, 05:03

Could be something like

Code:

egen n_green = total(green_d), by(name years)
1 like
Comment
Giulia Fantini

Join Date: Jul 2023

Posts: 25
#3

19 Dec 2023, 03:42

What if you would like something like the table below. What would it be the command please?

Number of Patents Number of Green Patents Number of NoGreen Patents

Years Total Total Total

1926

1927

1928

1929
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17719
#4

19 Dec 2023, 03:49

Giulia:
see the -table- suite worked out examples in related entry of Stata .pdf manual (release =>17).

Kind regards,
Carlo
(Stata 19.0)
Comment
Giulia Fantini

Join Date: Jul 2023

Posts: 25
#5

20 Dec 2023, 12:13

Thanks Carlo. Unfortunately, I have not found an answer to my question.

I have created this table that shows the Number of green and non-green patents per year
egen not_green = total(scheme_code == 0), by (filing_year)
egen green = total(inrange(scheme_code,1,9)),by (filing_year)
tabdisp filing_year, c(green not_green)

Now, I need to add an extra column that shows how many companies filed these green patents and non-green patents.
To do that I have written:

egen n_firm_green_p = total(green), by(permno filing_year)
tabdisp filing_year, c(green not_green n_firm_green_p)

But it does not give me the correct outcome.

Where am I mistaking?

Can anyone help me, please?

Thanks!
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35751

20 Dec 2023, 14:36

Your code is double counting. For example, green is the total over each year of observations ofscheme that arenot 0. It is thus constant across perm_no. Pushing it through total() again won't subdivide; it just multiplies up by the number of observations with the same total in each group.

I guess you want something quite different, more like

Code:

gen is_green = inrange(scheme, 1, 9) if !missing(scheme) 
egen tag = tag(perm_no filing_year is_green)
egen wanted = total(tag), by(is_green filing_year)

Here's a toy dataset and results to discuss.

Code:


	Code:
	* Example generated by -dataex-. For more info, type help dataex
clear
input float (filing_year perm_no scheme)
1 1 0
1 2 1
1 3 2
1 3 3
1 4 4
2 1 0
2 1 1
2 1 2
2 1 3
2 2 4
end


gen is_green = inrange(scheme, 1, 9) if !missing(scheme) 
egen tag = tag(perm_no filing_year is_green)
egen wanted = total(tag), by(is_green filing_year)

tabdisp filing_year is_green, c(wanted)

----------------------
filing_ye |  is_green 
ar        |    0     1
----------+-----------
        1 |    1     3
        2 |    1     2
----------------------

Comment

Giulia Fantini

Join Date: Jul 2023

Posts: 25
#7

21 Dec 2023, 01:56

Thanks a lot Nick!

Now I have to determine the number of patents number of firms, citations , avrage Xi_nominal,average Xi_real per year (total at the end)

To compute how many patents per company I have per year I just add together by filing_year the total amount under 0 and 1 for is_green (see your example above.

bys permno filing_year: egen n_citations = sum(cites) // I computed how many citation per company I have per year
bys permno filing_year: egen avg_xi_nominal = mean(xi_nominal) // I have computed the avergae per company per year of xi_nominal
bys permno filing_year: egen avg_xi_real = mean(xi_real) // I have computed the avergae per company per year of xi_nominal
sort permno filing_year
order patent_num permno filing_year n_patents cites n_citations xi_nominal avg_xi_nominal xi_real avg_xi_real cpc scheme_code cites filing_date issue_date issue_year
tabdisp filing_year, c( n_citations avg_xi_nominal avg_xi_real)

Is this correct?

Thanks
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35751
#8

21 Dec 2023, 04:02

I suggest the method used in #6. Work with a tiny subset of your dataset for which you can verify independently that the results are correct.
Comment

	Number of Patents	Number of Green Patents	Number of NoGreen Patents
Years	Total	Total	Total
1926
1927
1928
1929

Announcement

Creating a new variables that counts number of patents per firm per year

Comment

Comment

Comment

Comment

Comment

Comment

Comment