Calculating Cumulative H-index with Stata?

Hyeonjin Cha

14 Feb 2020, 18:31

Hello Statalisters,

I've been trying to calculate the h-index for a large dataset of scientists. The h-index is defined as the maximum value of h such that the given author/journal has published h papers that have each been cited at least h times. The dataset looks somewhat like this:

authorid	year	articleid	citation	hindex	c_hindex	t_hindex
A	1990	1	7	5	5	15
A	1990	2	5	5	5	15
A	1990	3	13	5	5	15
A	1990	4	12	5	5	15
A	1990	5	17	5	5	15
A	1991	6	11	4	7	15
A	1991	7	9	4	7	15
A	1991	8	19	4	7	15
A	1991	9	15	4	7	15
A	1992	10	14	3	9	15
A	1992	11	4	3	9	15
A	1992	12	3	3	9	15
A	1992	13	7	3	9	15
A	1992	14	5	3	9	15
A	1992	15	4	3	9	15
A	1992	16	11	3	9	15
A	1992	17	17	3	9	15
A	1993	18	15	4		15
A	1993	19	17	4		15
A	1993	20	18	4		15
A	1993	21	11	4		15
A	1994	22	3			15
A	1994	23	15			15
A	1994	24	14			15
A	1994	25	17			15
A	1994	26	13			15
A	1994	27	12			15
A	1994	28	6			15
A	1994	29	15			15
A	1994	30	5			15
B	1990	31	11
B	1991	32	11
B	1991	33	4
B	1991	34	4
B	1991	35	3
B	1992	36	9
B	1992	37	22
B	1992	38	2
B	1992	39	9
B	1992	40	4
B	1992	41	37
B	1992	42	9
B	1992	43	8
B	1992	44	3
B	1993	45	13
B	1993	46	9
B	1993	47	7
B	1993	48	3
B	1993	49	10
B	1993	50	9
B	1994	51	1
B	1994	52	2
B	1994	53	6
B	1994	54	6
B	1994	55	7

With a little help from the Stata forum (https://www.stata.com/statalist/arch.../msg00625.html), I was able to calculate the h-index for each author-year (hindex, column 5) using the following commands:

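* generate h-index for each author-year (flow)

* rank papers within each author-year, most cited first (ties get distinct ranks)
bysort authorid year : egen rank = rank(-citation), unique
* keep the largest rank among papers cited at least as many times as their rank
by authorid year : egen hindextemp = max(rank) if citation >= rank
* spread that value to every row of the author-year
bysort authorid year : egen hindex = max(hindextemp)
drop rank hindextemp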

What I'm having a hard time with is calculating the cumulative h-index for each author-year (c_hindex, column 6), that is, the h-index over all articles the author has published up to and including that year. For instance, of the 9 articles author A published through 1991, 7 have each been cited at least 7 times, so the cumulative h-index for A in 1991 is 7.
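To make the target value concrete, here is a rough sketch of how the 7 for author A in 1991 can be checked by hand. It is only an illustration for a single author-year, not a general solution: it re-enters a few of A's rows from the listing above with input, and pos is just a helper variable name I made up for the check.

```stata
* Sketch only: check the cumulative h-index for author A through 1991.
* The rows below are copied from the listing above (A, 1990-1991, plus two 1992 rows).
clear
input str1 authorid int year int citation
A 1990 7
A 1990 5
A 1990 13
A 1990 12
A 1990 17
A 1991 11
A 1991 9
A 1991 19
A 1991 15
A 1992 14
A 1992 4
end

keep if year <= 1991          // cumulative: everything published up to 1991
gsort -citation               // most cited paper first
gen long pos = _n             // position in the descending order (hypothetical helper)
count if citation >= pos      // papers cited at least as often as their position
display "cumulative h-index for A through 1991 = " r(N)
```

Because the citations are sorted in descending order, the rows satisfying citation >= pos always form an unbroken run from the top, so counting them gives the h-index for that subset. What I cannot work out is how to do this for every author-year at once.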

Could anybody help me with the commands for the cumulative h-index?

Thank you very much in advance!

Hyeonjin

Last edited by Hyeonjin Cha; 14 Feb 2020, 18:33.
