Calculating Pecentage of Entering and Exiting Persons in Groups

Svenja Marie

Join Date: Apr 2017
Posts: 5

Calculating Pecentage of Entering and Exiting Persons in Groups

05 Apr 2017, 06:50

Dear all, I am new to STATA (IC14.2).
I want to calculate a variable, that gives me the percentage of entering people & exiting people of a group.

For example:
I have comapny ABC inc. and for this company I have several years (e.g.2000-2010).
In every year for this company, I have e.g. 6, sometimes 7 executives or group members .

Now I want to compare year 2000 (e.g. 6 team member names) with year 2001 (6 or 7 or 5 members, can change depeneding on comapny ).
I want to check the percentage of new names in 2001 and of names, that are no longer on the team in 2001!
Then I want to repeat this process, comparing years 2001 and 2002, than compare 2002 and 2003 and so forth.
Next same process with company DEF inc. etc.

(My Stata Dataset has over 30.000 lines) and I want to do a command that calculates the number of percentages of exiting and entering persons (new names or new personid's / or missing person names in the following year for each company and each year)

I hope I could explain it good enough

Can anyone please help me?

year	company	compid	personname	personid
2000	ABC inc.	1234	Anderson	24364
2000	ABC inc.	1234	Hemp	22880
2000	ABC inc.	1234	Davis	3879
2000	ABC inc.	1234	Switz	10203
2000	ABC inc.	1234	Cadogan	6
2000	ABC inc.	1234	Ford	913
2000	ABC inc.	1234	Barker	24362
2000	ABC inc.	1234	Sobti	20624
2001	ABC inc.	1234	O'Brien	24363
2001	ABC inc.	1234	Hemp	22880
2001	ABC inc.	1234	Cadogan	6
2001	ABC inc.	1234	Anderson	24364
2001	ABC inc.	1234	Davis	3879
2001	ABC inc.	1234	Sniff	25854
2001	ABC inc.	1234	Switz	10203
2001	ABC inc.	1234	Barker	24362
2001	ABC inc.	1234	Ford	913
2001	ABC inc.	1234	Roscitt	22879
2000	DEF inc.	2184	London	24784
2000	DEF inc.	2184	Miller	6175
2000	DEF inc.	2184	Fenn	6177
2000	DEF inc.	2184	Freeland	23443
2000	DEF inc.	2184	Schoonover	24785
2000	DEF inc.	2184	Lenzmeier	6176
2000	DEF inc.	2184	Keskey	15375
2000	DEF inc.	2184	Walden	21726
2000	DEF inc.	2184	Schulze	126
2001	DEF inc.	2184	Fenn	6177
2001	DEF inc.	2184	London	24784
2001	DEF inc.	2184	Keskey	15375
2001	DEF inc.	2184	Freeland	23443
2001	DEF inc.	2184	Lenzmeier	6176
2001	DEF inc.	2184	Jackson	13283
2001	DEF inc.	2184	Schoonover	24785
2001	DEF inc.	2184	Miller	6175
2001	DEF inc.	2184	Schulze	126

Tags: None

Clyde Schechter

Join Date: Apr 2014
Posts: 30354

05 Apr 2017, 09:33

So I think the following code does what you want:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int year str8 company int compid str10 personname int personid
2000 "ABC inc." 1234 "Anderson"   24364
2000 "ABC inc." 1234 "Hemp"       22880
2000 "ABC inc." 1234 "Davis"       3879
2000 "ABC inc." 1234 "Switz"      10203
2000 "ABC inc." 1234 "Cadogan"        6
2000 "ABC inc." 1234 "Ford"         913
2000 "ABC inc." 1234 "Barker"     24362
2000 "ABC inc." 1234 "Sobti"      20624
2001 "ABC inc." 1234 "O'Brien"    24363
2001 "ABC inc." 1234 "Hemp"       22880
2001 "ABC inc." 1234 "Cadogan"        6
2001 "ABC inc." 1234 "Anderson"   24364
2001 "ABC inc." 1234 "Davis"       3879
2001 "ABC inc." 1234 "Sniff"      25854
2001 "ABC inc." 1234 "Switz"      10203
2001 "ABC inc." 1234 "Barker"     24362
2001 "ABC inc." 1234 "Ford"         913
2001 "ABC inc." 1234 "Roscitt"    22879
2000 "DEF inc." 2184 "London"     24784
2000 "DEF inc." 2184 "Miller"      6175
2000 "DEF inc." 2184 "Fenn"        6177
2000 "DEF inc." 2184 "Freeland"   23443
2000 "DEF inc." 2184 "Schoonover" 24785
2000 "DEF inc." 2184 "Lenzmeier"   6176
2000 "DEF inc." 2184 "Keskey"     15375
2000 "DEF inc." 2184 "Walden"     21726
2000 "DEF inc." 2184 "Schulze"      126
2001 "DEF inc." 2184 "Fenn"        6177
2001 "DEF inc." 2184 "London"     24784
2001 "DEF inc." 2184 "Keskey"     15375
2001 "DEF inc." 2184 "Freeland"   23443
2001 "DEF inc." 2184 "Lenzmeier"   6176
2001 "DEF inc." 2184 "Jackson"    13283
2001 "DEF inc." 2184 "Schoonover" 24785
2001 "DEF inc." 2184 "Miller"      6175
2001 "DEF inc." 2184 "Schulze"      126
end

by compid personid (year), sort: gen byte new = _n == 1
by compid personid (year): gen byte exiting = _n == _N
sort year compid

Now, there is some ambiguity about the variable new in year 2000. Since that is the first year in the data, everybody present in 2000 is marked as new. But you don't really know if they are entering in 2000 or were there before. So you might want to do something like -replace new = . if year == 2000-. Or you might want to work with it the way it is now. It really depends on what your research goals are.

Additionally, in the future please use -dataex- to post example data, as I have done above. While the particular table you set out in your post imported easily enough into Stata, often they can be very difficult to work with. Run -ssc install dataex- to install the -dataex- command and then run -help dataex- to read the simple instructions for using it. By using -dataex- you enable those who want to help you to create a completely faithful replica of your example data in Stata with just a simple copy/paste operation.

Added: The code I showed would not identify the situation where somebody leaves and then returns later. If that situation exists in your data and needs to be dealt with, post back and I can modify the code accordingly.

Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

05 Apr 2017, 09:57

You can lag each observation by one year and merge back with the original data for find out precisely who is in both, who's new, and who left.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int year str8 company int compid str10 personname int personid
2000 "ABC inc." 1234 "Anderson"   24364
2000 "ABC inc." 1234 "Hemp"       22880
2000 "ABC inc." 1234 "Davis"       3879
2000 "ABC inc." 1234 "Switz"      10203
2000 "ABC inc." 1234 "Cadogan"        6
2000 "ABC inc." 1234 "Ford"         913
2000 "ABC inc." 1234 "Barker"     24362
2000 "ABC inc." 1234 "Sobti"      20624
2001 "ABC inc." 1234 "O'Brien"    24363
2001 "ABC inc." 1234 "Hemp"       22880
2001 "ABC inc." 1234 "Cadogan"        6
2001 "ABC inc." 1234 "Anderson"   24364
2001 "ABC inc." 1234 "Davis"       3879
2001 "ABC inc." 1234 "Sniff"      25854
2001 "ABC inc." 1234 "Switz"      10203
2001 "ABC inc." 1234 "Barker"     24362
2001 "ABC inc." 1234 "Ford"         913
2001 "ABC inc." 1234 "Roscitt"    22879
2000 "DEF inc." 2184 "London"     24784
2000 "DEF inc." 2184 "Miller"      6175
2000 "DEF inc." 2184 "Fenn"        6177
2000 "DEF inc." 2184 "Freeland"   23443
2000 "DEF inc." 2184 "Schoonover" 24785
2000 "DEF inc." 2184 "Lenzmeier"   6176
2000 "DEF inc." 2184 "Keskey"     15375
2000 "DEF inc." 2184 "Walden"     21726
2000 "DEF inc." 2184 "Schulze"      126
2001 "DEF inc." 2184 "Fenn"        6177
2001 "DEF inc." 2184 "London"     24784
2001 "DEF inc." 2184 "Keskey"     15375
2001 "DEF inc." 2184 "Freeland"   23443
2001 "DEF inc." 2184 "Lenzmeier"   6176
2001 "DEF inc." 2184 "Jackson"    13283
2001 "DEF inc." 2184 "Schoonover" 24785
2001 "DEF inc." 2184 "Miller"      6175
2001 "DEF inc." 2184 "Schulze"      126
end
save "master.dta", replace

* check that year company personid uniquely identify observations
isid year company personid, sort

* lag data by one year
keep year company personid personname
replace year = year - 1
rename personname Lpersonname
drop if year == year[1]
save "lagged.dta", replace

use "master.dta"
merge 1:1 year company personid using "lagged.dta"
sort year company personid

* calculate totals 
by year company: egen inboth = total(_merge == 3)
by year company: egen previous_only = total(_merge == 2)
by year company: egen current_only = total(_merge == 1)

Comment

Svenja Marie

Join Date: Apr 2017

Posts: 5
#4

06 Apr 2017, 01:38

Dear Clyde, dear Robert,
thank you very much for your quick responses and help!!
I first tried to implement Clyde’s command code by doing the following:
My research starts in the year 2000 – but I do have the team composition data for all companies of interest for the year 1999, too!
So I appended team compositions of the year 1999 for all companies to my main-dataset.
Then I executed the following commands:
use maindataset.dta
append using 1999er_teams

by compid personname (year), sort: gen byte new = _n == 1
by compid personname (year): gen byte exiting = _n == _N
sort comp fiscyear
drop if fiscyear==1999

As I don’t need the year 1999 I just dropped it afterwards 😊

A point that came to my mind about the variables new and exiting was:
New shows a 1 in the respective year the new name appears the first time which is totally fine!

But Exiting shows a 1 in the year the name is there the last time (but in that year the person is still there and might not be exiting.) The 1 should be seen as belonging to the following year, because the name disappears the next year and not the current year where the person is there for the last time.

I am not sure if this might be an issue.
Maybe the variable Exiting should be lagged one year - but how can this be done when there is no name to assign a "1" to in the next year? (Is there a formula that would do this for me?)
If not, I will leave it like that.

So now I almost have what I wanted.
In a last step, I need to calculate a variable called fluctuation or turnover.
I want to calculate it as a percentage of the “1s” summed up over the 2 variables “new & exiting” and divided by the respective team sizes in this group.

year new exiting teamsize sum new&exiting fluctuation

2000 1 0 6 3 0.5

2000 0 0 6 3 0.5

2000 0 0 6 3 0.5

2000 0 0 6 3 0.5

2000 1 0 6 3 0.5

2000 0 1 6 3 0.5

(I installed dataex but I didn't know how to convert this table here, so I hope it is ok for this time)

Again, Thank You both very much for your time and help - I know my questions may be really low-level - but I am so glad, that you help me anyway!
I really appreciate that!
Comment
Svenja Marie

Join Date: Apr 2017

Posts: 5
#5

07 Apr 2017, 00:32

Originally posted by Clyde Schechter View Post

So I think the following code does what you want:

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input int year str8 company int compid str10 personname int personid 2000 "ABC inc." 1234 "Anderson" 24364 2000 "ABC inc." 1234 "Hemp" 22880 2000 "ABC inc." 1234 "Davis" 3879 2000 "ABC inc." 1234 "Switz" 10203 2000 "ABC inc." 1234 "Cadogan" 6 2000 "ABC inc." 1234 "Ford" 913 2000 "ABC inc." 1234 "Barker" 24362 2000 "ABC inc." 1234 "Sobti" 20624 2001 "ABC inc." 1234 "O'Brien" 24363 2001 "ABC inc." 1234 "Hemp" 22880 2001 "ABC inc." 1234 "Cadogan" 6 2001 "ABC inc." 1234 "Anderson" 24364 2001 "ABC inc." 1234 "Davis" 3879 2001 "ABC inc." 1234 "Sniff" 25854 2001 "ABC inc." 1234 "Switz" 10203 2001 "ABC inc." 1234 "Barker" 24362 2001 "ABC inc." 1234 "Ford" 913 2001 "ABC inc." 1234 "Roscitt" 22879 2000 "DEF inc." 2184 "London" 24784 2000 "DEF inc." 2184 "Miller" 6175 2000 "DEF inc." 2184 "Fenn" 6177 2000 "DEF inc." 2184 "Freeland" 23443 2000 "DEF inc." 2184 "Schoonover" 24785 2000 "DEF inc." 2184 "Lenzmeier" 6176 2000 "DEF inc." 2184 "Keskey" 15375 2000 "DEF inc." 2184 "Walden" 21726 2000 "DEF inc." 2184 "Schulze" 126 2001 "DEF inc." 2184 "Fenn" 6177 2001 "DEF inc." 2184 "London" 24784 2001 "DEF inc." 2184 "Keskey" 15375 2001 "DEF inc." 2184 "Freeland" 23443 2001 "DEF inc." 2184 "Lenzmeier" 6176 2001 "DEF inc." 2184 "Jackson" 13283 2001 "DEF inc." 2184 "Schoonover" 24785 2001 "DEF inc." 2184 "Miller" 6175 2001 "DEF inc." 2184 "Schulze" 126 end by compid personid (year), sort: gen byte new = _n == 1 by compid personid (year): gen byte exiting = _n == _N sort year compid

Now, there is some ambiguity about the variable new in year 2000. Since that is the first year in the data, everybody present in 2000 is marked as new. But you don't really know if they are entering in 2000 or were there before. So you might want to do something like -replace new = . if year == 2000-. Or you might want to work with it the way it is now. It really depends on what your research goals are.

Additionally, in the future please use -dataex- to post example data, as I have done above. While the particular table you set out in your post imported easily enough into Stata, often they can be very difficult to work with. Run -ssc install dataex- to install the -dataex- command and then run -help dataex- to read the simple instructions for using it. By using -dataex- you enable those who want to help you to create a completely faithful replica of your example data in Stata with just a simple copy/paste operation.

Added: The code I showed would not identify the situation where somebody leaves and then returns later. If that situation exists in your data and needs to be dealt with, post back and I can modify the code accordingly.

Dear Clyde, dear Robert,
thank you very much for your quick responses and help!!
I first tried to implement Clyde’s command code by doing the following:
My research starts in the year 2000 – but I do have the team composition data for all companies of interest for the year 1999, too!
So I appended team compositions of the year 1999 for all companies to my main-dataset.
Then I executed the following commands:
use maindataset.dta
append using 1999er_teams

by compid personname (year), sort: gen byte new = _n == 1
by compid personname (year): gen byte exiting = _n == _N
sort comp fiscyear
drop if fiscyear==1999

As I don’t need the year 1999 I just dropped it afterwards 😊

A point that came to my mind about the variables new and exiting was:
New shows a 1 in the respective year the new name appears the first time which is totally fine!

But Exiting shows a 1 in the year the name is there the last time (but in that year the person is still there and might not be exiting.) The 1 should be seen as belonging to the following year, because the name disappears the next year and not the current year where the person is there for the last time.

I am not sure if this might be an issue.
Maybe the variable Exiting should be lagged one year - but how can this be done when there is no name to assign a "1" to in the next year? (Is there a formula that would do this for me?)
If not, I will leave it like that.

So now I almost have what I wanted.
In a last step, I need to calculate a variable called fluctuation or turnover.
I want to calculate it as a percentage of the “1s” summed up over the 2 variables “new & exiting” and divided by the respective team sizes in this group.

year new exiting teamsize sum new&exiting fluctuation

2000 1 0 6 3 0.5

2000 0 0 6 3 0.5

2000 0 0 6 3 0.5

2000 0 0 6 3 0.5

2000 1 0 6 3 0.5

2000 0 1 6 3 0.5

(I installed dataex but I didn't know how to convert this table here, so I hope it is ok for this time)

Again, Thank You both very much for your time and help - I know my questions may be really low-level - but I am so glad, that you help me anyway!
I really appreciate that!
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30354
#6

07 Apr 2017, 08:32

Well, -gen fluctuation = (new + exiting)/teamsize-; but I think you knew that and that isn't what you mean to ask. I guess the question is how to create the team size variable. And I'm not sure how to answer that question based on your data. Is it the total number of distinct personid's within a company across all years? If so:

Code:

by compid personid, sort: gen distinct_persons = 1 if _n == 1 by compid: replace distinct_persons = sum(distinct_persons) by compid: gen teamsize = distinct_persons[_N]
Comment

Svenja Marie

Join Date: Apr 2017
Posts: 5

09 Apr 2017, 01:31

Originally posted by Clyde Schechter View Post

Well, -gen fluctuation = (new + exiting)/teamsize-; but I think you knew that and that isn't what you mean to ask. I guess the question is how to create the team size variable. And I'm not sure how to answer that question based on your data. Is it the total number of distinct personid's within a company across all years? If so:

Code:

by compid personid, sort: gen distinct_persons = 1 if _n == 1
by compid: replace distinct_persons = sum(distinct_persons)
by compid: gen teamsize = distinct_persons[_N]

Dear Clyde,

thank you for your response!
The Teamsize I already calculated with the following formular:

*************************************
* *
* Variable Team Size *
* *
*************************************
by compid fiscyear, sort: generate teamsize= _N

This worked out for me when I checked everything with the browse-window.

What I want to do now is:
For a company and a respective fiscalyear - sum up all exits and entries (exiting+new) and divide this sum by the teamsizevariable.
If you look at the table here you cann see what I mean.
I used your formula: by company fiscyear, sort: gen turnover = (new+exiting)/teamsize and this gave me the variable "turnover WRONG" (just to show you here)
What I want is the variable with an output like in "turnover CORRECT" - summing up all new&exiting for one company in one fiscal year and then divide by the respective teamsize in this year, which is of course the same number in every line of that year (paneldata)

company	fiscyear	execname	new	exiting	teamsize	turnover WRONG	turnover CORRECT
ABC inc.	2000	Mr.A	0	0	8	0	0,375
ABC inc.	2000	Mr. B	0	0	8	0	0,375
ABC inc.	2000	Mr. D	0	0	8	0	0,375
ABC inc.	2000	Mr. E	1	0	8	.125	0,375
ABC inc.	2000	Mr. F	0	0	8	0	0,375
ABC inc.	2000	Mr. C	1	0	8	.125	0,375
ABC inc.	2000	Mr. G	0	1	8	.125	0,375
ABC inc.	2000	Mr. H	0	0	8	0	0,375
ABC inc.	2001	Mr. I	1	0	10	.1	0,7
ABC inc.	2001	Mr.A	0	1	10	.1	0,7
ABC inc.	2001	Mr. H	0	1	10	.1	0,7
ABC inc.	2001	Mr. J	1	0	10	.1	0,7
ABC inc.	2001	Mr. F	0	1	10	.1	0,7
ABC inc.	2001	Mr. E	0	0	10	0	0,7
ABC inc.	2001	Mr. D	0	0	10	0	0,7
ABC inc.	2001	Mr. C	0	0	10	0	0,7
ABC inc.	2001	Mr. K	1	0	10	.1	0,7
ABC inc.	2001	Mr. B	0	1	10	.1	0,7

Thank you so much for helping me out!
Kind regards,
Svenja

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30354
#8

09 Apr 2017, 09:56

Code:

by company fiscyear, sort: egen turnover = total(new+exiting) replace turnover = turnover/teamsize

Do read the manual section on -egen-. It is bristling with useful data management functions that will help you in your work with Stata on a regular basis.
Comment
Svenja Marie

Join Date: Apr 2017

Posts: 5
#9

10 Apr 2017, 02:15

Originally posted by Clyde Schechter View Post

Code:

by company fiscyear, sort: egen turnover = total(new+exiting) replace turnover = turnover/teamsize

Do read the manual section on -egen-. It is bristling with useful data management functions that will help you in your work with Stata on a regular basis.

Thank you very much Clyde!
This worked out perfectly for me and solved my problem!

I will read this manual section.

Have a nice day
Comment
Demet Korkut

Join Date: Aug 2018

Posts: 16
#10

19 Sep 2022, 16:06

Originally posted by Svenja Marie View Post

Thank you very much Clyde!
This worked out perfectly for me and solved my problem!

I will read this manual section.

Have a nice day

Hi, this code perfectly solves the problem with the given data set. But, when we add one more year (say 2002) and some of the exiting people reenter into the team in this fiscal year(2002), there occurs a problem with these people in recording them as "new" this year. How can modify the code with alternative scenario?

Demet Korkut
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30354
#11

19 Sep 2022, 18:43

Hi, this code perfectly solves the problem with the given data set. But, when we add one more year (say 2002) and some of the exiting people reenter into the team in this fiscal year(2002), there occurs a problem with these people in recording them as "new" this year. How can modify the code with alternative scenario?

Please post a new example data set (using -dataex-, of course) that illustrates this problem. And elaborate what your concern is. When some of the exiting people reenter, why would you not count them as "new" for the purposes of calculating turnover? And what would you do with them instead? What if they returned 2 years later? Or 5? Or 10?....
Comment
Demet Korkut

Join Date: Aug 2018

Posts: 16
#12

20 Sep 2022, 07:21

Originally posted by Clyde Schechter View Post

Please post a new example data set (using -dataex-, of course) that illustrates this problem. And elaborate what your concern is. When some of the exiting people reenter, why would you not count them as "new" for the purposes of calculating turnover? And what would you do with them instead? What if they returned 2 years later? Or 5? Or 10?....

I simply modified the data set in this post according my question:
clear
input int year str8 company int compid str10 personname int personid byte(new exiting)
2000 "ABC inc." 1234 "Ford" 913 1 0
2000 "ABC inc." 1234 "Switz" 10203 1 0
2000 "ABC inc." 1234 "Barker" 24362 1 0
2000 "ABC inc." 1234 "Hemp" 22880 1 0
2000 "ABC inc." 1234 "Cadogan" 6 1 0
2000 "ABC inc." 1234 "Anderson" 24364 1 0
2000 "ABC inc." 1234 "Davis" 3879 1 0
2000 "ABC inc." 1234 "Sobti" 20624 1 0
2001 "ABC inc." 1234 "Davis" 3879 0 0
2001 "ABC inc." 1234 "Sniff" 25854 1 1
2001 "ABC inc." 1234 "Roscitt" 22879 1 1
2001 "ABC inc." 1234 "O'Brien" 24363 1 1
2001 "ABC inc." 1234 "Switz" 10203 0 0
2001 "ABC inc." 1234 "Ford" 913 0 0
2001 "ABC inc." 1234 "Barker" 24362 0 1
2001 "ABC inc." 1234 "Hemp" 22880 0 0
2001 "ABC inc." 1234 "Anderson" 24364 0 0
2001 "ABC inc." 1234 "Cadogan" 6 0 1
2002 "ABC inc." 1234 "Anderson" 24364 0 1
2002 "ABC inc." 1234 "Sobti" 20624 0 1
2002 "ABC inc." 1234 "Switz" 10203 0 1
2002 "ABC inc." 1234 "Hemp" 22880 0 1
2002 "ABC inc." 1234 "Ford" 913 0 1
2002 "ABC inc." 1234 "B" 33 1 1
2002 "ABC inc." 1234 "A" 22 1 1
2002 "ABC inc." 1234 "Davis" 3879 0 1
2003 "ABC inc." 1234 "D" 234 1 1
2003 "ABC inc." 1234 "C" 21 1 1
2000 "DEF inc." 2184 "London" 24784 1 0
2000 "DEF inc." 2184 "Miller" 6175 1 0
2000 "DEF inc." 2184 "Freeland" 23443 1 0
2000 "DEF inc." 2184 "Schoonover" 24785 1 0
2000 "DEF inc." 2184 "Schulze" 126 1 0
2000 "DEF inc." 2184 "Fenn" 6177 1 0
2000 "DEF inc." 2184 "Keskey" 15375 1 0
2000 "DEF inc." 2184 "Walden" 21726 1 1
2000 "DEF inc." 2184 "Lenzmeier" 6176 1 0
2001 "DEF inc." 2184 "Keskey" 15375 0 1
2001 "DEF inc." 2184 "London" 24784 0 1
2001 "DEF inc." 2184 "Schulze" 126 0 1
2001 "DEF inc." 2184 "Freeland" 23443 0 0
2001 "DEF inc." 2184 "Jackson" 13283 1 0
2001 "DEF inc." 2184 "Lenzmeier" 6176 0 1
2001 "DEF inc." 2184 "Fenn" 6177 0 1
2001 "DEF inc." 2184 "Schoonover" 24785 0 0
2001 "DEF inc." 2184 "Miller" 6175 0 1
2003 "" 2184 "Schoonover" 24785 0 1
2003 "" 2184 "Jackson" 13283 0 1
2003 "" 2184 "Freeland" 23443 0 1
end
[/CODE](
I run the code you recommended above and obtained "new" and "exiting" variables. Now take the person name "Sobti" with Id "20624". In my dataset covering 2000-2003, he exit in 2000 (because dos not appear in 2001) and reenters in 2002 again. But the observation point in "new" and "exiting" variables do not correspond his situation. For the year 2000, "no exit" appears in this scenario, but actually it is.

Actually, my original dataset which I work on contain firms' exixtence and accounts within several sectors covering 2006-2020. I have already posted my sample dataset here (Statalist) with the title "Entry and exit rates of firms in various years" (or like that). For each year I want to calculate entry and exit rates ((enty+exit)/total number of firms). I wrote in this post because it is completely relevant with my purpose. With code I obtained a result that once a firm enters into the sector, it seems it operates till the last year (2020) even if it exits and reenter in various years between 20006-2020.

Demet
Comment

Demet Korkut

Join Date: Aug 2018
Posts: 16

#13

20 Sep 2022, 07:23

Sorry, I copied wrong in my last post. here is the relevant dataset:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int year str8 company int compid str10 personname int personid byte(new exiting)
2000 "ABC inc." 1234 "Ford"         913 1 0
2000 "ABC inc." 1234 "Switz"      10203 1 0
2000 "ABC inc." 1234 "Barker"     24362 1 0
2000 "ABC inc." 1234 "Hemp"       22880 1 0
2000 "ABC inc." 1234 "Cadogan"        6 1 0
2000 "ABC inc." 1234 "Anderson"   24364 1 0
2000 "ABC inc." 1234 "Davis"       3879 1 0
2000 "ABC inc." 1234 "Sobti"      20624 1 0
2001 "ABC inc." 1234 "Davis"       3879 0 0
2001 "ABC inc." 1234 "Sniff"      25854 1 1
2001 "ABC inc." 1234 "Roscitt"    22879 1 1
2001 "ABC inc." 1234 "O'Brien"    24363 1 1
2001 "ABC inc." 1234 "Switz"      10203 0 0
2001 "ABC inc." 1234 "Ford"         913 0 0
2001 "ABC inc." 1234 "Barker"     24362 0 1
2001 "ABC inc." 1234 "Hemp"       22880 0 0
2001 "ABC inc." 1234 "Anderson"   24364 0 0
2001 "ABC inc." 1234 "Cadogan"        6 0 1
2002 "ABC inc." 1234 "Anderson"   24364 0 1
2002 "ABC inc." 1234 "Sobti"      20624 0 1
2002 "ABC inc." 1234 "Switz"      10203 0 1
2002 "ABC inc." 1234 "Hemp"       22880 0 1
2002 "ABC inc." 1234 "Ford"         913 0 1
2002 "ABC inc." 1234 "B"             33 1 1
2002 "ABC inc." 1234 "A"             22 1 1
2002 "ABC inc." 1234 "Davis"       3879 0 1
2003 "ABC inc." 1234 "D"            234 1 1
2003 "ABC inc." 1234 "C"             21 1 1
2000 "DEF inc." 2184 "London"     24784 1 0
2000 "DEF inc." 2184 "Miller"      6175 1 0
2000 "DEF inc." 2184 "Freeland"   23443 1 0
2000 "DEF inc." 2184 "Schoonover" 24785 1 0
2000 "DEF inc." 2184 "Schulze"      126 1 0
2000 "DEF inc." 2184 "Fenn"        6177 1 0
2000 "DEF inc." 2184 "Keskey"     15375 1 0
2000 "DEF inc." 2184 "Walden"     21726 1 1
2000 "DEF inc." 2184 "Lenzmeier"   6176 1 0
2001 "DEF inc." 2184 "Keskey"     15375 0 1
2001 "DEF inc." 2184 "London"     24784 0 1
2001 "DEF inc." 2184 "Schulze"      126 0 1
2001 "DEF inc." 2184 "Freeland"   23443 0 0
2001 "DEF inc." 2184 "Jackson"    13283 1 0
2001 "DEF inc." 2184 "Lenzmeier"   6176 0 1
2001 "DEF inc." 2184 "Fenn"        6177 0 1
2001 "DEF inc." 2184 "Schoonover" 24785 0 0
2001 "DEF inc." 2184 "Miller"      6175 0 1
2003 ""         2184 "Schoonover" 24785 0 1
2003 ""         2184 "Jackson"    13283 0 1
2003 ""         2184 "Freeland"   23443 0 1
end

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30354
#14

20 Sep 2022, 09:35

So, I take it your question is how to fix the new and exiting variables because they do not currently reflect the actual entry and exit status.

Code:

drop new exiting xtset personid year gen byte new = L1.compid != compid gen byte exiting = F1.compid != compid
Comment
Demet Korkut

Join Date: Aug 2018

Posts: 16
#15

28 Sep 2022, 01:07

Originally posted by Clyde Schechter View Post

So, I take it your question is how to fix the new and exiting variables because they do not currently reflect the actual entry and exit status.

Code:

drop new exiting xtset personid year gen byte new = L1.compid != compid gen byte exiting = F1.compid != compid

Thank you, this code works well.
Comment

Announcement

Calculating Pecentage of Entering and Exiting Persons in Groups

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment