Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating Pecentage of Entering and Exiting Persons in Groups

    Dear all, I am new to STATA (IC14.2).
    I want to calculate a variable, that gives me the percentage of entering people & exiting people of a group.

    For example:
    I have comapny ABC inc. and for this company I have several years (e.g.2000-2010).
    In every year for this company, I have e.g. 6, sometimes 7 executives or group members .

    Now I want to compare year 2000 (e.g. 6 team member names) with year 2001 (6 or 7 or 5 members, can change depeneding on comapny ).
    I want to check the percentage of new names in 2001 and of names, that are no longer on the team in 2001!
    Then I want to repeat this process, comparing years 2001 and 2002, than compare 2002 and 2003 and so forth.
    Next same process with company DEF inc. etc.

    (My Stata Dataset has over 30.000 lines) and I want to do a command that calculates the number of percentages of exiting and entering persons (new names or new personid's / or missing person names in the following year for each company and each year)

    I hope I could explain it good enough
    Can anyone please help me?
    year company compid personname personid
    2000 ABC inc. 1234 Anderson 24364
    2000 ABC inc. 1234 Hemp 22880
    2000 ABC inc. 1234 Davis 3879
    2000 ABC inc. 1234 Switz 10203
    2000 ABC inc. 1234 Cadogan 6
    2000 ABC inc. 1234 Ford 913
    2000 ABC inc. 1234 Barker 24362
    2000 ABC inc. 1234 Sobti 20624
    2001 ABC inc. 1234 O'Brien 24363
    2001 ABC inc. 1234 Hemp 22880
    2001 ABC inc. 1234 Cadogan 6
    2001 ABC inc. 1234 Anderson 24364
    2001 ABC inc. 1234 Davis 3879
    2001 ABC inc. 1234 Sniff 25854
    2001 ABC inc. 1234 Switz 10203
    2001 ABC inc. 1234 Barker 24362
    2001 ABC inc. 1234 Ford 913
    2001 ABC inc. 1234 Roscitt 22879
    2000 DEF inc. 2184 London 24784
    2000 DEF inc. 2184 Miller 6175
    2000 DEF inc. 2184 Fenn 6177
    2000 DEF inc. 2184 Freeland 23443
    2000 DEF inc. 2184 Schoonover 24785
    2000 DEF inc. 2184 Lenzmeier 6176
    2000 DEF inc. 2184 Keskey 15375
    2000 DEF inc. 2184 Walden 21726
    2000 DEF inc. 2184 Schulze 126
    2001 DEF inc. 2184 Fenn 6177
    2001 DEF inc. 2184 London 24784
    2001 DEF inc. 2184 Keskey 15375
    2001 DEF inc. 2184 Freeland 23443
    2001 DEF inc. 2184 Lenzmeier 6176
    2001 DEF inc. 2184 Jackson 13283
    2001 DEF inc. 2184 Schoonover 24785
    2001 DEF inc. 2184 Miller 6175
    2001 DEF inc. 2184 Schulze 126

  • #2
    So I think the following code does what you want:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year str8 company int compid str10 personname int personid
    2000 "ABC inc." 1234 "Anderson"   24364
    2000 "ABC inc." 1234 "Hemp"       22880
    2000 "ABC inc." 1234 "Davis"       3879
    2000 "ABC inc." 1234 "Switz"      10203
    2000 "ABC inc." 1234 "Cadogan"        6
    2000 "ABC inc." 1234 "Ford"         913
    2000 "ABC inc." 1234 "Barker"     24362
    2000 "ABC inc." 1234 "Sobti"      20624
    2001 "ABC inc." 1234 "O'Brien"    24363
    2001 "ABC inc." 1234 "Hemp"       22880
    2001 "ABC inc." 1234 "Cadogan"        6
    2001 "ABC inc." 1234 "Anderson"   24364
    2001 "ABC inc." 1234 "Davis"       3879
    2001 "ABC inc." 1234 "Sniff"      25854
    2001 "ABC inc." 1234 "Switz"      10203
    2001 "ABC inc." 1234 "Barker"     24362
    2001 "ABC inc." 1234 "Ford"         913
    2001 "ABC inc." 1234 "Roscitt"    22879
    2000 "DEF inc." 2184 "London"     24784
    2000 "DEF inc." 2184 "Miller"      6175
    2000 "DEF inc." 2184 "Fenn"        6177
    2000 "DEF inc." 2184 "Freeland"   23443
    2000 "DEF inc." 2184 "Schoonover" 24785
    2000 "DEF inc." 2184 "Lenzmeier"   6176
    2000 "DEF inc." 2184 "Keskey"     15375
    2000 "DEF inc." 2184 "Walden"     21726
    2000 "DEF inc." 2184 "Schulze"      126
    2001 "DEF inc." 2184 "Fenn"        6177
    2001 "DEF inc." 2184 "London"     24784
    2001 "DEF inc." 2184 "Keskey"     15375
    2001 "DEF inc." 2184 "Freeland"   23443
    2001 "DEF inc." 2184 "Lenzmeier"   6176
    2001 "DEF inc." 2184 "Jackson"    13283
    2001 "DEF inc." 2184 "Schoonover" 24785
    2001 "DEF inc." 2184 "Miller"      6175
    2001 "DEF inc." 2184 "Schulze"      126
    end
    
    by compid personid (year), sort: gen byte new = _n == 1
    by compid personid (year): gen byte exiting = _n == _N
    sort year compid
    Now, there is some ambiguity about the variable new in year 2000. Since that is the first year in the data, everybody present in 2000 is marked as new. But you don't really know if they are entering in 2000 or were there before. So you might want to do something like -replace new = . if year == 2000-. Or you might want to work with it the way it is now. It really depends on what your research goals are.

    Additionally, in the future please use -dataex- to post example data, as I have done above. While the particular table you set out in your post imported easily enough into Stata, often they can be very difficult to work with. Run -ssc install dataex- to install the -dataex- command and then run -help dataex- to read the simple instructions for using it. By using -dataex- you enable those who want to help you to create a completely faithful replica of your example data in Stata with just a simple copy/paste operation.

    Added: The code I showed would not identify the situation where somebody leaves and then returns later. If that situation exists in your data and needs to be dealt with, post back and I can modify the code accordingly.

    Comment


    • #3
      You can lag each observation by one year and merge back with the original data for find out precisely who is in both, who's new, and who left.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int year str8 company int compid str10 personname int personid
      2000 "ABC inc." 1234 "Anderson"   24364
      2000 "ABC inc." 1234 "Hemp"       22880
      2000 "ABC inc." 1234 "Davis"       3879
      2000 "ABC inc." 1234 "Switz"      10203
      2000 "ABC inc." 1234 "Cadogan"        6
      2000 "ABC inc." 1234 "Ford"         913
      2000 "ABC inc." 1234 "Barker"     24362
      2000 "ABC inc." 1234 "Sobti"      20624
      2001 "ABC inc." 1234 "O'Brien"    24363
      2001 "ABC inc." 1234 "Hemp"       22880
      2001 "ABC inc." 1234 "Cadogan"        6
      2001 "ABC inc." 1234 "Anderson"   24364
      2001 "ABC inc." 1234 "Davis"       3879
      2001 "ABC inc." 1234 "Sniff"      25854
      2001 "ABC inc." 1234 "Switz"      10203
      2001 "ABC inc." 1234 "Barker"     24362
      2001 "ABC inc." 1234 "Ford"         913
      2001 "ABC inc." 1234 "Roscitt"    22879
      2000 "DEF inc." 2184 "London"     24784
      2000 "DEF inc." 2184 "Miller"      6175
      2000 "DEF inc." 2184 "Fenn"        6177
      2000 "DEF inc." 2184 "Freeland"   23443
      2000 "DEF inc." 2184 "Schoonover" 24785
      2000 "DEF inc." 2184 "Lenzmeier"   6176
      2000 "DEF inc." 2184 "Keskey"     15375
      2000 "DEF inc." 2184 "Walden"     21726
      2000 "DEF inc." 2184 "Schulze"      126
      2001 "DEF inc." 2184 "Fenn"        6177
      2001 "DEF inc." 2184 "London"     24784
      2001 "DEF inc." 2184 "Keskey"     15375
      2001 "DEF inc." 2184 "Freeland"   23443
      2001 "DEF inc." 2184 "Lenzmeier"   6176
      2001 "DEF inc." 2184 "Jackson"    13283
      2001 "DEF inc." 2184 "Schoonover" 24785
      2001 "DEF inc." 2184 "Miller"      6175
      2001 "DEF inc." 2184 "Schulze"      126
      end
      save "master.dta", replace
      
      * check that year company personid uniquely identify observations
      isid year company personid, sort
      
      * lag data by one year
      keep year company personid personname
      replace year = year - 1
      rename personname Lpersonname
      drop if year == year[1]
      save "lagged.dta", replace
      
      use "master.dta"
      merge 1:1 year company personid using "lagged.dta"
      sort year company personid
      
      * calculate totals 
      by year company: egen inboth = total(_merge == 3)
      by year company: egen previous_only = total(_merge == 2)
      by year company: egen current_only = total(_merge == 1)

      Comment


      • #4
        Dear Clyde, dear Robert,
        thank you very much for your quick responses and help!!
        I first tried to implement Clyde’s command code by doing the following:
        My research starts in the year 2000 – but I do have the team composition data for all companies of interest for the year 1999, too!
        So I appended team compositions of the year 1999 for all companies to my main-dataset.
        Then I executed the following commands:
        use maindataset.dta
        append using 1999er_teams

        by compid personname (year), sort: gen byte new = _n == 1
        by compid personname (year): gen byte exiting = _n == _N
        sort comp fiscyear
        drop if fiscyear==1999

        As I don’t need the year 1999 I just dropped it afterwards 😊

        A point that came to my mind about the variables new and exiting was:
        • New shows a 1 in the respective year the new name appears the first time which is totally fine!
        • But Exiting shows a 1 in the year the name is there the last time (but in that year the person is still there and might not be exiting.) The 1 should be seen as belonging to the following year, because the name disappears the next year and not the current year where the person is there for the last time.
        I am not sure if this might be an issue.
        Maybe the variable Exiting should be lagged one year - but how can this be done when there is no name to assign a "1" to in the next year? (Is there a formula that would do this for me?)
        If not, I will leave it like that.



        So now I almost have what I wanted.
        In a last step, I need to calculate a variable called fluctuation or turnover.
        I want to calculate it as a percentage of the “1s” summed up over the 2 variables “new & exiting” and divided by the respective team sizes in this group.
        year new exiting teamsize sum new&exiting fluctuation
        2000 1 0 6 3 0.5
        2000 0 0 6 3 0.5
        2000 0 0 6 3 0.5
        2000 0 0 6 3 0.5
        2000 1 0 6 3 0.5
        2000 0 1 6 3 0.5
        (I installed dataex but I didn't know how to convert this table here, so I hope it is ok for this time)

        Again, Thank You both very much for your time and help - I know my questions may be really low-level - but I am so glad, that you help me anyway!
        I really appreciate that!

        Comment


        • #5
          Originally posted by Clyde Schechter View Post
          So I think the following code does what you want:
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input int year str8 company int compid str10 personname int personid
          2000 "ABC inc." 1234 "Anderson" 24364
          2000 "ABC inc." 1234 "Hemp" 22880
          2000 "ABC inc." 1234 "Davis" 3879
          2000 "ABC inc." 1234 "Switz" 10203
          2000 "ABC inc." 1234 "Cadogan" 6
          2000 "ABC inc." 1234 "Ford" 913
          2000 "ABC inc." 1234 "Barker" 24362
          2000 "ABC inc." 1234 "Sobti" 20624
          2001 "ABC inc." 1234 "O'Brien" 24363
          2001 "ABC inc." 1234 "Hemp" 22880
          2001 "ABC inc." 1234 "Cadogan" 6
          2001 "ABC inc." 1234 "Anderson" 24364
          2001 "ABC inc." 1234 "Davis" 3879
          2001 "ABC inc." 1234 "Sniff" 25854
          2001 "ABC inc." 1234 "Switz" 10203
          2001 "ABC inc." 1234 "Barker" 24362
          2001 "ABC inc." 1234 "Ford" 913
          2001 "ABC inc." 1234 "Roscitt" 22879
          2000 "DEF inc." 2184 "London" 24784
          2000 "DEF inc." 2184 "Miller" 6175
          2000 "DEF inc." 2184 "Fenn" 6177
          2000 "DEF inc." 2184 "Freeland" 23443
          2000 "DEF inc." 2184 "Schoonover" 24785
          2000 "DEF inc." 2184 "Lenzmeier" 6176
          2000 "DEF inc." 2184 "Keskey" 15375
          2000 "DEF inc." 2184 "Walden" 21726
          2000 "DEF inc." 2184 "Schulze" 126
          2001 "DEF inc." 2184 "Fenn" 6177
          2001 "DEF inc." 2184 "London" 24784
          2001 "DEF inc." 2184 "Keskey" 15375
          2001 "DEF inc." 2184 "Freeland" 23443
          2001 "DEF inc." 2184 "Lenzmeier" 6176
          2001 "DEF inc." 2184 "Jackson" 13283
          2001 "DEF inc." 2184 "Schoonover" 24785
          2001 "DEF inc." 2184 "Miller" 6175
          2001 "DEF inc." 2184 "Schulze" 126
          end
          
          by compid personid (year), sort: gen byte new = _n == 1
          by compid personid (year): gen byte exiting = _n == _N
          sort year compid
          Now, there is some ambiguity about the variable new in year 2000. Since that is the first year in the data, everybody present in 2000 is marked as new. But you don't really know if they are entering in 2000 or were there before. So you might want to do something like -replace new = . if year == 2000-. Or you might want to work with it the way it is now. It really depends on what your research goals are.

          Additionally, in the future please use -dataex- to post example data, as I have done above. While the particular table you set out in your post imported easily enough into Stata, often they can be very difficult to work with. Run -ssc install dataex- to install the -dataex- command and then run -help dataex- to read the simple instructions for using it. By using -dataex- you enable those who want to help you to create a completely faithful replica of your example data in Stata with just a simple copy/paste operation.

          Added: The code I showed would not identify the situation where somebody leaves and then returns later. If that situation exists in your data and needs to be dealt with, post back and I can modify the code accordingly.
          Dear Clyde, dear Robert,
          thank you very much for your quick responses and help!!
          I first tried to implement Clyde’s command code by doing the following:
          My research starts in the year 2000 – but I do have the team composition data for all companies of interest for the year 1999, too!
          So I appended team compositions of the year 1999 for all companies to my main-dataset.
          Then I executed the following commands:
          use maindataset.dta
          append using 1999er_teams

          by compid personname (year), sort: gen byte new = _n == 1
          by compid personname (year): gen byte exiting = _n == _N
          sort comp fiscyear
          drop if fiscyear==1999

          As I don’t need the year 1999 I just dropped it afterwards 😊

          A point that came to my mind about the variables new and exiting was:
          • New shows a 1 in the respective year the new name appears the first time which is totally fine!
          • But Exiting shows a 1 in the year the name is there the last time (but in that year the person is still there and might not be exiting.) The 1 should be seen as belonging to the following year, because the name disappears the next year and not the current year where the person is there for the last time.
          I am not sure if this might be an issue.
          Maybe the variable Exiting should be lagged one year - but how can this be done when there is no name to assign a "1" to in the next year? (Is there a formula that would do this for me?)
          If not, I will leave it like that.



          So now I almost have what I wanted.
          In a last step, I need to calculate a variable called fluctuation or turnover.
          I want to calculate it as a percentage of the “1s” summed up over the 2 variables “new & exiting” and divided by the respective team sizes in this group.
          year new exiting teamsize sum new&exiting fluctuation
          2000 1 0 6 3 0.5
          2000 0 0 6 3 0.5
          2000 0 0 6 3 0.5
          2000 0 0 6 3 0.5
          2000 1 0 6 3 0.5
          2000 0 1 6 3 0.5
          (I installed dataex but I didn't know how to convert this table here, so I hope it is ok for this time)

          Again, Thank You both very much for your time and help - I know my questions may be really low-level - but I am so glad, that you help me anyway!
          I really appreciate that!

          Comment


          • #6
            Well, -gen fluctuation = (new + exiting)/teamsize-; but I think you knew that and that isn't what you mean to ask. I guess the question is how to create the team size variable. And I'm not sure how to answer that question based on your data. Is it the total number of distinct personid's within a company across all years? If so:

            Code:
            by compid personid, sort: gen distinct_persons = 1 if _n == 1
            by compid: replace distinct_persons = sum(distinct_persons)
            by compid: gen teamsize = distinct_persons[_N]

            Comment


            • #7
              Originally posted by Clyde Schechter View Post
              Well, -gen fluctuation = (new + exiting)/teamsize-; but I think you knew that and that isn't what you mean to ask. I guess the question is how to create the team size variable. And I'm not sure how to answer that question based on your data. Is it the total number of distinct personid's within a company across all years? If so:

              Code:
              by compid personid, sort: gen distinct_persons = 1 if _n == 1
              by compid: replace distinct_persons = sum(distinct_persons)
              by compid: gen teamsize = distinct_persons[_N]
              Dear Clyde,

              thank you for your response!
              The Teamsize I already calculated with the following formular:

              *************************************
              * *
              * Variable Team Size *
              * *
              *************************************
              by compid fiscyear, sort: generate teamsize= _N

              This worked out for me when I checked everything with the browse-window.



              What I want to do now is:
              For a company and a respective fiscalyear - sum up all exits and entries (exiting+new) and divide this sum by the teamsizevariable.
              If you look at the table here you cann see what I mean.
              I used your formula: by company fiscyear, sort: gen turnover = (new+exiting)/teamsize and this gave me the variable "turnover WRONG" (just to show you here)
              What I want is the variable with an output like in "turnover CORRECT" - summing up all new&exiting for one company in one fiscal year and then divide by the respective teamsize in this year, which is of course the same number in every line of that year (paneldata)
              company fiscyear execname new exiting teamsize turnover WRONG turnover
              CORRECT
              ABC inc. 2000 Mr.A 0 0 8 0 0,375
              ABC inc. 2000 Mr. B 0 0 8 0 0,375
              ABC inc. 2000 Mr. D 0 0 8 0 0,375
              ABC inc. 2000 Mr. E 1 0 8 .125 0,375
              ABC inc. 2000 Mr. F 0 0 8 0 0,375
              ABC inc. 2000 Mr. C 1 0 8 .125 0,375
              ABC inc. 2000 Mr. G 0 1 8 .125 0,375
              ABC inc. 2000 Mr. H 0 0 8 0 0,375
              ABC inc. 2001 Mr. I 1 0 10 .1 0,7
              ABC inc. 2001 Mr.A 0 1 10 .1 0,7
              ABC inc. 2001 Mr. H 0 1 10 .1 0,7
              ABC inc. 2001 Mr. J 1 0 10 .1 0,7
              ABC inc. 2001 Mr. F 0 1 10 .1 0,7
              ABC inc. 2001 Mr. E 0 0 10 0 0,7
              ABC inc. 2001 Mr. D 0 0 10 0 0,7
              ABC inc. 2001 Mr. C 0 0 10 0 0,7
              ABC inc. 2001 Mr. K 1 0 10 .1 0,7
              ABC inc. 2001 Mr. B 0 1 10 .1 0,7

              Thank you so much for helping me out!
              Kind regards,
              Svenja

              Comment


              • #8
                Code:
                by company fiscyear, sort: egen turnover = total(new+exiting)
                replace turnover = turnover/teamsize
                Do read the manual section on -egen-. It is bristling with useful data management functions that will help you in your work with Stata on a regular basis.

                Comment


                • #9
                  Originally posted by Clyde Schechter View Post
                  Code:
                  by company fiscyear, sort: egen turnover = total(new+exiting)
                  replace turnover = turnover/teamsize
                  Do read the manual section on -egen-. It is bristling with useful data management functions that will help you in your work with Stata on a regular basis.
                  Thank you very much Clyde!
                  This worked out perfectly for me and solved my problem!

                  I will read this manual section.

                  Have a nice day

                  Comment


                  • #10
                    Originally posted by Svenja Marie View Post

                    Thank you very much Clyde!
                    This worked out perfectly for me and solved my problem!

                    I will read this manual section.

                    Have a nice day
                    Hi, this code perfectly solves the problem with the given data set. But, when we add one more year (say 2002) and some of the exiting people reenter into the team in this fiscal year(2002), there occurs a problem with these people in recording them as "new" this year. How can modify the code with alternative scenario?


                    Demet Korkut

                    Comment


                    • #11
                      Hi, this code perfectly solves the problem with the given data set. But, when we add one more year (say 2002) and some of the exiting people reenter into the team in this fiscal year(2002), there occurs a problem with these people in recording them as "new" this year. How can modify the code with alternative scenario?
                      Please post a new example data set (using -dataex-, of course) that illustrates this problem. And elaborate what your concern is. When some of the exiting people reenter, why would you not count them as "new" for the purposes of calculating turnover? And what would you do with them instead? What if they returned 2 years later? Or 5? Or 10?....

                      Comment


                      • #12
                        Originally posted by Clyde Schechter View Post
                        Please post a new example data set (using -dataex-, of course) that illustrates this problem. And elaborate what your concern is. When some of the exiting people reenter, why would you not count them as "new" for the purposes of calculating turnover? And what would you do with them instead? What if they returned 2 years later? Or 5? Or 10?....
                        I simply modified the data set in this post according my question:
                        clear
                        input int year str8 company int compid str10 personname int personid byte(new exiting)
                        2000 "ABC inc." 1234 "Ford" 913 1 0
                        2000 "ABC inc." 1234 "Switz" 10203 1 0
                        2000 "ABC inc." 1234 "Barker" 24362 1 0
                        2000 "ABC inc." 1234 "Hemp" 22880 1 0
                        2000 "ABC inc." 1234 "Cadogan" 6 1 0
                        2000 "ABC inc." 1234 "Anderson" 24364 1 0
                        2000 "ABC inc." 1234 "Davis" 3879 1 0
                        2000 "ABC inc." 1234 "Sobti" 20624 1 0
                        2001 "ABC inc." 1234 "Davis" 3879 0 0
                        2001 "ABC inc." 1234 "Sniff" 25854 1 1
                        2001 "ABC inc." 1234 "Roscitt" 22879 1 1
                        2001 "ABC inc." 1234 "O'Brien" 24363 1 1
                        2001 "ABC inc." 1234 "Switz" 10203 0 0
                        2001 "ABC inc." 1234 "Ford" 913 0 0
                        2001 "ABC inc." 1234 "Barker" 24362 0 1
                        2001 "ABC inc." 1234 "Hemp" 22880 0 0
                        2001 "ABC inc." 1234 "Anderson" 24364 0 0
                        2001 "ABC inc." 1234 "Cadogan" 6 0 1
                        2002 "ABC inc." 1234 "Anderson" 24364 0 1
                        2002 "ABC inc." 1234 "Sobti" 20624 0 1
                        2002 "ABC inc." 1234 "Switz" 10203 0 1
                        2002 "ABC inc." 1234 "Hemp" 22880 0 1
                        2002 "ABC inc." 1234 "Ford" 913 0 1
                        2002 "ABC inc." 1234 "B" 33 1 1
                        2002 "ABC inc." 1234 "A" 22 1 1
                        2002 "ABC inc." 1234 "Davis" 3879 0 1
                        2003 "ABC inc." 1234 "D" 234 1 1
                        2003 "ABC inc." 1234 "C" 21 1 1
                        2000 "DEF inc." 2184 "London" 24784 1 0
                        2000 "DEF inc." 2184 "Miller" 6175 1 0
                        2000 "DEF inc." 2184 "Freeland" 23443 1 0
                        2000 "DEF inc." 2184 "Schoonover" 24785 1 0
                        2000 "DEF inc." 2184 "Schulze" 126 1 0
                        2000 "DEF inc." 2184 "Fenn" 6177 1 0
                        2000 "DEF inc." 2184 "Keskey" 15375 1 0
                        2000 "DEF inc." 2184 "Walden" 21726 1 1
                        2000 "DEF inc." 2184 "Lenzmeier" 6176 1 0
                        2001 "DEF inc." 2184 "Keskey" 15375 0 1
                        2001 "DEF inc." 2184 "London" 24784 0 1
                        2001 "DEF inc." 2184 "Schulze" 126 0 1
                        2001 "DEF inc." 2184 "Freeland" 23443 0 0
                        2001 "DEF inc." 2184 "Jackson" 13283 1 0
                        2001 "DEF inc." 2184 "Lenzmeier" 6176 0 1
                        2001 "DEF inc." 2184 "Fenn" 6177 0 1
                        2001 "DEF inc." 2184 "Schoonover" 24785 0 0
                        2001 "DEF inc." 2184 "Miller" 6175 0 1
                        2003 "" 2184 "Schoonover" 24785 0 1
                        2003 "" 2184 "Jackson" 13283 0 1
                        2003 "" 2184 "Freeland" 23443 0 1
                        end
                        [/CODE](
                        I run the code you recommended above and obtained "new" and "exiting" variables. Now take the person name "Sobti" with Id "20624". In my dataset covering 2000-2003, he exit in 2000 (because dos not appear in 2001) and reenters in 2002 again. But the observation point in "new" and "exiting" variables do not correspond his situation. For the year 2000, "no exit" appears in this scenario, but actually it is.

                        Actually, my original dataset which I work on contain firms' exixtence and accounts within several sectors covering 2006-2020. I have already posted my sample dataset here (Statalist) with the title "Entry and exit rates of firms in various years" (or like that). For each year I want to calculate entry and exit rates ((enty+exit)/total number of firms). I wrote in this post because it is completely relevant with my purpose. With code I obtained a result that once a firm enters into the sector, it seems it operates till the last year (2020) even if it exits and reenter in various years between 20006-2020.

                        Demet

                        Comment


                        • #13
                          Sorry, I copied wrong in my last post. here is the relevant dataset:

                          Code:
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input int year str8 company int compid str10 personname int personid byte(new exiting)
                          2000 "ABC inc." 1234 "Ford"         913 1 0
                          2000 "ABC inc." 1234 "Switz"      10203 1 0
                          2000 "ABC inc." 1234 "Barker"     24362 1 0
                          2000 "ABC inc." 1234 "Hemp"       22880 1 0
                          2000 "ABC inc." 1234 "Cadogan"        6 1 0
                          2000 "ABC inc." 1234 "Anderson"   24364 1 0
                          2000 "ABC inc." 1234 "Davis"       3879 1 0
                          2000 "ABC inc." 1234 "Sobti"      20624 1 0
                          2001 "ABC inc." 1234 "Davis"       3879 0 0
                          2001 "ABC inc." 1234 "Sniff"      25854 1 1
                          2001 "ABC inc." 1234 "Roscitt"    22879 1 1
                          2001 "ABC inc." 1234 "O'Brien"    24363 1 1
                          2001 "ABC inc." 1234 "Switz"      10203 0 0
                          2001 "ABC inc." 1234 "Ford"         913 0 0
                          2001 "ABC inc." 1234 "Barker"     24362 0 1
                          2001 "ABC inc." 1234 "Hemp"       22880 0 0
                          2001 "ABC inc." 1234 "Anderson"   24364 0 0
                          2001 "ABC inc." 1234 "Cadogan"        6 0 1
                          2002 "ABC inc." 1234 "Anderson"   24364 0 1
                          2002 "ABC inc." 1234 "Sobti"      20624 0 1
                          2002 "ABC inc." 1234 "Switz"      10203 0 1
                          2002 "ABC inc." 1234 "Hemp"       22880 0 1
                          2002 "ABC inc." 1234 "Ford"         913 0 1
                          2002 "ABC inc." 1234 "B"             33 1 1
                          2002 "ABC inc." 1234 "A"             22 1 1
                          2002 "ABC inc." 1234 "Davis"       3879 0 1
                          2003 "ABC inc." 1234 "D"            234 1 1
                          2003 "ABC inc." 1234 "C"             21 1 1
                          2000 "DEF inc." 2184 "London"     24784 1 0
                          2000 "DEF inc." 2184 "Miller"      6175 1 0
                          2000 "DEF inc." 2184 "Freeland"   23443 1 0
                          2000 "DEF inc." 2184 "Schoonover" 24785 1 0
                          2000 "DEF inc." 2184 "Schulze"      126 1 0
                          2000 "DEF inc." 2184 "Fenn"        6177 1 0
                          2000 "DEF inc." 2184 "Keskey"     15375 1 0
                          2000 "DEF inc." 2184 "Walden"     21726 1 1
                          2000 "DEF inc." 2184 "Lenzmeier"   6176 1 0
                          2001 "DEF inc." 2184 "Keskey"     15375 0 1
                          2001 "DEF inc." 2184 "London"     24784 0 1
                          2001 "DEF inc." 2184 "Schulze"      126 0 1
                          2001 "DEF inc." 2184 "Freeland"   23443 0 0
                          2001 "DEF inc." 2184 "Jackson"    13283 1 0
                          2001 "DEF inc." 2184 "Lenzmeier"   6176 0 1
                          2001 "DEF inc." 2184 "Fenn"        6177 0 1
                          2001 "DEF inc." 2184 "Schoonover" 24785 0 0
                          2001 "DEF inc." 2184 "Miller"      6175 0 1
                          2003 ""         2184 "Schoonover" 24785 0 1
                          2003 ""         2184 "Jackson"    13283 0 1
                          2003 ""         2184 "Freeland"   23443 0 1
                          end

                          Comment


                          • #14
                            So, I take it your question is how to fix the new and exiting variables because they do not currently reflect the actual entry and exit status.

                            Code:
                            drop new exiting
                            xtset personid year
                            gen byte new = L1.compid != compid
                            gen byte exiting = F1.compid != compid

                            Comment


                            • #15
                              Originally posted by Clyde Schechter View Post
                              So, I take it your question is how to fix the new and exiting variables because they do not currently reflect the actual entry and exit status.

                              Code:
                              drop new exiting
                              xtset personid year
                              gen byte new = L1.compid != compid
                              gen byte exiting = F1.compid != compid
                              Thank you, this code works well.

                              Comment

                              Working...
                              X