Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Nested loops for Blau Index Nationalities (big dataset)

    Hi all,

    For information purposes: I am using STATA/MP version 15.1.

    I had a question regarding (nested) loops in Stata to calculate the Blau Index Nationality Diversity within a team.
    However, I am not able to figure out how to write my loops to handle the data correctly.
    I want to create a nationality diversity variable for each year/month... for the present team members.
    In other words if member was present (year`i'== 1) than calculate the nationality diversity.
    But, I am not sure how to adapt the following code:


    Code:
    .forvalues i= 0/5 {
    
    tabulate Nationality if year`i'== 1, generate(Nationality_Year`i'_)
    }
    
    foreach var of varlist Nationality_Year* {
    bysort BvdIdNumber: egen sum_`var' =total(`var')
    }
    
    drop Nationality_Year*
    
    forvalues i= 0/5 {
    egen total_Nationality_Year`i' = rowtotal(sum_Nationality_Year`i'_*)
    }
    
    forvalues j = 1/150 {
        forvalues i = 0/5 {
    gen p_year`i'_`j' = (sum_Nationality_Year`i'_`j'/ total_Nationality_Year`i')^2
    
    *drop sum_Nationality_Year*
    *drop total_Nationality_Year*
        }
    }
    
    forvalues i = 0/5 {
    egen sum_squares_year`i'= rowtotal( p_year`i'_`j')
    egen sum_squares_year`i'= rowtotal( p_year`i'_`j')
    egen sum_squares_year`i'= rowtotal( p_year`i'_`j')
    egen sum_squares_year`i'= rowtotal( p_year`i'_`j')
    egen sum_squares_year`i'= rowtotal( p_year`i'_`j')
    egen sum_squares_year`i'= rowtotal( p_year`i'_`j')
    }
    
    
    *drop p_*
    
    forvalues j = 1/150 {
        forvalues i = 0/5 {
    gen Nationality_Year`i' = 1 - sum_squares_year`i'
    gen Nationality_Year`i'= 1 - sum_squares_year`i'
    gen Nationality_Year`i'= 1 - sum_squares_year`i'
    gen Nationality_Year`i'= 1 - sum_squares_year`i'
    gen Nationality_Year`i'= 1 - sum_squares_year`i'
    gen Nationality_Year`i' = 1 - sum_squares_year`i'
    
    *drop sum_squares_*
        }
    }
    I used forvalues and select a value greater than the total variety of nationalities in my dataset, as I am not able to figure out how to integrate' foreach var' well due to my lack of knowledge.

    I based myself mainly on the following discussion post, post #2:
    https://www.statalist.org/forums/forum/general-stata-discussion/general/1495309-blau-index


    Would someone be able to help me with this issue/ guide me in the right direction? If there is a better/easier solution than this, I am very happy to learn.
    If any more info is needed/better explanation, let me know!
    Thanks in advance!

    Best regards,
    Laura


    -------------------------------------------------------------------------------------------

    Below an excerpt of my data using dataex:

    BvdIdNumber = company number
    DMUci = employee number
    year... = dummy variable that tells me if member was present year 0,1...,16
    Nationality = nationality of team member


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str16 BvdIdNumber str10 DMUci str27 Nationality float(year0 year1 year2 year3 year4 year5 year6 year7 year8 year9 year10 year11 year12 year13 year14 year15 year16)
    "AT9070422953" "P300272480" "Austria"        0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "AT9070422953" "P301627755" "Austria"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0501515635" "P144431731" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501515635" "P238808799" "Belgium"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0501562254" "P044740365" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P044740365" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P044740365" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P123993699" "Belgium"        0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P123993699" "Belgium"        0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P123993699" "Belgium"        0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0505804718" "P103923129" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0505804718" "P103923129" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0505804718" "P238328579" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0505804718" "P238328579" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0505804718" "P246173452" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0505804718" "P246173452" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0518917237" "P042037775" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0518917237" "P042037775" "Belgium"        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0518917237" "P044189380" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0536436625" "P009351640" "Canada"         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P009351640" "Canada"         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P009351700" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P009351700" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P014424187" "Canada"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0536436625" "P014424187" "Canada"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0536436625" "P048688796" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P048688796" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0555980046" "P039659380" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0555980046" "P076509994" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P141888237" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P141888237" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P349820174" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P349820174" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0657890622" "P077578372" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0657890622" "P350651532" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0664799594" "P006533911" "Greece"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0664799594" "P006533913" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0664799594" "P263565204" "Greece"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0810603264" "P123366320" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0810603264" "P123366320" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0810603264" "P123574091" "Belgium"        0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0810603264" "P123574091" "Belgium"        0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0812211682" "P046290225" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0812211682" "P046290228" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001068111" "Germany"        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001068111" "Germany"        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P010015992" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P010015992" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P194256668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P194256668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P210520024" "United States"  0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P210520024" "United States"  0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P300158435" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P300158435" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P415739933" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P415739933" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P469602315" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P469602315" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822726086" "P049564102" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P049564102" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P049564102" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P063938116" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P063938116" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P063938116" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P005145231" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P005145231" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P005145231" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P041721435" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P041721435" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P041721435" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P005223488" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P005223488" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P007073447" "France"         0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P007073447" "France"         0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P011838447" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P011838447" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P019045531" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P019045531" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P116393668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P116393668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P116393668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    end

    I used the following code when only accounting for the year the company started/failed (so without looping) and it seems to work, but might not be perfect.

    Code:
    // Year of founding
    
    tabulate Nationality if FoundingTeam == 1, generate(nationality_founding_)
    foreach var of varlist nationality_founding_* {
    bysort BvdIdNumber: egen sum`var' =total(`var')
    }
    
    drop nationality_*
    egen totalnationality_founding = rowtotal(sumnationality_founding_*)
    forvalues i = 1/177 {
    gen p_founding_`i' = (sumnationality_founding_`i'/ totalnationality_founding )^2
    }
    
    drop sumnationality*
    drop totalnationality*
    
    egen sum_squares_founding = rowtotal(p_founding_*)
    drop p_*
    
    gen Nationality_founding = 1 - sum_squares_founding
    drop sum_squares_*
    
    
    // Year of failure
    tabulate Nationality if FailureTeam == 1, generate(nationality_failure_)
    foreach var of varlist nationality_failure_* {
    bysort BvdIdNumber: egen sum`var' =total(`var')
    }
    
    drop nationality_*
    egen totalnationality_failure = rowtotal(sumnationality_failure_*)
    forvalues i = 1/177 {
    gen p_failure_`i' = (sumnationality_failure_`i'/ totalnationality_failure )^2
    }
    
    drop sumnationality*
    drop totalnationality*
    
    egen sum_squares_failure = rowtotal(p_failure_*)
    drop p_*
    
    gen Nationality_failure = 1 - sum_squares_failure
    drop sum_squares_*


    Last edited by Laura Hill; 23 Mar 2023, 03:05. Reason: Added tags

  • #2
    It's hard for me to follow this lengthy code. As I understand it the Blau index carries the same information as the Hirschman-Herfindahl index from economics, the reason for these names being that the measure was (essentially) invented much earlier by Gini (whose name is also used, and even more confusingly for several quite different measures). Whatever the beast is called

    https://www.statalist.org/forums/for...f-observations

    gives concise code for the HHI and gives the flavour I think is helpful here, namely egen is your friend.

    Backing up: I understand the problem to be

    calculate Blau for each company and each year, taking account only of employees present in that year

    I did this with your very helpful data example:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str16 BvdIdNumber str10 DMUci str27 Nationality float(year0 year1 year2 year3 year4 year5 year6 year7 year8 year9 year10 year11 year12 year13 year14 year15 year16)
    "AT9070422953" "P300272480" "Austria"        0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "AT9070422953" "P301627755" "Austria"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0501515635" "P144431731" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501515635" "P238808799" "Belgium"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0501562254" "P044740365" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P044740365" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P044740365" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P123993699" "Belgium"        0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P123993699" "Belgium"        0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0501562254" "P123993699" "Belgium"        0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0505804718" "P103923129" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0505804718" "P103923129" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0505804718" "P238328579" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0505804718" "P238328579" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0505804718" "P246173452" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0505804718" "P246173452" "Japan"          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0518917237" "P042037775" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0518917237" "P042037775" "Belgium"        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0518917237" "P044189380" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0536436625" "P009351640" "Canada"         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P009351640" "Canada"         0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P009351700" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P009351700" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P014424187" "Canada"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0536436625" "P014424187" "Canada"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0536436625" "P048688796" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0536436625" "P048688796" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0555980046" "P039659380" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0555980046" "P076509994" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P141888237" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P141888237" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P349820174" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0656918246" "P349820174" "Bulgaria"       1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0657890622" "P077578372" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0657890622" "P350651532" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0664799594" "P006533911" "Greece"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0664799594" "P006533913" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0664799594" "P263565204" "Greece"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P043459088" "Denmark"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0808687713" "P068611593" "Denmark"        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0810603264" "P123366320" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0810603264" "P123366320" "Belgium"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0810603264" "P123574091" "Belgium"        0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0810603264" "P123574091" "Belgium"        0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0812211682" "P046290225" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0812211682" "P046290228" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001050215" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001068111" "Germany"        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P001068111" "Germany"        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0818085924" "P194531680" "Germany"        1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P010015992" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P010015992" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P194256668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P194256668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P210520024" "United States"  0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P210520024" "United States"  0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
    "BE0822663730" "P300158435" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P300158435" "United Kingdom" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P415739933" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P415739933" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P469602315" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822663730" "P469602315" "United States"  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    "BE0822726086" "P049564102" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P049564102" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P049564102" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P063938116" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P063938116" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0822726086" "P063938116" "Cyprus"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P005145231" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P005145231" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P005145231" "United Kingdom" 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P041721435" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P041721435" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0825343207" "P041721435" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P005223488" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P005223488" "France"         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P007073447" "France"         0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0830917737" "P007073447" "France"         0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P011838447" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P011838447" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P019045531" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P019045531" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P116393668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P116393668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    "BE0835023906" "P116393668" "United States"  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    end
    
    * see help for -egen- for what this does 
    egen tag = tag(BvdIdNumber Nationality)
    gen p = . 
    
    quietly forval y = 0/16 {
        * a count is just identical to the sum of an indicator variable 
        bysort BvdIdNumber Nationality : egen n = total(year`y')
        by BvdIdNumber : egen N = total(year`y')
        
        * the proportion is just the ratio of those counts 
        replace p = n/N 
        
        * crucially, each proportion must be used once only
        * values with tag 0 contribute 0 to the total, and are ignored 
        by BvdIdNumber : egen Blau`y' = total(tag * p^2)
        
        * want complement 
        replace Blau`y' = 1 - Blau`y'
        
        * egen doesn't work with replace 
        * we keep n N for the last year looked at so we can check results 
        if `y' < 16 drop n N 
    }
    
    
    
    . list BvdIdNumber Nationality n N Blau16 if year16, sepby(Bvd)
    
         +----------------------------------------------------+
         |  BvdIdNumber      Nationality    n    N     Blau16 |
         |----------------------------------------------------|
      2. | AT9070422953          Austria    1    1          0 |
         |----------------------------------------------------|
      4. | BE0501515635          Belgium    1    1          0 |
         |----------------------------------------------------|
      5. | BE0501562254          Belgium    6    6          0 |
      6. | BE0501562254          Belgium    6    6          0 |
      7. | BE0501562254          Belgium    6    6          0 |
      8. | BE0501562254          Belgium    6    6          0 |
      9. | BE0501562254          Belgium    6    6          0 |
     10. | BE0501562254          Belgium    6    6          0 |
         |----------------------------------------------------|
     11. | BE0505804718          Germany    2    2          0 |
     12. | BE0505804718          Germany    2    2          0 |
         |----------------------------------------------------|
     17. | BE0518917237          Belgium    3    3          0 |
     18. | BE0518917237          Belgium    3    3          0 |
     19. | BE0518917237          Belgium    3    3          0 |
         |----------------------------------------------------|
     20. | BE0536436625           Canada    2    2          0 |
     23. | BE0536436625           Canada    2    2          0 |
         |----------------------------------------------------|
     28. | BE0555980046           France    1    1          0 |
         |----------------------------------------------------|
     30. | BE0656918246         Bulgaria    4    4          0 |
     31. | BE0656918246         Bulgaria    4    4          0 |
     32. | BE0656918246         Bulgaria    4    4          0 |
     33. | BE0656918246         Bulgaria    4    4          0 |
         |----------------------------------------------------|
     34. | BE0657890622           France    2    2          0 |
     35. | BE0657890622           France    2    2          0 |
         |----------------------------------------------------|
     36. | BE0664799594           Greece    2    3   .4444444 |
     37. | BE0664799594           Greece    2    3   .4444444 |
     38. | BE0664799594   United Kingdom    1    3   .4444444 |
         |----------------------------------------------------|
     39. | BE0808687713          Denmark    5    5          0 |
     40. | BE0808687713          Denmark    5    5          0 |
     41. | BE0808687713          Denmark    5    5          0 |
     42. | BE0808687713          Denmark    5    5          0 |
     48. | BE0808687713          Denmark    5    5          0 |
         |----------------------------------------------------|
     49. | BE0810603264          Belgium    4    4          0 |
     50. | BE0810603264          Belgium    4    4          0 |
     51. | BE0810603264          Belgium    4    4          0 |
     52. | BE0810603264          Belgium    4    4          0 |
         |----------------------------------------------------|
     53. | BE0812211682   United Kingdom    2    2          0 |
     54. | BE0812211682   United Kingdom    2    2          0 |
         |----------------------------------------------------|
     55. | BE0818085924          Germany   10   10          0 |
     56. | BE0818085924          Germany   10   10          0 |
     57. | BE0818085924          Germany   10   10          0 |
     58. | BE0818085924          Germany   10   10          0 |
     59. | BE0818085924          Germany   10   10          0 |
     60. | BE0818085924          Germany   10   10          0 |
     61. | BE0818085924          Germany   10   10          0 |
     62. | BE0818085924          Germany   10   10          0 |
     63. | BE0818085924          Germany   10   10          0 |
     64. | BE0818085924          Germany   10   10          0 |
         |----------------------------------------------------|
     67. | BE0822663730    United States    6    6          0 |
     68. | BE0822663730    United States    6    6          0 |
     69. | BE0822663730    United States    6    6          0 |
     71. | BE0822663730    United States    6    6          0 |
     73. | BE0822663730    United States    6    6          0 |
     74. | BE0822663730    United States    6    6          0 |
         |----------------------------------------------------|
     77. | BE0822726086           Cyprus    6    6          0 |
     78. | BE0822726086           Cyprus    6    6          0 |
     79. | BE0822726086           Cyprus    6    6          0 |
     80. | BE0822726086           Cyprus    6    6          0 |
     81. | BE0822726086           Cyprus    6    6          0 |
     82. | BE0822726086           Cyprus    6    6          0 |
         |----------------------------------------------------|
     83. | BE0825343207   United Kingdom    3    6         .5 |
     84. | BE0825343207   United Kingdom    3    6         .5 |
     85. | BE0825343207   United Kingdom    3    6         .5 |
     86. | BE0825343207    United States    3    6         .5 |
     87. | BE0825343207    United States    3    6         .5 |
     88. | BE0825343207    United States    3    6         .5 |
         |----------------------------------------------------|
     89. | BE0830917737           France    4    4          0 |
     90. | BE0830917737           France    4    4          0 |
     91. | BE0830917737           France    4    4          0 |
     92. | BE0830917737           France    4    4          0 |
         |----------------------------------------------------|
     93. | BE0835023906    United States    7    7          0 |
     94. | BE0835023906    United States    7    7          0 |
     95. | BE0835023906    United States    7    7          0 |
     96. | BE0835023906    United States    7    7          0 |
     97. | BE0835023906    United States    7    7          0 |
     98. | BE0835023906    United States    7    7          0 |
     99. | BE0835023906    United States    7    7          0 |
         +----------------------------------------------------+

    Comment


    • #3
      Nick Cox, thank you for your quick reply and help. Also thank you for summarizing my problem more concisely (
      calculate Blau for each company and each year, taking account only of employees present in that year)!

      I will look more closely at your solution/code to understand all the parts, but based of the above list it is indeed providing the desired result!

      Comment


      • #4
        Nick Cox : Side note: I was getting the error that variables were already defined, I adapted the code slightly adding `y' to n and N. This seems to solve the problem and comparing the results to your list provides the same results for year 16.

        Hopefully this was the correct adaptation to make.


        Code:
        egen tag = tag(BvdIdNumber Nationality)
        gen p = .
         
        quietly forval y = 0/16 {
        bys BvdIdNumber Nationality: egen n  = total(year`y')
        by BvdIdNumber : egen N = total(year`y')
        replace p = n`y'/N`y' 
        
        by BvdIdNumber : egen Blau`y' = total(tag * p^2)
        replace Blau`y' = 1 - Blau`y'
        }
        to

        Code:
        egen tag = tag(BvdIdNumber Nationality)
        gen p = .
         
        quietly forval y = 0/16 {
        bys BvdIdNumber Nationality: egen n`y'  = total(year`y')
        by BvdIdNumber : egen N`y'  = total(year`y')
        replace p = n`y'/N`y' 
        
        by BvdIdNumber : egen Blau`y' = total(tag * p^2)
        replace Blau`y' = 1 - Blau`y'
        }

        Code:
         list BvdIdNumber Nationality n16 N16 Blau16 if year16 in 1/99, sepby(Bvd)
        
             +------------------------------------------------------+
             |  BvdIdNumber      Nationality   n16   N16     Blau16 |
             |------------------------------------------------------|
          2. | AT9070422953          Austria     1     1          0 |
             |------------------------------------------------------|
          3. | BE0501515635          Belgium     1     1          0 |
             |------------------------------------------------------|
          5. | BE0501562254          Belgium     6     6          0 |
          6. | BE0501562254          Belgium     6     6          0 |
          7. | BE0501562254          Belgium     6     6          0 |
          8. | BE0501562254          Belgium     6     6          0 |
          9. | BE0501562254          Belgium     6     6          0 |
         10. | BE0501562254          Belgium     6     6          0 |
             |------------------------------------------------------|
         11. | BE0505804718          Germany     2     2          0 |
         12. | BE0505804718          Germany     2     2          0 |
             |------------------------------------------------------|
         17. | BE0518917237          Belgium     3     3          0 |
         18. | BE0518917237          Belgium     3     3          0 |
         19. | BE0518917237          Belgium     3     3          0 |
             |------------------------------------------------------|
         21. | BE0536436625           Canada     2     2          0 |
         22. | BE0536436625           Canada     2     2          0 |
             |------------------------------------------------------|
         28. | BE0555980046           France     1     1          0 |
             |------------------------------------------------------|
         30. | BE0656918246         Bulgaria     4     4          0 |
         31. | BE0656918246         Bulgaria     4     4          0 |
         32. | BE0656918246         Bulgaria     4     4          0 |
         33. | BE0656918246         Bulgaria     4     4          0 |
             |------------------------------------------------------|
         34. | BE0657890622           France     2     2          0 |
         35. | BE0657890622           France     2     2          0 |
             |------------------------------------------------------|
         36. | BE0664799594           Greece     2     3   .4444444 |
         37. | BE0664799594           Greece     2     3   .4444444 |
         38. | BE0664799594   United Kingdom     1     3   .4444444 |
             |------------------------------------------------------|
         39. | BE0808687713          Denmark     5     5          0 |
         40. | BE0808687713          Denmark     5     5          0 |
         41. | BE0808687713          Denmark     5     5          0 |
         42. | BE0808687713          Denmark     5     5          0 |
         46. | BE0808687713          Denmark     5     5          0 |
             |------------------------------------------------------|
         49. | BE0810603264          Belgium     4     4          0 |
         50. | BE0810603264          Belgium     4     4          0 |
         51. | BE0810603264          Belgium     4     4          0 |
         52. | BE0810603264          Belgium     4     4          0 |
             |------------------------------------------------------|
         53. | BE0812211682   United Kingdom     2     2          0 |
         54. | BE0812211682   United Kingdom     2     2          0 |
             |------------------------------------------------------|
         55. | BE0818085924          Germany    10    10          0 |
         56. | BE0818085924          Germany    10    10          0 |
         57. | BE0818085924          Germany    10    10          0 |
         58. | BE0818085924          Germany    10    10          0 |
         59. | BE0818085924          Germany    10    10          0 |
         60. | BE0818085924          Germany    10    10          0 |
         61. | BE0818085924          Germany    10    10          0 |
         62. | BE0818085924          Germany    10    10          0 |
         63. | BE0818085924          Germany    10    10          0 |
         64. | BE0818085924          Germany    10    10          0 |
             |------------------------------------------------------|
         69. | BE0822663730    United States     6     6          0 |
         71. | BE0822663730    United States     6     6          0 |
         72. | BE0822663730    United States     6     6          0 |
         73. | BE0822663730    United States     6     6          0 |
         75. | BE0822663730    United States     6     6          0 |
         76. | BE0822663730    United States     6     6          0 |
             |------------------------------------------------------|
         77. | BE0822726086           Cyprus     6     6          0 |
         78. | BE0822726086           Cyprus     6     6          0 |
         79. | BE0822726086           Cyprus     6     6          0 |
         80. | BE0822726086           Cyprus     6     6          0 |
         81. | BE0822726086           Cyprus     6     6          0 |
         82. | BE0822726086           Cyprus     6     6          0 |
             |------------------------------------------------------|
         83. | BE0825343207   United Kingdom     3     6         .5 |
         84. | BE0825343207   United Kingdom     3     6         .5 |
         85. | BE0825343207   United Kingdom     3     6         .5 |
         86. | BE0825343207    United States     3     6         .5 |
         87. | BE0825343207    United States     3     6         .5 |
         88. | BE0825343207    United States     3     6         .5 |
             |------------------------------------------------------|
         89. | BE0830917737           France     4     4          0 |
         90. | BE0830917737           France     4     4          0 |
         91. | BE0830917737           France     4     4          0 |
         92. | BE0830917737           France     4     4          0 |
             |------------------------------------------------------|
         93. | BE0835023906    United States     7     7          0 |
         94. | BE0835023906    United States     7     7          0 |
         95. | BE0835023906    United States     7     7          0 |
         96. | BE0835023906    United States     7     7          0 |
         97. | BE0835023906    United States     7     7          0 |
         98. | BE0835023906    United States     7     7          0 |
         99. | BE0835023906    United States     7     7          0
        ​​​​​​​
        Best regards,
        Laura

        Comment


        • #5
          It works that way. In my code there was a command within the loop

          Code:
          if `y' < 16 drop n N
          that you didn't copy. That's why my code worked for me.

          In general, fine.

          Comment


          • #6
            My apologies, I wrongly interpreted that part of the code and therefore did not copy it. Thanks for your help!

            Comment


            • #7
              No need to apologise. It is a subtle detail easily overlooked.

              Comment

              Working...
              X