Share of one variables separately

Paris Rira

Join Date: Dec 2022

Posts: 384
#1


11 Feb 2023, 11:13

Good afternoon, dear Statalists,

I would like to compute the share of immigrants from each source country within skill groups (Expgroup & sk_rat_quartile).
To be clear, the variable "nacio" holds nationality: "PT" is native and anything else (!="PT") is an immigrant.
I want the share of immigrants, not natives, by source country. For instance, I need to know the share of English people, who are coded as UK in the "nacio" variable. I need the share of immigrants separately for each country of origin. Any ideas are much appreciated.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float Expgroup byte sk_rat_quartile str2 nacio
2 3 "PT"
5 3 "PT"
7 3 "UK"
7 3 "PT"
7 3 "EU"
7 3 "PT"
5 3 "PT"
4 3 "PT"
7 3 "PT"
5 3 "UK"
2 3 "PT"
6 3 "PT"
5 3 "PT"
4 3 "PT"
7 3 "PT"
4 3 "PT"
8 3 "PT"
7 3 "PT"
5 3 "SW"
5 3 "PT"
4 3 "PT"
8 3 "PT"
6 3 "SP"
4 3 "PT"
4 3 "PT"
5 3 "PT"
7 3 "PT"
7 3 "PT"
2 3 "PT"
5 3 "PT"
4 3 "PR"
3 3 "PT"
7 3 "PT"
4 3 "GR"
8 3 "PT"
4 3 "PT"
6 3 "PT"
2 3 "PT"
7 3 "PT"
6 3 "US"
7 3 "PT"
8 3 "PT"
3 3 "IR"
7 3 "PT"
6 3 "PT"
7 3 "SP"
7 3 "PT"
5 3 "AO"
2 3 "PT"
6 3 "PT"
5 3 "PT"
6 3 "ES"
5 3 "PT"
2 3 "PT"
8 3 "PT"
7 3 "UK"
7 3 "PT"
end

Cheers,

Paris

Bruce Weaver

Join Date: May 2014
Posts: 1130

11 Feb 2023, 11:57

Hello Paris Rira. I'm not certain I understood your question, but does this give what you want?

Code:

. *ssc install fre // Uncomment line to install -fre- if necessary
. generate byte native = nacio=="PT"

. fre nacio if !native

nacio
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   AO    |          1       7.69       7.69       7.69
        ES    |          1       7.69       7.69      15.38
        EU    |          1       7.69       7.69      23.08
        GR    |          1       7.69       7.69      30.77
        IR    |          1       7.69       7.69      38.46
        PR    |          1       7.69       7.69      46.15
        SP    |          2      15.38      15.38      61.54
        SW    |          1       7.69       7.69      69.23
        UK    |          3      23.08      23.08      92.31
        US    |          1       7.69       7.69     100.00
        Total |         13     100.00     100.00           
-----------------------------------------------------------

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)


Paris Rira

Join Date: Dec 2022

Posts: 384
#3

11 Feb 2023, 12:03

Hi Bruce,
Thank you for getting back to me.
What I seek is something like the following:

immigrant_share_UK for skill group i and Expgroup j = (English people)_ij / (total immigrants of all nationalities)_ij

So how can I translate this into Stata code?

Last edited by Paris Rira; 11 Feb 2023, 12:35.

Clyde Schechter

Join Date: Apr 2014
Posts: 30084

11 Feb 2023, 12:04

Code:

by Expgroup sk_rat_quartile nacio, sort: gen numerator = _N
by Expgroup sk_rat_quartile (nacio): gen denominator = _N
gen share_percent = 100*numerator/denominator


Paris Rira

Join Date: Dec 2022

Posts: 384
#5

11 Feb 2023, 12:10

Prof. Clyde,

I need to obtain the share of each foreign nationality one by one, because afterward I will sum them all up. I think your code gives an overall share; it does not address the share of each country/nationality.
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#6

11 Feb 2023, 12:13

Yes it does. Look at, for example, the results for Expgroup 4 sk_rat_quartile 3: it is 11.11% GR, 11.11% PR, and 77.78% PT. There you have it, country by country.

William Lisowski

Join Date: Dec 2014
Posts: 10150

11 Feb 2023, 12:21

Code:

collapse (count) immigrant_share_=Expgroup if nacio!="PT", by(nacio)
summarize immigrant_share_, meanonly
replace immigrant_share_ = immigrant_share_/r(sum)
generate seq = 1
reshape wide immigrant_share_, i(seq) j(nacio) string
drop seq
list, noobs abbreviate(20)

Code:

. list, noobs abbreviate(20)

  +-----------------------------------------------------------------------------------+
  | immigrant_share_AO | immigrant_share_ES | immigrant_share_EU | immigrant_share_GR |
  |           .0769231 |           .0769231 |           .0769231 |           .0769231 |
  |--------------------+--------------------+--------------------+--------------------|
  | immigrant_share_IR | immigrant_share_PR | immigrant_share_SP | immigrant_share_SW |
  |           .0769231 |           .0769231 |           .1538462 |           .0769231 |
  |-----------------------------------------+-----------------------------------------|
  |           immigrant_share_UK            |           immigrant_share_US            |
  |                     .2307692            |                     .0769231            |
  +-----------------------------------------------------------------------------------+

Added in edit: this crossed with posts #4-6, from which I realize that post #1 stated the problem as

I am going to compute the share of immigrants from the source country that are in skill groups (Expgroup & sk_rat_quartile).

while post #3 stated the problem as

What I seek is sth like below:

gen immigrant_share_UK= English_people/ Total immigrants (immigrants of all nationalities)

My code addressed post #3, which makes no reference to skill groups; Clyde's code addressed post #1.

Last edited by William Lisowski; 11 Feb 2023, 12:27.


Paris Rira

Join Date: Dec 2022

Posts: 384
#8

11 Feb 2023, 12:23

Originally posted by Clyde Schechter

Yes it does. Look at, for example, the results for Expgroup 4 sk_rat_quartile 3: it is 11.11% GR, 11.11% PR, and 77.78% PT

How can I exclude PT from the group? I only need the shares for the non-PT nationalities.
Moreover, when I collapse by (Expgroup sk_rat_quartile),

Code:

collapse (sum) share_percent, by(Expgroup sk_rat_quartile)

this gives a sum, but I need a total number according to (Expgroup & sk_rat_quartile).
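One possible way to leave PT out of the denominator, sketched here as a small variation on the code in the earlier reply: count nationalities within each Expgroup x sk_rat_quartile cell as before, but divide by the number of immigrants in the cell rather than by everyone in it. The rows below are the Expgroup 3 and 4 observations copied from the -dataex- example in post #1; the variable names immigrant, n_country, n_immig and imm_share_pct are made up for illustration.

```stata
clear
* Expgroup 3 and 4 rows copied from the -dataex- example in post #1
input float Expgroup byte sk_rat_quartile str2 nacio
4 3 "PT"
4 3 "PT"
4 3 "PT"
4 3 "PT"
4 3 "PT"
4 3 "PT"
4 3 "PR"
3 3 "PT"
4 3 "GR"
4 3 "PT"
3 3 "IR"
end

* Assumption: nacio == "PT" marks natives, everything else is an immigrant
generate byte immigrant = nacio != "PT"

* Numerator: people of the same nationality in the same cell
bysort Expgroup sk_rat_quartile nacio: generate long n_country = _N

* Denominator: immigrants of any nationality in the same cell
by Expgroup sk_rat_quartile: egen long n_immig = total(immigrant)

* Share in percent; left missing for natives
generate double imm_share_pct = 100 * n_country / n_immig if immigrant

list Expgroup sk_rat_quartile nacio n_country n_immig imm_share_pct if immigrant, noobs
```

Because the share is only defined for immigrants, cells that contain natives only simply have no non-missing value. With real data each nationality appears on several rows of a cell, so listing or keeping one row per cell and nationality would avoid double counting when the shares are summed.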
Paris Rira

Join Date: Dec 2022

Posts: 384
#9

11 Feb 2023, 12:39

Originally posted by William Lisowski

Thank you so much, Prof. William; as always, your solutions are perfect. The code is totally clear and good-looking. However, I need to determine the share within each skill group, as in my edited post #3, and afterward sum them all up.

William Lisowski

Join Date: Dec 2014
Posts: 10150

#10

11 Feb 2023, 13:09

I am afraid that I am not able to understand your description of what you seek.

For this tabulation of made-up data, please walk us through what you want the resulting observation(s) for nacio UK to be.

Code:

Native
--------------------------------
        |     sk_rat_quartile  
        |    1    2    3   Total
--------+-----------------------
nacio   |                      
  PT    |   19   12   13      44
  Total |   19   12   13      44
--------------------------------

Immigrant
--------------------------------
        |     sk_rat_quartile  
        |   1    2    3    Total
--------+-----------------------
nacio   |                      
  AO    |        1             1
  ES    |        1             1
  EU    |             1        1
  GR    |        1             1
  IR    |             1        1
  PR    |   1                  1
  SP    |        1    1        2
  SW    |   1                  1
  UK    |   1         2        3
  US    |             1        1
  Total |   3    4    6       13
--------------------------------


Bruce Weaver

Join Date: May 2014
Posts: 1130

#11

11 Feb 2023, 13:12

Here's another way using -levelsof-. I think it gives the result you want.

Code:

* Make one new variable with immigrant share by nacio
generate byte immigrant = nacio!="PT"
egen NI = total(immigrant)
bysort nacio: generate byte rec1 = _n==1
by nacio: generate ishare = _N/NI if immigrant
list nacio ishare if rec1
* Show that ishare values sum to 1
quietly summarize ishare if rec1
display "Sum of ishare values = " r(sum)

* Make one new variable per country
levelsof nacio if immigrant, local(countries)
foreach c of local countries {
    quietly summarize immigrant if nacio=="`c'", meanonly
    generate ishare_`c' = r(N)/NI
}

egen isharesum = rowtotal(ishare_AO-ishare_US)
list ishare_AO - ishare_US isharesum in 1

Output from the first -list- command and the following -display- command:

Code:

. list nacio ishare if rec1

     +------------------+
     | nacio     ishare |
     |------------------|
  1. |    AO   .0769231 |
  2. |    ES   .0769231 |
  3. |    EU   .0769231 |
  4. |    GR   .0769231 |
  5. |    IR   .0769231 |
     |------------------|
  6. |    PR   .0769231 |
  7. |    PT          . |
 51. |    SP   .1538462 |
 53. |    SW   .0769231 |
 54. |    UK   .2307692 |
     |------------------|
 57. |    US   .0769231 |
     +------------------+

. quietly summarize ishare if rec1

. display "Sum of ishare values = " r(sum)
Sum of ishare values = 1

Output from the final -list- command:

Code:

. list ishare_AO - ishare_US isharesum in 1

     +------------------------------------------------------------------------------------------------------------------------+
     | ishare~O   ishar~ES   ishare~U   ishar~GR   ishar~IR   ishar~PR   ishare~P   ishare~W   ishare~K   ishar~US   ishare~m |
     |------------------------------------------------------------------------------------------------------------------------|
  1. | .0769231   .0769231   .0769231   .0769231   .0769231   .0769231   .1538462   .0769231   .2307692   .0769231          1 |
     +------------------------------------------------------------------------------------------------------------------------+


Paris Rira

Join Date: Dec 2022

Posts: 384
#12

11 Feb 2023, 13:26

Sorry, Prof., your data lack the experience group. Please look at mine.

Code:

nacio  Expgroup  sk_rat_quartile
PR     1         1
UK     1         1
US     1         1
IR     1         1
UK     1         1
UK     1         1
SP     1         1
FR     1         1
UK     1         1

Share of UK people in experience group 1 and sk_rat_quartile 1 = (UK+UK+UK+UK) / (PR+UK+US+IR+UK+UK+SP+FR+UK) = 4/9

There are 8 experience groups and 4 values of sk_rat_quartile.
I want to do the same for a few million observations as well.
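The 4/9 in this example can be reproduced with a short sketch that keeps one row per cell and nationality (a long layout) instead of one variable per country. It uses only the nine rows shown above; the names n_country, n_immig and share are hypothetical, and all nine people are assumed to be immigrants, since none is coded PT.

```stata
clear
input str2 nacio byte Expgroup byte sk_rat_quartile
"PR" 1 1
"UK" 1 1
"US" 1 1
"IR" 1 1
"UK" 1 1
"UK" 1 1
"SP" 1 1
"FR" 1 1
"UK" 1 1
end

* One row per Expgroup x sk_rat_quartile x nacio, with its frequency
contract Expgroup sk_rat_quartile nacio, freq(n_country)

* Total immigrants in the cell, then each country's share of that total
bysort Expgroup sk_rat_quartile: egen long n_immig = total(n_country)
generate double share = n_country / n_immig

list, noobs
```

With data that also contain natives, a `keep if nacio != "PT"` before -contract- would be needed for the denominator to count immigrants only. The result has at most one row per cell and country, so it stays small even when the input has millions of observations.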
William Lisowski

Join Date: Dec 2014

Posts: 10150
#13

11 Feb 2023, 13:37

And what observations and variables do you wish to create? There are 32 combinations of experience groups and quartiles, and at least 10 countries. Surely you don't intend to create 32 new variables for each country? A "wide layout" like that is not helpful for most analysis tasks in Stata.

The problem here is that you have not told us what your ultimate objective is. I'm afraid your questions are based on some part of your idea how to obtain your objective, but more experienced Stata users, if told the objective, would perhaps choose a completely different way to reach it. By answering your questions with no idea of your objective we risk giving you accurate instructions for following a path that in the end will not yield the analysis you want.
Paris Rira

Join Date: Dec 2022

Posts: 384
#14

11 Feb 2023, 13:49

Prof. William, thank you for the explanation.

Well, I am going to build a shift-share instrument for the ratio of immigrants to native-born, an approach pioneered by Altonji and Card (1991). Specifically, predicted immigrant inflows are calculated by multiplying the total number of newly arriving immigrants from source country K at time t (I already have this quantity, so fortunately there is no need to compute it) by the share of immigrants from source country K that was in skill group ij (experience group and skill ratio) in the year 1981. That share (the part marked in red) is my target.
After summing over countries K, the instrument is constructed as the predicted number of immigrants divided by the total number of workers in a given skill group.
That is the whole story.
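Written out, the construction described above might look as follows. This is only a sketch that takes the description at face value; the symbols are labels introduced here, not notation from the thread: $M_{K,ij,1981}$ is the number of immigrants from source country $K$ in skill group $ij$ in 1981, $M_{Kt}$ is the number of newly arriving immigrants from $K$ at time $t$, and $L_{ijt}$ is the total number of workers in skill group $ij$ (the time subscript on $L$ is an assumption).

```latex
s_{K,ij} = \frac{M_{K,ij,1981}}{\sum_{i',j'} M_{K,i'j',1981}},
\qquad
\widehat{M}_{ijt} = \sum_{K} s_{K,ij}\, M_{Kt},
\qquad
Z_{ijt} = \frac{\widehat{M}_{ijt}}{L_{ijt}}
```

Under this reading the denominator of $s_{K,ij}$ is all 1981 immigrants from country $K$ across skill groups, so the shares for a given country sum to one over cells. That is not the same as the within-cell denominator used in the 4/9 example of post #12, so which of the two is intended would need to be confirmed.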
Paris Rira

Join Date: Dec 2022

Posts: 384
#15

11 Feb 2023, 14:42

Originally posted by Bruce Weaver

Here's another way using -levelsof-. I think it gives the result you want.

Thank you so much, Prof. But this code still ignores Expgroup and sk_rat_quartile.
I guess it would work if some code were added to include the experience groups and skill groups. I need to partition the shares by experience group and skill group, because I am going to interpret them in this way: e.g., the share of French people among all foreign people is 20 percent, or the lowest share belongs to the Scandinavian countries, etc.
