  • Share of one variables separately

    Good afternoon, dear Statalisters,

    I am going to compute the share of immigrants from the source country that are in skill groups (Expgroup & sk_rat_quartile).
    To be clear, the variable "nacio" records nationality: "PT" is native, and anything else (nacio != "PT") is an immigrant.
    I need the share of immigrants, not natives, by source country. For instance, I need to know the share of English people, coded as "UK" in the "nacio" variable. The share of immigrants is required separately for each country of origin. Any ideas are much appreciated.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float Expgroup byte sk_rat_quartile str2 nacio
    2 3 "PT"
    5 3 "PT"
    7 3 "UK"
    7 3 "PT"
    7 3 "EU"
    7 3 "PT"
    5 3 "PT"
    4 3 "PT"
    7 3 "PT"
    5 3 "UK"
    2 3 "PT"
    6 3 "PT"
    5 3 "PT"
    4 3 "PT"
    7 3 "PT"
    4 3 "PT"
    8 3 "PT"
    7 3 "PT"
    5 3 "SW"
    5 3 "PT"
    4 3 "PT"
    8 3 "PT"
    6 3 "SP"
    4 3 "PT"
    4 3 "PT"
    5 3 "PT"
    7 3 "PT"
    7 3 "PT"
    2 3 "PT"
    5 3 "PT"
    4 3 "PR"
    3 3 "PT"
    7 3 "PT"
    4 3 "GR"
    8 3 "PT"
    4 3 "PT"
    6 3 "PT"
    2 3 "PT"
    7 3 "PT"
    6 3 "US"
    7 3 "PT"
    8 3 "PT"
    3 3 "IR"
    7 3 "PT"
    6 3 "PT"
    7 3 "SP"
    7 3 "PT"
    5 3 "AO"
    2 3 "PT"
    6 3 "PT"
    5 3 "PT"
    6 3 "ES"
    5 3 "PT"
    2 3 "PT"
    8 3 "PT"
    7 3 "UK"
    7 3 "PT"
    
    end
    Cheers,

    Paris

  • #2
    Hello Paris Rira. I'm not certain I understood your question, but does this give what you want?

    Code:
    . *ssc install fre // Uncomment line to install -fre- if necessary
    . generate byte native = nacio=="PT"
    
    . fre nacio if !native
    
    nacio
    -----------------------------------------------------------
                  |      Freq.    Percent      Valid       Cum.
    --------------+--------------------------------------------
    Valid   AO    |          1       7.69       7.69       7.69
            ES    |          1       7.69       7.69      15.38
            EU    |          1       7.69       7.69      23.08
            GR    |          1       7.69       7.69      30.77
            IR    |          1       7.69       7.69      38.46
            PR    |          1       7.69       7.69      46.15
            SP    |          2      15.38      15.38      61.54
            SW    |          1       7.69       7.69      69.23
            UK    |          3      23.08      23.08      92.31
            US    |          1       7.69       7.69     100.00
            Total |         13     100.00     100.00           
    -----------------------------------------------------------
    --
    Bruce Weaver
    Version: Stata/MP 18.5 (Windows)

    • #3
      Hi Bruce,
      Thank you for getting back to me.
      What I seek is sth like below:

      gen immigrant_share_UK (for skill group I and Expgroup J) = (English_people)_IJ / (Total immigrants of all nationalities)_IJ

      So how can I translate this into Stata code?
      Last edited by Paris Rira; 11 Feb 2023, 12:35.

      • #4
        Code:
        by Expgroup sk_rat_quartile nacio, sort: gen numerator = _N
        by Expgroup sk_rat_quartile (nacio): gen denominator = _N
        gen share_percent = 100*numerator/denominator

        • #5
          Prof Clyde,

          I need to obtain the share of each foreign nationality one by one, because afterward I will sum them all up. I guess your code produces an overall share; it does not address the share of each country/nationality.

          • #6
            Yes it does. Look at, for example, the results for Expgroup 4 sk_rat_quartile 3: it is 11.11% GR, 11.11% PR, and 77.78% PT. There you have it, country by country.
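
            To check these figures, here is a minimal sketch. It enters only the nine observations from post #1 that have Expgroup 4, reruns the code from post #4 unchanged, and lists one row per country. The variable tag is a helper introduced here for illustration.

            ```stata
            * The nine observations from post #1 with Expgroup == 4
            clear
            input float Expgroup byte sk_rat_quartile str2 nacio
            4 3 "PT"
            4 3 "PT"
            4 3 "PT"
            4 3 "PT"
            4 3 "PT"
            4 3 "PT"
            4 3 "PR"
            4 3 "GR"
            4 3 "PT"
            end

            * Code from post #4, unchanged
            by Expgroup sk_rat_quartile nacio, sort: gen numerator = _N
            by Expgroup sk_rat_quartile (nacio): gen denominator = _N
            gen share_percent = 100*numerator/denominator

            * tag (hypothetical helper) marks one observation per country within the cell
            egen byte tag = tag(Expgroup sk_rat_quartile nacio)
            list Expgroup sk_rat_quartile nacio share_percent if tag, noobs
            ```

            With 7 PT, 1 PR, and 1 GR among the 9 observations, the shares are 7/9, 1/9, and 1/9, which match the percentages quoted above.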

            • #7
              Code:
              collapse (count) immigrant_share_=Expgroup if nacio!="PT", by(nacio)
              summarize immigrant_share_, meanonly
              replace immigrant_share_ = immigrant_share_/r(sum)
              generate seq = 1
              reshape wide immigrant_share_, i(seq) j(nacio) string
              drop seq
              list, noobs abbreviate(20)
              Code:
              . list, noobs abbreviate(20)
              
                +-----------------------------------------------------------------------------------+
                | immigrant_share_AO | immigrant_share_ES | immigrant_share_EU | immigrant_share_GR |
                |           .0769231 |           .0769231 |           .0769231 |           .0769231 |
                |--------------------+--------------------+--------------------+--------------------|
                | immigrant_share_IR | immigrant_share_PR | immigrant_share_SP | immigrant_share_SW |
                |           .0769231 |           .0769231 |           .1538462 |           .0769231 |
                |-----------------------------------------+-----------------------------------------|
                |           immigrant_share_UK            |           immigrant_share_US            |
                |                     .2307692            |                     .0769231            |
                +-----------------------------------------------------------------------------------+
              Added in edit: this crossed with posts #4-6, from which I realize that post #1 stated the problem as
              I am going to compute the share of immigrants from the source country that are in skill groups (Expgroup & sk_rat_quartile).
              while post #3 stated the problem as
              What I seek is sth like below:

              gen immigrant_share_UK= English_people/ Total immigrants (immigrants of all nationalities)
              My code addressed post #3, which makes no reference to skill groups; Clyde's code addressed post #1.
              Last edited by William Lisowski; 11 Feb 2023, 12:27.

              • #8
                Originally posted by Clyde Schechter
                Yes it does. Look at, for example, the results for Expgroup 4 sk_rat_quartile 3: it is 11.11% GR, 11.11% PR, and 77.78% PT
                How can I exclude PT from the group? I only need the non-PT shares.
                Moreover, when I collapse by (Expgroup sk_rat_quartile),
                Code:
                collapse (sum) share_percent, by(Expgroup sk_rat_quartile)
                it gives the sum, but I need a total number according to (Expgroup & sk_rat_quartile).
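
                One possible way to leave PT out, sketched under assumptions: the rows are a small excerpt of post #1 (the immigrants in Expgroup 4 and 7 plus two natives from each), and immigrant and tag are helper variables introduced here. Natives stay in the data but count as zero, so both the numerator and the denominator cover immigrants only.

                ```stata
                * Excerpt of post #1: immigrants in Expgroup 4 and 7, plus two natives each
                clear
                input float Expgroup byte sk_rat_quartile str2 nacio
                4 3 "PR"
                4 3 "GR"
                4 3 "PT"
                4 3 "PT"
                7 3 "UK"
                7 3 "EU"
                7 3 "SP"
                7 3 "UK"
                7 3 "PT"
                7 3 "PT"
                end

                * immigrant (hypothetical helper): 1 for any nationality other than PT
                generate byte immigrant = nacio != "PT"

                * Immigrants of this nationality in the cell
                bysort Expgroup sk_rat_quartile nacio: egen numerator = total(immigrant)
                * Immigrants of all nationalities in the cell
                bysort Expgroup sk_rat_quartile: egen denominator = total(immigrant)

                * Share among immigrants only; left missing for natives
                generate share_percent = 100*numerator/denominator if immigrant

                * One row per immigrant nationality within each cell
                egen byte tag = tag(Expgroup sk_rat_quartile nacio)
                list Expgroup sk_rat_quartile nacio share_percent if tag & immigrant, noobs
                ```

                Because share_percent repeats on every observation, a collapse (sum) over the raw rows counts a country once per person. Restricting to the tagged rows makes the shares in each cell add up to 100. The variable denominator holds the number of immigrants in each cell, if that is the total meant.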

                • #9
                  Originally posted by William Lisowski
                  while post #3 stated the problem as

                  Thank you so much, Prof William; as always, your solutions are perfect. The code is totally clear and good-looking. However, I need to determine the share in each skill group, as in my edit to post #3, and afterward sum them all up.

                  • #10
                    I am afraid that I am not able to understand your description of what you seek.

                    For this tabulation of made-up data, please walk us through what you want the resulting observation(s) for nacio UK to be.

                    Code:
                    Native
                    --------------------------------
                            |     sk_rat_quartile  
                            |    1    2    3   Total
                    --------+-----------------------
                    nacio   |                      
                      PT    |   19   12   13      44
                      Total |   19   12   13      44
                    --------------------------------
                    
                    Immigrant
                    --------------------------------
                            |     sk_rat_quartile  
                            |   1    2    3    Total
                    --------+-----------------------
                    nacio   |                      
                      AO    |        1             1
                      ES    |        1             1
                      EU    |             1        1
                      GR    |        1             1
                      IR    |             1        1
                      PR    |   1                  1
                      SP    |        1    1        2
                      SW    |   1                  1
                      UK    |   1         2        3
                      US    |             1        1
                      Total |   3    4    6       13
                    --------------------------------

                    • #11
                      Here's another way using -levelsof-. I think it gives the result you want.

                      Code:
                      * Make one new variable with immigrant share by nacio
                      generate byte immigrant = nacio!="PT"
                      egen NI = total(immigrant)
                      bysort nacio: generate byte rec1 = _n==1
                      by nacio: generate ishare = _N/NI if immigrant
                      list nacio ishare if rec1
                      * Show that ishare values sum to 1
                      quietly summarize ishare if rec1
                      display "Sum of ishare values = " r(sum)
                      
                      * Make one new variable per country
                      levelsof nacio if immigrant, local(countries)
                      foreach c of local countries {
                          quietly summarize immigrant if nacio=="`c'", meanonly
                          generate ishare_`c' = r(N)/NI
                      }
                      
                      egen isharesum = rowtotal(ishare_AO-ishare_US)
                      list ishare_AO - ishare_US isharesum in 1

                      Output from the first -list- command and the following -display- command:

                      Code:
                      . list nacio ishare if rec1
                      
                           +------------------+
                           | nacio     ishare |
                           |------------------|
                        1. |    AO   .0769231 |
                        2. |    ES   .0769231 |
                        3. |    EU   .0769231 |
                        4. |    GR   .0769231 |
                        5. |    IR   .0769231 |
                           |------------------|
                        6. |    PR   .0769231 |
                        7. |    PT          . |
                       51. |    SP   .1538462 |
                       53. |    SW   .0769231 |
                       54. |    UK   .2307692 |
                           |------------------|
                       57. |    US   .0769231 |
                           +------------------+
                      
                      . quietly summarize ishare if rec1
                      
                      . display "Sum of ishare values = " r(sum)
                      Sum of ishare values = 1

                      Output from the final -list- command:

                      Code:
                      . list ishare_AO - ishare_US isharesum in 1
                      
                           +------------------------------------------------------------------------------------------------------------------------+
                           | ishare~O   ishar~ES   ishare~U   ishar~GR   ishar~IR   ishar~PR   ishare~P   ishare~W   ishare~K   ishar~US   ishare~m |
                           |------------------------------------------------------------------------------------------------------------------------|
                        1. | .0769231   .0769231   .0769231   .0769231   .0769231   .0769231   .1538462   .0769231   .2307692   .0769231          1 |
                           +------------------------------------------------------------------------------------------------------------------------+
                      --
                      Bruce Weaver
                      Version: Stata/MP 18.5 (Windows)

                      • #12
                        Sorry, Prof, the data lack "Experience group". Please look at mine.

                        Code:
                        nacio    Expgroup   sk_rat_quartile
                        PR                1             1
                        UK                1             1
                        US                1             1
                        IR                1             1
                        UK                1             1
                        UK                1             1
                        SP                1             1
                        FR                1             1
                        UK                1             1
                        Share of UK people in Experience group 1 and sk_rat_quartile 1 = 4/9
                        That is, (UK+UK+UK+UK) / (PR+UK+US+IR+UK+UK+SP+FR+UK).

                        There are 8 Experience groups and 4 sk_rat_quartile values.
                        I wish to do the same for several million observations as well.
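
                        The arithmetic above can be checked directly on the nine rows listed in this post. This is only a sketch; the local macro names n_uk and n_all are introduced here for illustration.

                        ```stata
                        * The nine rows listed in this post
                        clear
                        input str2 nacio byte Expgroup byte sk_rat_quartile
                        "PR" 1 1
                        "UK" 1 1
                        "US" 1 1
                        "IR" 1 1
                        "UK" 1 1
                        "UK" 1 1
                        "SP" 1 1
                        "FR" 1 1
                        "UK" 1 1
                        end

                        * Numerator: UK observations in the cell
                        count if nacio == "UK" & Expgroup == 1 & sk_rat_quartile == 1
                        local n_uk = r(N)

                        * Denominator: immigrants of all nationalities in the same cell
                        count if nacio != "PT" & Expgroup == 1 & sk_rat_quartile == 1
                        local n_all = r(N)

                        display "UK share = " %6.4f `n_uk'/`n_all'
                        ```

                        The counts are 4 and 9, so the share is 4/9, roughly 0.44. For all 32 cells and millions of observations, the same ratio can be formed with by-groups in a single pass, as in post #4, rather than with one count per cell.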

                        • #13
                          And what observations and variables do you wish to create? There are 32 combinations of experience groups and quartiles, and at least 10 countries. Surely you don't intend to create 32 new variables for each country? A "wide layout" like that is not helpful for most analysis tasks in Stata.

                          The problem here is that you have not told us what your ultimate objective is. I'm afraid your questions are based on one part of your own idea of how to reach that objective, but more experienced Stata users, if told the objective, would perhaps choose a completely different way to reach it. By answering your questions with no idea of your objective, we risk giving you accurate instructions for following a path that in the end will not yield the analysis you want.


                          • #14
                            Prof William, thank you for the explanation.

                            Well, I am going to build a shift-share instrument for the ratio of immigrants to native-born, following the approach pioneered by Altonji and Card (1991). Specifically, predicted immigrant inflows are calculated by multiplying the total number of newly arriving immigrants from a source country (let's call it K) at time t (fortunately I have access to this quantity, so there is no need to compute it) by the share of immigrants from source country K that was in skill group ij (Experience group and skill_ratio) in the year 1981. The target is the sentence highlighted in red.
                            After summing over countries K, the instrument is constructed as the predicted number of immigrants divided by the total number of workers in a given skill group.
                            That is the whole story.
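
                            Written out, the construction described in this post could look as follows. The symbols are assumptions introduced here, not notation from the thread: M_{K,ij,1981} is the number of immigrants from source country K in skill group ij in 1981, M_{Kt} is the number of newly arriving immigrants from K at time t, and L_{ij} is the total number of workers in skill group ij (the post does not say which period this total refers to).

                            ```latex
                            \[
                            s_{K,ij} = \frac{M_{K,ij,1981}}{\sum_{i',j'} M_{K,i'j',1981}}, \qquad
                            \widehat{M}_{ijt} = \sum_{K} s_{K,ij}\, M_{Kt}, \qquad
                            Z_{ijt} = \frac{\widehat{M}_{ijt}}{L_{ij}}
                            \]
                            ```

                            Read literally, the denominator of s_{K,ij} is the total for country K over all skill groups, which differs from the within-cell denominator used in post #12. Which of the two is intended would need to be confirmed.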

                            • #15
                              Originally posted by Bruce Weaver
                              Here's another way using -levelsof-. I think it gives the result you want.
                              Thank you so much, Prof. But this code still ignores Expgroup and sk_rat_quartile.
                              I guess it would be done if some code were added to include the experience groups and skill groups. I need to partition the shares according to experience groups and skill groups, because I am going to interpret them in this way: e.g., the share of French people among all foreign people is 20 percent, or the lowest share belongs to the Scandinavian countries, etc.
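
                              One possible way to bring Expgroup and sk_rat_quartile in is a long layout with one row per skill cell and country, sketched here with -contract-. The rows are the 13 immigrant observations from post #1; n_country, n_cell, and share are names introduced for illustration.

                              ```stata
                              * The 13 immigrant observations from post #1
                              clear
                              input float Expgroup byte sk_rat_quartile str2 nacio
                              7 3 "UK"
                              7 3 "EU"
                              5 3 "UK"
                              5 3 "SW"
                              6 3 "SP"
                              4 3 "PR"
                              4 3 "GR"
                              6 3 "US"
                              3 3 "IR"
                              7 3 "SP"
                              5 3 "AO"
                              6 3 "ES"
                              7 3 "UK"
                              end

                              * One row per (cell, country); n_country holds the head count.
                              * The if qualifier drops natives when the full data are used.
                              contract Expgroup sk_rat_quartile nacio if nacio != "PT", freq(n_country)

                              * Immigrants of all nationalities in the cell
                              bysort Expgroup sk_rat_quartile: egen n_cell = total(n_country)

                              * Share of each country among the immigrants of its cell
                              generate share = n_country/n_cell
                              list, sepby(Expgroup sk_rat_quartile) noobs
                              ```

                              With one row per country and cell, statements such as "the share of French people among all foreign people" can be read off cell by cell, and the result could later be merged with country-level inflows by nacio. Note that -contract- replaces the data in memory, so it is best run on a copy or after -preserve-.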
