
  • #16
    Originally posted by William Lisowski
    A "wide layout" like that is not helpful for most analysis tasks in Stata.
    That's true; that's why I will later collapse by (Experiengroup sk_rat).



    • #17
      With the data in post #1
      Code:
      // restrict the calculations to immigrants
      drop if nacio=="PT"
      // Clyde's code from post #4
      by Expgroup sk_rat_quartile nacio , sort: gen numerator = _N
      by Expgroup sk_rat_quartile (nacio): gen denominator = _N
      gen share_percent = 100*numerator/denominator
      // collapse
      collapse (first) share=numerator share_percent, by(Expgroup sk_rat_quartile nacio)
      format %9.1f share_percent
      list, sepby(Expgroup sk_rat_quartile) abbreviate(20) noobs
      gives
      Code:
        +------------------------------------------------------------+
        | Expgroup   sk_rat_quartile   nacio   share   share_percent |
        |------------------------------------------------------------|
        |        3                 3      IR       1           100.0 |
        |------------------------------------------------------------|
        |        4                 3      GR       1            50.0 |
        |        4                 3      PR       1            50.0 |
        |------------------------------------------------------------|
        |        5                 3      AO       1            33.3 |
        |        5                 3      SW       1            33.3 |
        |        5                 3      UK       1            33.3 |
        |------------------------------------------------------------|
        |        6                 3      ES       1            33.3 |
        |        6                 3      SP       1            33.3 |
        |        6                 3      US       1            33.3 |
        |------------------------------------------------------------|
        |        7                 3      EU       1            25.0 |
        |        7                 3      SP       1            25.0 |
        |        7                 3      UK       2            50.0 |
        +------------------------------------------------------------+
      Is this what you have in mind?



      • #18
        I really appreciate it.
        There are still two problems.
        First, I don't understand the collapse part: why use "first"?
        Second, what is your definition of "summing up over countries"? I need to sum up over countries at this stage.



        • #19
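          First, I don't understand the collapse part: why use "first"?
          To keep just one observation from each group of observations with the same Expgroup, sk_rat_quartile, and nacio. Within such a group, numerator and share_percent are identical on every observation, so the first one carries all the information.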

          Consider what the data for Expgroup 7 look like before and after the collapse.
          Code:
          . // before collapse
          . list if Expgroup==7, abbreviate(20) noobs
          
            +------------------------------------------------------------------------------+
            | Expgroup   sk_rat_quartile   nacio   numerator   denominator   share_percent |
            |------------------------------------------------------------------------------|
            |        7                 3      EU           1             4            25.0 |
            |        7                 3      SP           1             4            25.0 |
            |        7                 3      UK           2             4            50.0 |
            |        7                 3      UK           2             4            50.0 |
            +------------------------------------------------------------------------------+
          
          . collapse (first) share=numerator share_percent, by(Expgroup sk_rat_quartile nacio)
          
          . // after collapse
          . list if Expgroup==7, abbreviate(20) noobs
          
            +------------------------------------------------------------+
            | Expgroup   sk_rat_quartile   nacio   share   share_percent |
            |------------------------------------------------------------|
            |        7                 3      EU       1            25.0 |
            |        7                 3      SP       1            25.0 |
            |        7                 3      UK       2            50.0 |
            +------------------------------------------------------------+
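A minimal sketch of why (first) is safe here, assuming only the four Expgroup 7 observations listed above: within each combination of Expgroup, sk_rat_quartile, and nacio the computed values do not vary, so taking the first observation loses nothing. The by-group assert lines are an added check, not part of the code from post #17.

```stata
clear
input byte Expgroup byte sk_rat_quartile str2 nacio
7 3 "EU"
7 3 "SP"
7 3 "UK"
7 3 "UK"
end
// same calculations as in post #17
by Expgroup sk_rat_quartile nacio, sort: gen numerator = _N
by Expgroup sk_rat_quartile (nacio): gen denominator = _N
gen share_percent = 100*numerator/denominator
// added check: both variables are constant within each group
by Expgroup sk_rat_quartile nacio: assert numerator == numerator[1]
by Expgroup sk_rat_quartile nacio: assert share_percent == share_percent[1]
// so keeping the first observation of each group loses no information
collapse (first) share=numerator share_percent, by(Expgroup sk_rat_quartile nacio)
list, noobs abbreviate(20)
```

If either assert failed, (first) would be silently discarding differing values and another statistic would be needed.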
          What is your definition of "summing up over countries"?
          You wrote

          Specifically, predicted immigrant inflows are going to be calculated by multiplying the total number of newly arriving immigrants from the source country, call it K, at time t (I have access to this quantity, so fortunately there is no need to compute it) by the share of immigrants from source country K that was in skill group ij (Experience group and skill_ratio) in the year 1981.
          ...
          After summing up over countries K, the instrument is constructed as the predicted number of immigrants divided by the total number of workers in a given skill group.
          So with that in mind, what we have calculated is not what you need. You do not want, for each combination of Expgroup and sk_rat_quartile, the percent from each country; you want, for each country, the percent in each combination of Expgroup and sk_rat_quartile.
          Code:
          // restrict the calculations to immigrants
          drop if nacio=="PT"
          // Clyde's code modified
          by nacio Expgroup sk_rat_quartile, sort: gen numerator = _N
          by nacio (Expgroup sk_rat_quartile): gen denominator = _N
          gen share_percent = 100*numerator/denominator
          order nacio Expgroup sk_rat_quartile
          format %9.1f share_percent
          // before collapse
          list if nacio=="UK", abbreviate(20) noobs
          collapse (first) share=numerator share_percent, by(nacio Expgroup sk_rat_quartile)
          // after collapse
          list if nacio=="UK", abbreviate(20) noobs
          Code:
          . // before collapse
          . list if nacio=="UK", abbreviate(20) noobs
          
            +------------------------------------------------------------------------------+
            | nacio   Expgroup   sk_rat_quartile   numerator   denominator   share_percent |
            |------------------------------------------------------------------------------|
            |    UK          5                 3           1             3            33.3 |
            |    UK          7                 3           2             3            66.7 |
            |    UK          7                 3           2             3            66.7 |
            +------------------------------------------------------------------------------+
          
          . collapse (first) share=numerator share_percent, by(nacio Expgroup sk_rat_quartile)
          
          . // after collapse
          . list if nacio=="UK", abbreviate(20) noobs
          
            +------------------------------------------------------------+
            | nacio   Expgroup   sk_rat_quartile   share   share_percent |
            |------------------------------------------------------------|
            |    UK          5                 3       1            33.3 |
            |    UK          7                 3       2            66.7 |
            +------------------------------------------------------------+
          So if there were 300 immigrants from nacio UK, 100 of them would be assigned to Expgroup 5 sk_rat_quartile 3, and 200 of them would be assigned to Expgroup 7 sk_rat_quartile 3.
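The allocation of 300 arrivals can be sketched as follows, using only the three UK observations. The variables inflow and predicted are hypothetical names introduced for illustration, and 300 is the illustrative total from the sentence above, not a value from the real data.

```stata
clear
input str2 nacio byte Expgroup byte sk_rat_quartile
"UK" 5 3
"UK" 7 3
"UK" 7 3
end
// shares within country, as in the modified code
by nacio Expgroup sk_rat_quartile, sort: gen numerator = _N
by nacio (Expgroup sk_rat_quartile): gen denominator = _N
gen share_percent = 100*numerator/denominator
collapse (first) share=numerator share_percent, by(nacio Expgroup sk_rat_quartile)
// hypothetical total of newly arriving UK immigrants
generate inflow = 300
// predicted inflow into each skill group = total inflow times share
generate double predicted = inflow*share_percent/100
format %9.1f share_percent predicted
list, noobs abbreviate(20)
```

With more than one country in the data, summing over countries would then presumably be collapse (sum) predicted, by(Expgroup sk_rat_quartile), with the division by the number of workers in each skill group still to follow.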

          Added in edit: We would have reached this point much sooner if you had presented, not just a verbal description of what you wanted, but a worked out example using your three observations of UK data. I know you said "I am going to compute the share of immigrants from the source country that are in skill groups (Expgroup & sk_rat_quartile)." but it was not understood to mean "For this data on immigrants giving their source country (nacio) and skill groups (Expgroup & sk_rat_quartile), I want to compute for each country what percentage is in each combination of skill groups (Expgroup & sk_rat_quartile)."

          By providing and explaining a worked example you give the reader something to check their understanding against.
          Last edited by William Lisowski; 11 Feb 2023, 20:23.



          • #20
            Prof William,

            Many thanks to you.

            I am impressed by your unlimited patience. It worked. That's what I really need.
            Your explanation is very clear, and I will be careful to provide precise examples and address the core problem. It is probably because of my limited English that my verbal explanation misled you. Sorry about that.

            Gratefully yours,
            Paris



            • #21
              Prof, I tried to run the code from post #19 on this dataset, and it gives only 1 for every observation.
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str7 Regions str24 country int immigrants
                       "Centro"         "Spain"          672
                       " Lisboa"         "Spain"         4921
                       "Centro"         "France"         3814
                       " Lisboa"        "France"         1858
              end
              What I want to do is:

              Share of Spanish immigrants in metropolitan = (672+4921) / (672+3814)

              Share of French immigrants in metropolitan = (3814+1858) / (3814+672)


              I need to split the regions and then combine them, something like this:
              Code:
              Region                          indexregion             Country      immigrants
              Centro     Lisboa                Centro,Lisboa             Spain          672+4921
              Centro     Lisboa                Centro,Lisboa             France         3814+1858
              That way I could compute the share of immigrants once.
              Could you please assist me?

              Thank you so much.
              Last edited by Paris Rira; 13 Feb 2023, 12:57.



              • #22
                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input str7 Regions str24 country int immigrants
                         "Centro"         "Spain"          672
                         " Lisboa"         "Spain"         4921
                         "Centro"         "France"         3814
                         " Lisboa"        "France"         1858
                end
                // get rid of all leading and trailing blanks
                replace Regions = trim(Regions)
                generate indexregion = Regions
                replace indexregion = "Centro,Lisboa" if inlist(Regions,"Centro","Lisboa")
                collapse (first) indexregion (sum) immigrants, by(country)
                list, clean noobs abbreviate(12)
                Code:
                . list, clean noobs abbreviate(12)
                
                    country     indexregion   immigrants  
                     France   Centro,Lisboa         5672  
                      Spain   Centro,Lisboa         5593  
                
                .



                • #23
                  It worked perfectly. It is the best as usual. Thanks a lot, Prof William.

                  Code:
                  replace Regions = trim(Regions)
                  generate indexregion = Regions
                  replace indexregion = "Centro,Lisboa" if inlist(Regions,"Centro","Lisboa")
                  collapse (first) indexregion (sum) imm, by(country)
                  list, clean noobs abbreviate(12)
                  
                  egen n_imm= total(imm)
                  g share_imm= imm/n_imm
                  
                  . list
                  
                  
                       +---------------------------------------------------------------------+
                       |                  country     indexregion     imm   n_imm   share_~m |
                       |---------------------------------------------------------------------|
                    1. |                   Angola   Centro,Lisboa   11131   58921    .188914 |
                    2. |                   Brazil   Centro,Lisboa    5009   58921   .0850121 |
                    3. |               Cape Verde   Centro,Lisboa   17312   58921   .2938171 |
                    4. |                    China   Centro,Lisboa     125   58921   .0021215 |
                    5. |                   France   Centro,Lisboa    5672   58921   .0962645 |
                       |---------------------------------------------------------------------|
                    6. |            Guinea Bissau   Centro,Lisboa     918   58921   .0155802 |
                    7. |                    India   Centro,Lisboa     189   58921   .0032077 |
                    8. |                    Italy   Centro,Lisboa     525   58921   .0089102 |
                    9. |               Mozambique   Centro,Lisboa    2658   58921   .0451113 |
                   10. |  Other African countries   Centro,Lisboa    1300   58921   .0220634 |
                       |---------------------------------------------------------------------|
                   11. | Other American countries   Centro,Lisboa     564   58921   .0095721 |
                   12. |    Other Asian countries   Centro,Lisboa     863   58921   .0146467 |
                   13. | Other European countries   Centro,Lisboa    4171   58921   .0707897 |
                   14. |    Sao Tome and Principe   Centro,Lisboa    1406   58921   .0238625 |
                   15. |                    Spain   Centro,Lisboa    5593   58921   .0949237 |
                       |---------------------------------------------------------------------|
                   16. |           United Kingdom   Centro,Lisboa    1485   58921   .0252032 |
                       +---------------------------------------------------------------------+
                  Gratefully yours,
                  Paris



                  • #24
                    The code in posts #22 and #23 will not work correctly if the input data contains Regions other than Centro and Lisboa. This code corrects the problem.
                    Code:
                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input str7 Regions str24 country int immigrants
                             "Centro"         "Spain"          672
                             " Lisboa"         "Spain"         4921
                             "Centro"         "France"         3814
                             " Lisboa"        "France"         1858
                             "Oporto"         "Spain"          300
                             "Oporto"         "France"         100
                    end
                    // get rid of all leading and trailing blanks
                    replace Regions = trim(Regions)
                    generate indexregion = Regions
                    replace indexregion = "Centro,Lisboa" if inlist(Regions,"Centro","Lisboa")
                    collapse (sum) immigrants, by(indexregion country)
                    order indexregion country
                    list, clean noobs abbreviate(12)
                    
                    bysort indexregion (country): egen n_imm= total(immigrants)
                    generate share_imm= immigrants/n_imm
                    list, clean noobs abbreviate(12)
                    Code:
                    . list, clean noobs abbreviate(12)
                    
                          indexregion   country   immigrants  
                        Centro,Lisboa    France         5672  
                        Centro,Lisboa     Spain         5593  
                               Oporto    France          100  
                               Oporto     Spain          300
                    Code:
                    . list, clean noobs abbreviate(12)
                    
                          indexregion   country   immigrants   n_imm   share_imm  
                        Centro,Lisboa    France         5672   11265    .5035064  
                        Centro,Lisboa     Spain         5593   11265    .4964936  
                               Oporto    France          100     400         .25  
                               Oporto     Spain          300     400         .75
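For contrast, here is a sketch of what the collapse from post #22 does with the same six observations, with the region names entered already trimmed. Because it groups by country alone, the Oporto counts are pooled into the country totals; which indexregion label (first) reports depends on the observation order within each country, so only the totals are worth checking.

```stata
clear
input str7 Regions str24 country int immigrants
"Centro" "Spain"   672
"Lisboa" "Spain"  4921
"Centro" "France" 3814
"Lisboa" "France" 1858
"Oporto" "Spain"   300
"Oporto" "France"  100
end
generate indexregion = Regions
replace indexregion = "Centro,Lisboa" if inlist(Regions,"Centro","Lisboa")
// grouping by country only, as in post #22: Oporto is pooled with Centro and Lisboa
collapse (first) indexregion (sum) immigrants, by(country)
list, clean noobs abbreviate(12)
```

Spain comes out as 672 + 4921 + 300 = 5893 and France as 3814 + 1858 + 100 = 5772, rather than the 5593 and 5672 that belong to Centro,Lisboa alone.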



                    • #25
                      Actually, the Regions variable contains only Centro and Lisboa. But thank you for pointing that out.

