  • Computing share of one variable according to another variable

    Hey Hey,

    I have two variables:
    Variable 1: "nacional": nacional==1 for natives & nacional==0 for non-natives.
    Variable 2: "work_col": work_col==1 for white-collar workers & work_col==2 for blue-collar workers.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(nacioanal work_col)
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    0 2
    1 2
    1 2
    0 2
    0 2
    1 2
    1 2
    1 2
    1 2
    1 2
    0 2
    1 2
    0 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    0 2
    1 2
    1 2
    1 2
    1 1
    1 2
    1 2
    1 2
    1 2
    1 2
    0 2
    0 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    0 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    1 2
    end



    I need to compute the share of blue-collar foreign workers.
    I would appreciate any ideas.

    Paris

  • #2
    Almost everyone is blue-collar in your example (only one observation has work_col == 1), but the proportion of blue-collar workers is just the mean of a (0, 1) indicator that you can compute on the fly.


    Code:
    egen wanted = mean(work_col == 2), by(nacional)
    If you have any missing values as well, the solution is just a twist on that:


    Code:
    egen wanted = mean(cond(inlist(work_col, 1, 2), work_col == 2, .)), by(nacional)

    Incidentally, this underlines that (1, 2) indicators are more awkward to deal with than you want. Always use (0, 1) indicators if you can.
    https://www.stata-journal.com/articl...article=dm0099 is just one repetition of a long-told story.


    Code:
    egen wanted = mean(work_col - 1), by(nacional)
    should work too, either way.
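
A minimal sketch of how the three versions above behave, on a made-up six-row dataset (the values and the names share_a, share_b, share_c are hypothetical). The point is the missing value: work_col == 2 evaluates to 0 when work_col is missing, so the first version counts that row as not blue-collar, while the other two leave it out of the mean.

```stata
* Toy data, invented for illustration: one missing work_col among non-natives
clear
input byte(nacional work_col)
1 1
1 2
1 2
0 2
0 1
0 .
end

* Version 1: a missing work_col counts as "not blue-collar"
egen share_a = mean(work_col == 2), by(nacional)

* Version 2: a missing work_col is excluded from the mean
egen share_b = mean(cond(inlist(work_col, 1, 2), work_col == 2, .)), by(nacional)

* Version 3: work_col - 1 is missing whenever work_col is missing
egen share_c = mean(work_col - 1), by(nacional)

list, sepby(nacional)
```

For natives all three give 2/3. For non-natives the first gives 1/3 and the other two give 1/2.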

    Last edited by Nick Cox; 06 Jan 2023, 09:10.



    • #3
      Dear Nick,

      Thank you so much.
      Both work.

      Dataex displays only 100 observations, while there are around 200,000 observations, so it is not easy to make assumptions from the example. Thank you again.



      • #4
        I didn't need to use your data example because looking at it was enough, so thanks. But the problem, if there were a problem, is easy to solve: run dataex twice and edit the results together.


        Code:
        dataex nacional work_col if work_col == 1, count(50) 
        
        dataex nacional work_col if work_col == 2,  count(50)



        • #5
          Ah, I realized that this code shows what percentage of natives and of non-natives are in production and non-production occupations, while I seek the share of foreigners among all blue-collar workers and among all white-collar workers, natives included. For example, imagine there are 100 blue-collar workers: 80 are home-born workers and the rest are non-natives. The share of foreigners is 20%. Now, how can I do that for my dataset?



          • #6
            Just swap the variables around.


            Code:
            egen wanted = mean(nacional), by(work_col)
            tab work_col, summarize(nacional)
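
As a sketch of the same idea with the numbers from #5 (100 blue-collar workers, 80 of them natives), taking the mean of an indicator for non-natives gives the share of foreigners directly rather than the share of natives. The toy data and the name immig_share are made up for illustration.

```stata
* Toy data matching the example in #5: 100 blue-collar workers, 80 natives
clear
set obs 100
generate byte work_col = 2
generate byte nacional = _n <= 80

* Share of non-natives within each collar group.
* Caution: nacional == 0 is 0 when nacional is missing; this toy data has none.
egen immig_share = mean(nacional == 0), by(work_col)

tabulate work_col, summarize(immig_share)
```

Here immig_share is 0.2 for every observation, the 20% from the example.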



            • #7
              Code:
              . tab work_col, summarize(nacioanal)
              
                          |        Summary of nacioanal
                 work_col |        Mean   Std. dev.       Freq.
              ------------+------------------------------------
                        1 |   .97037037   .17019483         135
                        2 |   .91534392    .2791086         189
              ------------+------------------------------------
                    Total |    .9382716   .24103384         324

              So, I can say the share of immigrants in non-production occupations is 0.03 (1 - 0.97), while in production occupations it is 0.09.



              • #8
                I don't follow your results. You say you have 200,000 observations, so most of them have disappeared. Otherwise, yes, except that 1 - 0.91534392 rounds to 0.08, not 0.09.
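
The rounding can be checked directly with the two means from the table in #7. The scalar names are only for illustration.

```stata
* Means of the native indicator, copied from the table in #7
scalar share_white = 1 - .97037037
scalar share_blue  = 1 - .91534392

* Show the immigrant shares to two decimal places
display %4.2f share_white
display %4.2f share_blue
```

The second line displays 0.08, since 1 - 0.91534392 = 0.08465608.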



                • #9
                  The result is just for one year (324 obs). The period is 14 years.


                  At this stage, I need to determine the share of immigrants in the Top 10 and Bottom 10 occupations. I used

                  Code:
                  extremes  sk_ratio CCPCodes, n(10)
                  to pick out the top and bottom observations. How shall I compute the share of immigrants in the Top 10 and Bottom 10? I know some ugly ways, e.g. transferring observations to Excel, though I seek something beautiful.
                  Last edited by Paris Rira; 06 Jan 2023, 11:23.



                  • #10
                    https://journals.sagepub.com/doi/pdf...6867X221106436 gives guidance on your main question.



                    • #11
                      Thank you for the article. My question is a bit more complex than the article's examples. I have 5 variables: nacional (nacional=1 for natives, nacional=0 for non-natives), the firm's ID NPC_FIC, the establishment's ID ESTAB_ID, occupation codes CCPCodes, and skill ratio sk_ratio.
                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input float nacioanal double(NPC_FIC ESTAB_ID) float CCPCodes double sk_ratio
                      1 500988754       100861 1114 4.359499
                      1 501950749       893218 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125210 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502000740   -820100050 1114 4.359499
                      0 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 501953784       132785 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125205 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501963591       812374 1114 4.359499
                      1 503201835       917490 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125205 1114 4.359499
                      1 501951290       125220 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100885 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      0 501951541       126303 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502001545 -8.16353e+10 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 501947288       101394 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 501947288       101389 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101404 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 501997479       843688 1114 4.359499
                      1 501951290       125205 1114 4.359499
                      1 501986650       775951 1114 4.359499
                      0 500988754       100861 1114 4.359499
                      1 501947288       101390 1114 4.359499
                      1 501951290       125217 1114 4.359499
                      1 501951194       124911 1114 4.359499
                      1 502286024       838214 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125225 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100884 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 501947288       101400 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951591       126434 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502000740   -820100100 1114 4.359499
                      1 502001184 -8.07435e+10 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502482282       969035 1114 4.359499
                      1 501968846       175241 1114 4.359499
                      1 501989791       260870 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125208 1114 4.359499
                      1 501951290       125205 1114 4.359499
                      1 501965291       704256 1114 4.359499
                      1 502000740   -820100010 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101404 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501954125       133626 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100870 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501985537       227600 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      0 500988754       100861 1114 4.359499
                      1 502001534 -8.16007e+10 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501957595       827745 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100870 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101396 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502000740   -820100010 1114 4.359499
                      1 501949737       430827 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101402 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502001285  -8075960030 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500297218       897639 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 501952505       129549 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 502000467   -587940000 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100865 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501947288       101383 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125205 1114 4.359499
                      1 501955955       139451 1114 4.359499
                      1 501955220       137252 1114 4.359499
                      1 501947288       101394 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501989902       262016 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 503200883       856836 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501951290       125207 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 500988754       100861 1114 4.359499
                      1 501961746       157368 1114 4.359499
                      end
                      ------------------ copy up to and including the previous line ------------------

                      Listed 150 out of 2254593 observations

                      I want to compute the share of immigrants in the Top 10 and Bottom 10 occupations. Because firms often have more than one establishment, this gets complicated...
                      Can anyone help?
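
One possible approach, sketched on simulated data rather than the real file. It rests on several assumptions that may not hold: sk_ratio is constant within each CCPCodes value (as it is in the listing above), "Top 10" and "Bottom 10" mean the ten occupations with the highest and lowest sk_ratio, and each row is one worker, so NPC_FIC and ESTAB_ID are ignored. The names occ_tag, occ_rank, group and immigrant are made up for illustration, and the native indicator is called nacional here although the dataex header spells it nacioanal.

```stata
* Simulated data for illustration only: 30 hypothetical occupations
clear
set seed 12345
set obs 3000
generate int CCPCodes = 1000 + ceil(30 * runiform())
generate byte nacional = runiform() < 0.9
* One skill ratio per occupation (assumption)
bysort CCPCodes: generate double sk_ratio = runiform() if _n == 1
by CCPCodes: replace sk_ratio = sk_ratio[1]

* Rank occupations, not workers: tag one observation per occupation
egen occ_tag = tag(CCPCodes)
egen occ_rank = rank(sk_ratio) if occ_tag, unique
* Spread each occupation's rank to all of its workers
bysort CCPCodes (occ_rank): replace occ_rank = occ_rank[1]

* Number of ranked occupations
quietly count if occ_tag & !missing(occ_rank)
local nocc = r(N)

* 1 = ten lowest sk_ratio occupations, 2 = ten highest
generate byte group = 1 if occ_rank <= 10
replace group = 2 if occ_rank > `nocc' - 10 & !missing(occ_rank)
label define group 1 "Bottom 10" 2 "Top 10"
label values group group

* Share of immigrants = mean of a (0, 1) indicator for non-natives
generate byte immigrant = nacional == 0 if !missing(nacional)
tabulate group, summarize(immigrant)
```

If the share should instead be averaged over establishments or firms rather than over workers, the data would first need to be reduced to that level, which this sketch does not attempt.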

