  • Scatterplot with weights

    Dear all,

    I have been struggling to get the graph I want.
    Currently, I am trying to plot automation potential and relative wage by occupation (isco1d is the occupation indicator).

    My data are at the country-isco1d level,
    so in the scatterplot one dot is one country-occupation pair.

    What I want is a graph in which the size of each dot depends on the employment share within each country.
    Below is the code I have so far.

    With this code, the sizes are scaled by the employment share within each isco1d. I assume this is because of the if qualifier.
    As mentioned above, I want the dot sizes to be comparable across all country-isco1d pairs, not only within an occupation.

    Code:
            tw (scatter arntz_auto relwage [w=nsh] if isco1d==1, xsc(log) msymbol(circle) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==2, xsc(log) msymbol(square) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==3, xsc(log) msymbol(diamond) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==4, xsc(log) msymbol(cross) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==5, xsc(log) msymbol(triangle) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==6, xsc(log) msymbol(circle) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==7, xsc(log) msymbol(square) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==8, xsc(log) msymbol(diamond) mfc(none)) ///
               (scatter arntz_auto relwage [w=nsh] if isco1d==9, xsc(log) msymbol(cross) mfc(none)), ///
                legend(order(1 2 3 4 5 7 8 9) label(1 "Manager") label(2 "Professionals") label(3 "Technicians") label(4 "Clerks") label(5 "Services and Sales") label(7 "Craft and Trades") label(8 "Plant and Machine") label(9 "Elementary")) ///
                ytitle("Automation Potential") xtitle("Relative Wage")
    How can I do this?
    Many thanks in advance!

  • #2
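    Consider this code, where the key trick is to do everything with a single call to scatter, using the separate command beforehand to create the necessary variables. Marker sizes are then scaled over all observations together rather than separately within each plot.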

    Code:
    set scheme stcolor
    sysuse auto, clear
    
    separate trunk, by(foreign) veryshortlabel
    
    gen rep78_2 = rep78
    replace rep78_2 = 20*rep78 if foreign == 1
    
    twoway  (scatter trunk price [w=rep78_2] if foreign == 0 , msymbol(circle) mfc(none)) ///
            (scatter trunk price [w=rep78_2] if foreign == 1 , msymbol(square) mfc(none)) ///
            , legend(label(1 "Domestic") label(2 "Foreign")) ///
            name(diff_scales, replace) title("Marker sizes have different scales")
    
    twoway scatter trunk0 trunk1 price [w=rep78_2] , msymbol(circle square) mfc(none ...) ///
            , name(same_scales, replace) title("Marker sizes are on the same scale")
    which produces these graphs:
    [Graph: diff_scales.png, titled "Marker sizes have different scales"]

    [Graph: same_scales.png, titled "Marker sizes are on the same scale"]
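
    To see why the single scatter call works, it may help to look at what separate creates. The sketch below reuses the auto data from the code above. The assert and list lines are additions for illustration only and are not part of the original answer.

    ```stata
    sysuse auto, clear
    separate trunk, by(foreign) veryshortlabel

    * Each new variable copies trunk inside its own group
    * and is missing everywhere else.
    assert trunk0 == trunk    if foreign == 0
    assert missing(trunk0)    if foreign == 1
    assert trunk1 == trunk    if foreign == 1
    assert missing(trunk1)    if foreign == 0

    * Inspect a handful of rows to see the pattern
    list foreign trunk trunk0 trunk1 in 50/55
    ```

    Because each observation contributes a point to only one of trunk0 and trunk1, plotting both in one scatter call draws every observation once while all markers share one set of weights.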

    Last edited by Hemanshu Kumar; 13 Jun 2023, 08:50.

    • #3
      * ignore this post *
      Last edited by Hemanshu Kumar; 13 Jun 2023, 08:49.

      • #4
        Thanks for this.

        I get the separate command.
        What I don't understand is:

        Code:
        gen rep78_2 = rep78
        replace rep78_2 = 20*rep78 if foreign == 1
        Why do we multiply the rep78 by 20?
        Also, since I have 9 different groups instead of 2, how will this work?

        Many thanks again in advance.

        • #5
          Originally posted by Wonhee Cho
          Why do we multiply the rep78 by 20?
          Oh, please ignore that. I just blew up the weights for one of the values of foreign so the contrast between weights on a common scale and on different scales would be very obvious to see.

          Originally posted by Wonhee Cho
          Also, since I have 9 different groups instead of 2, how will this work?
          The scatter command can take more than two y-variables, so nine will work. After running separate on arntz_auto with by(isco1d), you should end up with variables arntz_auto1 through arntz_auto9, and you'll still need just a single call to scatter.

          Something like:

          Code:
          tw scatter arntz_auto1 - arntz_auto9 relwage [w=nsh] , xsc(log) ///
              msymbol(circle square diamond cross triangle circle square diamond cross) ///
              mfc(none ...) ytitle("Automation Potential") xtitle("Relative Wage")
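
          For anyone who wants to try the whole sequence end to end, here is a minimal sketch on simulated data. Everything in the data step is an assumption made for illustration: the 10 hypothetical countries, the random values, and the helper variables emp and emp_tot are not from the original dataset. Only the variable names arntz_auto, relwage, nsh and isco1d follow the thread. The marker symbols are restricted to names listed in help symbolstyle.

          ```stata
          * Simulated stand-in for the country-by-isco1d data (all values are made up)
          clear
          set seed 12345
          set obs 90
          gen country = ceil(_n/9)            // 10 hypothetical countries
          gen isco1d  = mod(_n - 1, 9) + 1    // occupation groups 1 through 9
          gen relwage    = exp(rnormal(0, 0.3))
          gen arntz_auto = runiform()

          * Employment share within each country (hypothetical construction)
          gen emp = runiform()
          bysort country: egen emp_tot = total(emp)
          gen nsh = emp/emp_tot

          * One y-variable per occupation: arntz_auto1 through arntz_auto9
          separate arntz_auto, by(isco1d) veryshortlabel

          * A single scatter call, so all markers share one weight scale
          twoway scatter arntz_auto1-arntz_auto9 relwage [aw=nsh], xsc(log) ///
              msymbol(circle square diamond X triangle circle square diamond X) ///
              mfc(none ...) ytitle("Automation Potential") xtitle("Relative Wage") ///
              name(toy_same_scale, replace)
          ```

          With real data, only the separate and twoway lines should be needed. If some occupation code is absent from the data, separate will create fewer variables and the varlist has to be adjusted accordingly.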

          • #6
            Got it! Thanks so much for all your help.
