Dear all, I am struggling with a graphical bubble representation of my data.
Attached graph: bubble.gph
I would like to plot the predicted probabilities for six groups (my x var) by country with a scatterplot, which at the same time should take into account the 'size' of each group (by country) I am considering.
Here is the command I use:
Code:
la val country country
la def country ///
    1 "DE" ///
    2 "FR" ///
    3 "SE" ///
    4 "DK" ///
    5 "NO" ///
    6 "UK" ///
    7 "IE" ///
    8 "NL" ///
    9 "BE" ///
    10 "IT" ///
    11 "ES" ///
    12 "GR" , modify
fre country

la val or or
la def or ///
    15 "A" ///
    14 "b" ///
    13 "C" ///
    12 "D" ///
    11 "E" ///
    10 "F" , modify
fre id

twoway (scatter margins or [fw=peso ] , msymbol(circle_hollow)) ///
    (line margins or if or == 10 | or == 12 | or == 14, lcolor (black) ) ///
    (line margins or if or == 11 | or == 13 | or == 15, lcolor (gs13) ) ///
    , by(country, col(4) note("")) scheme(s1mono) ///
    graphregion(margin(zero)) ///
    /*yline(0, lc(black) lwidth(vthin) lpattern(solid))*/ ///
    /* xline(13.5 11.5, lc(black) lwidth(vthin) lpattern(dash))*/ ///
    ylab( , nogrid) ///
    ylabel(/*-.6(0.3)0.6*/,angle(0)labsize(small)grid) ///
    xlabel(10(1)15, valuelabel grid labsize(small)) ///
    legend(rows(1)) ///
    name(a, replace)
graph save "a", replace
The graph I obtain (attached) is quite close to what I want, but there is one thing that looks odd to me and that I do not like. In different countries, the same weight value produces markers of very different sizes. See, for instance, the rightmost ball for 'DE', whose weight is 39.8, and the last one for 'DK', whose weight is 40.1: they are clearly of different sizes, even though their weights are practically the same.
It seems to me that Stata uses the dispersion of the weights within each country to size the balls: in a country where the groups (A, B, C, D, E, F) have very different weight values, the balls are in general bigger; in a country where the weight values are similar across groups, all the balls look smaller. I find this way of visualising the data quite misleading for my purposes.
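To make my guess concrete, here is a small diagnostic sketch. It assumes (I have not confirmed this) that marker size is scaled relative to the weights within each by() panel rather than across the whole dataset. The variable names maxw_country, rel_country, maxw_all and rel_all are made up for illustration; peso, country, or and id come from the data posted below. The sketch compares the two observations mentioned above (DE, id 6, and DK, id 24) under both kinds of scaling.

```stata
* Diagnostic sketch only: the within-panel scaling rule is an assumption,
* not documented behaviour that I have verified.

* Weight relative to the largest weight within the same country (panel)
egen maxw_country = max(peso), by(country)
gen rel_country = peso / maxw_country

* Weight relative to the largest weight in the whole dataset
egen maxw_all = max(peso)
gen rel_all = peso / maxw_all

* Compare the DE and DK observations with nearly identical weights
list country or peso rel_country rel_all if inlist(id, 6, 24), noobs
```

If the two observations are much closer on rel_all than on rel_country, that would be consistent with sizes being set panel by panel, which is what I would like to avoid.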
Do you have any clue on how to solve this issue?
Thanks a lot in advance, best, G.P.
Here are the data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte or float(pr margins lower upper) byte(country id) float peso str8 var9
10  .6 .5976923 .5635103 .6318744 1  1  5.31 "low-NAT"
11 .51 .5053219 .4365443 .5740996 1  2 24.64 "low-MIG"
12 .74 .7360662 .7261686 .7459637 1  3 50.44 "mid-NAT"
13 .66 .6583158 .6055973 .7110343 1  4 35.03 "mid-MIG"
14 .82 .8172152 .8072185  .827212 1  5 44.15 "high-NAT"
15 .65 .6470902 .5953653  .698815 1  6 39.78 "high-MIG"
10  .6  .601023 .5743551 .6276909 2  7 18.05 "low-NAT"
11 .55 .5517676 .4858773 .6176578 2  8 29.54 "low-MIG"
12 .71 .7145914 .7002927 .7288902 2  9 46.35 "mid-NAT"
13 .61  .613452 .5540551 .6728489 2 10 35.16 "mid-MIG"
14 .79 .7895741 .7733445 .8058037 2 11 35.52 "high-NAT"
15 .64 .6425119 .5816921 .7033318 2 12 35.16 "high-MIG"
10 .75  .748827 .7214509 .7762031 3 13 14.56 "low-NAT"
11 .62 .6180023 .5140577 .7219469 3 14 16.82 "low-MIG"
12 .83 .8290432 .8157801 .8423064 3 15  43.6 "mid-NAT"
13 .69 .6892186  .624153 .7542842 3 16 40.48 "mid-MIG"
14 .91 .9108879 .9007325 .9210434 3 17 41.82 "high-NAT"
15 .81 .8097132 .7571398 .8622867 3 18 41.96 "high-MIG"
10  .7 .7049431 .6761626 .7337236 4 19 14.28 "low-NAT"
11 .34 .3425309 .2333281 .4517336 4 20 30.71 "low-MIG"
12 .79 .7886393 .7726654 .8046131 4 21  37.9 "mid-NAT"
13  .6 .6036703 .4872831 .7200575 4 22 28.63 "mid-MIG"
14 .87 .8668346 .8546618 .8790075 4 23 47.78 "high-NAT"
15 .63 .6278278 .5331815 .7224741 4 24 40.25 "high-MIG"
10 .73 .7337023 .7036204 .7637843 5 25  9.98 "low-NAT"
11 .48 .4844293 .3362609 .6325976 5 26 16.98 "low-MIG"
12 .84 .8406835 .8277858 .8535813 5 27 37.34 "mid-NAT"
13  .7 .6986722 .6015205 .7958239 5 28 35.47 "mid-MIG"
14 .91 .9074479 .8985351 .9163608 5 29 52.59 "high-NAT"
15 .78  .777859 .7008136 .8549044 5 30 47.55 "high-MIG"
end
label values or or
label def or 10 "F", modify
label def or 11 "E", modify
label def or 12 "D", modify
label def or 13 "C", modify
label def or 14 "b", modify
label def or 15 "A", modify
label values country country
label def country 1 "DE", modify
label def country 2 "FR", modify
label def country 3 "SE", modify
label def country 4 "DK", modify
label def country 5 "NO", modify