  • Many points overlap: how can I plot the scatter better?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float a double b str54 province
     51.04702 4034.96 "上海"   
      5.54454 1855.74 "云南"   
      1.85683 1379.31 "内蒙古"
    121.60917 2174.46 "北京"   
      8.06213 1660.91 "吉林"   
     35.26951  3711.6 "四川"   
     13.07172 1450.06 "天津"   
       .89793  241.49 "宁夏"   
     11.22339 2908.58 "安徽"   
     31.98837  7662.1 "山东"   
      8.03617 1506.78 "山西"   
     65.70131 8464.31 "广东"   
      2.13211 1953.27 "广西"   
      2.59514 1168.55 "新疆"   
     46.39749 7697.82 "江苏"   
      6.23167 1853.65 "江西"   
     14.67606 4569.19 "河北"   
     13.90384  4576.1 "河南"   
     12.51545 5364.89 "浙江"   
      1.74926  471.23 "海南"   
     28.83279 3857.99 "湖北"   
     13.33248 3214.54 "湖南"   
      7.51159  931.98 "甘肃"   
      9.86994    3628 "福建"   
       .09442  105.61 "西藏"   
       2.9809  911.86 "贵州"   
     29.94205 4171.69 "辽宁"   
      6.53745 1479.71 "重庆"   
     31.76894 1487.61 "陕西"   
       .83581  238.39 "青海"   
     14.31982  2897.4 "黑龙江"
    end
    twoway (scatter  a b ,mlabel(province) mc(%20)  legend(off) )  (function y=0.01*x ,range(b))
    There is an outlier, which causes many points to overlap. How can I solve this? Thanks.

  • #2
    Consider using a logarithmic scale.
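    A minimal sketch of that suggestion, assuming the example data from #1 are in memory (all values of a and b are positive, so both axes can be logged) and a Stata version that accepts the mc(%20) opacity syntax used there:
    Code:
    twoway scatter a b, mlabel(province) mc(%20) xscale(log) yscale(log)
    The default axis labels on a log scale tend to bunch up, so they usually need to be set by hand with xlabel() and ylabel().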


    • #3
      Some other tips: just leaving out Beijing already helps with the compressed look of the graph.

      You can also move the labels around, which unfortunately is a mostly manual process. The mlabvposition() option (abbreviated to mlabvpos() below), which takes a variable holding clock positions, can help a bit:
      Code:
      ssc install egenmore
      gen clock=3
      replace clock = 6 if province =="湖北"
      replace clock = 1 if province =="河北"
      replace clock = 4 if province =="河南"
      twoway (scatter  a b if a<100, mlabel(province) mlabvpos(clock) mc(%20)  legend(off))  (function y=0.01*x ,range(b))
      And continue to edit clock positions until you have a decent enough graph.
      Note that the egen function mlabvpos() can also generate clock positions automatically (drop clock first if it already exists), e.g.:
      Code:
      egen clock = mlabvpos(a b)
      But this does not help much, and you will still need to edit label positions (i.e., values of clock) by hand afterwards.


      • #4
        Originally posted by Jorrit Gosens in #3:
        Thanks for the tip


        • #5
          mlabvpos() is from egenmore (SSC).


          • #6
            Setting aside the question of marker labels, I get this with some help from niceloglabels (Stata Journal):


            Code:
            niceloglabels a, style(13) local(yla) 
            niceloglabels b, style(13) local(xla) 
            twoway scatter  a b , ysc(log) xsc(log)  yla(`yla', ang(h)) xla(100 10000 `xla') scheme(s1color) ms(Oh)

            [Image: chinascatter.png]
