  • Help with data labels in a scatterplot

    Hi all,

    I am having trouble adjusting the position of the data labels. As the scatterplot below shows, the labels are pretty messy at the bottom:


    Graph_data labels_forum.gph



    This is the code I used for the chart and the data:

    Code:
    * Correlation pooled over 2018 and 2019, excluding Hong Kong and China
    correlate Overall_digital Trust_safety if !inlist(Country, "Hong Kong", "China")
    local rho: display %3.2f r(rho)

    * Range of Trust_safety in 2019 (stored, but not used in the graph below)
    summarize Trust_safety if Year == 2019
    local min = r(min)
    local max = r(max)

    * Three layers: all points, labelled points without symbols, and the fitted line
    #delimit ;
    twoway  (scatter Overall_digital Trust_safety if Country != "Hong Kong")
            (scatter Overall_digital Trust_safety
                if inlist(Country, "China", "India", "Indonesia", "Japan", "Malaysia", "Philippines")
                 | inlist(Country, "Singapore", "ROK", "Thailand", "Vietnam"),
                msymbol(i) mlabcolor(red) mlabel(Country))
            (lfit Overall_digital Trust_safety if !inlist(Country, "Hong Kong", "China"), range(40 75))
            , note(Note: Line of best fit and correlation coefficient exclude China and Hong Kong)
            text(0 35 "{&rho} = `rho'")
            ytitle(Overall Digital Index score)
            xtitle(Trust & Safety score)
            legend(off)
            scheme(s2color)
            ;
    #delimit cr
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(Overall_digital Trust_safety) int Year str12 Country
     1.4911944748969637 62.1 2018 "China"      
     1.1213408324666971    . 2018 "Hong Kong"  
     -.9354383366385858 65.1 2018 "India"      
     -1.170759219326978 65.9 2018 "Indonesia"  
      .4900689245217592 47.9 2018 "Japan"      
      .0743814016041916 56.5 2018 "Malaysia"  
     -.9019645756671696 56.7 2018 "Philippines"
     1.3019748954571109 49.8 2018 "Singapore"  
        .59285067651566 47.7 2018 "ROK"        
     -.8939165934565277 35.1 2018 "Thailand"  
    -1.1697324803731215 67.4 2018 "Vietnam"    
     1.4633771560661124 73.2 2019 "China"      
     1.3356631016510268    . 2019 "Hong Kong"  
     -.5816092928110765 70.7 2019 "India"      
    -1.1366984242244103 69.5 2019 "Indonesia"  
      .8734172000204578 44.1 2019 "Japan"      
     .21963797533267906 63.6 2019 "Malaysia"  
     -.8669008987190485   56 2019 "Philippines"
      1.417392252169481 52.6 2019 "Singapore"  
      .9159432918807312 46.2 2019 "ROK"        
    -1.0820484226481373 57.3 2019 "Thailand"  
     -.9675688685359461 66.5 2019 "Vietnam"    
    end
    Is there a way to adjust the data labels without the Graph Editor? Also, can the labels include the year? For example, the label "Thailand" should read "Thailand, 2018" in the scatterplot.

    Thanks!
    Last edited by Wee Yang Ng; 10 Mar 2023, 02:31.

  • #2
    See https://journals.sagepub.com/doi/pdf...867X0500500412. To include the year, look at the -concat()- function of -egen-.

    Code:
    help egen
    You may want to use three-letter ISO country codes instead of full names and include only the last two digits of the year, as all years are in the 21st century, e.g., "CHN18" instead of "China, 2018". Plotting the labels in place of the markers, as outlined in the linked article, may work out well. Otherwise, see also the available marker label options.

    Code:
    help scatter##marker_label_options
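
    As a rough sketch of the labelling suggested above, assuming the example data from #1 are in memory: the variable names label_full, iso3 and label_short are made up for illustration, and "ROK" is mapped to the ISO code KOR.

    ```stata
    * Sketch only: label_full, iso3 and label_short are hypothetical names.

    * Full label, e.g. "Thailand, 2018"
    egen label_full = concat(Country Year), punct(", ")

    * Three-letter ISO codes for the countries in the example data
    generate str3 iso3 = ""
    replace iso3 = "CHN" if Country == "China"
    replace iso3 = "HKG" if Country == "Hong Kong"
    replace iso3 = "IND" if Country == "India"
    replace iso3 = "IDN" if Country == "Indonesia"
    replace iso3 = "JPN" if Country == "Japan"
    replace iso3 = "MYS" if Country == "Malaysia"
    replace iso3 = "PHL" if Country == "Philippines"
    replace iso3 = "SGP" if Country == "Singapore"
    replace iso3 = "KOR" if Country == "ROK"
    replace iso3 = "THA" if Country == "Thailand"
    replace iso3 = "VNM" if Country == "Vietnam"

    * Compact label, e.g. "THA18": ISO code plus the last two digits of the year
    generate label_short = iso3 + substr(string(Year), 3, 2)

    * Use the compact label as the marker label
    scatter Overall_digital Trust_safety if Country != "Hong Kong", mlabel(label_short)
    ```

    Either variable can be passed to mlabel(); the compact one should take up noticeably less room where points are close together.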



    • #3
      I agree with Andrew Musau. The best single step toward reducing clutter is to use three-letter abbreviations (TLAs!) for country names. Beyond that,

      1. Often the marker labels alone are enough. So suppress the marker symbols completely, but make sure the marker labels are centred where the symbols would have been, with mlabpos(0).

      2. Sometimes you need to use xscale() to stretch the range of the x axis if the marker label spills over the edge of the graph.

      3. Different years might be represented in different colours.
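
      The three points might be combined along these lines. This is only a sketch: it assumes the example data from #1 are in memory, lab is a made-up variable name, and the x-axis range and the two colours are arbitrary choices to adjust by eye. Full country names are used here for brevity; abbreviations would shorten the labels further.

      ```stata
      * Sketch only: lab is a hypothetical variable name.
      * Label = country name plus the last two digits of the year, e.g. "Thailand 18"
      generate lab = Country + " " + substr(string(Year), 3, 2)

      * One scatter layer per year so that each year gets its own label colour.
      * msymbol(none) hides the symbols; mlabposition(0) centres the labels on the points.
      * xscale(range()) widens the x axis so labels near the edges are not cut off.
      * The /// line continuations require running this from a do-file.
      twoway (scatter Overall_digital Trust_safety if Year == 2018,            ///
                  msymbol(none) mlabel(lab) mlabposition(0) mlabcolor(navy))   ///
             (scatter Overall_digital Trust_safety if Year == 2019,            ///
                  msymbol(none) mlabel(lab) mlabposition(0) mlabcolor(maroon)) ///
             , xscale(range(30 80))                                            ///
             note("Navy labels: 2018; maroon labels: 2019")                    ///
             legend(off)
      ```

      Hong Kong drops out automatically because its Trust_safety values are missing.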
