
  • Conditionally determine -mlabvposition-

    I need some help to programmatically avoid overlapping marker labels. I have a -twoway- graph that combines -scatter- and -line-. For each year, a column of data points is plotted. For the last year I want to highlight a specific set of organisations by labelling their data points. In this series of plots, the data points of some organisations are very close together, e.g. 8.0, 7.9, 7.6 and 7.2. I am looking for a good way to assign label positions to those organisations.
    First I would generate a variable holding the default position (pos(9)). Then I need to decide, for each organisation in the subsample, which position it should get. My idea is to rank the organisations whose values lie within a range of 1 of each other: the highest gets position 10, the second 9, the third 8 and the fourth 7. Of course, there could be plots where only two or three labels overlap, and overlaps may occur more than once within a graph, for example around 17 and 16.5.

    I am not good with the -cond()- function, but would it be possible to use it here?

    The -scatter- code line is at the moment:
    Code:
    scatter `g' `year' if subsample==1 & year == 3 , mlabel(org) mlabcolor(black) mlabpos(9) msymbol(none)
    My idea is to first identify whether an organisation's data point belongs to a group of points within a common range and, if so, which rank it has within that group. Is there a good way to create this?

    Code:
    gen mlabvpos = 9
    replace mlabvpos = 10 if org_group_rank==1 & org_group_total == 4
    replace mlabvpos = 9 if org_group_rank==2 & org_group_total == 4
    ...
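    On the -cond()- question: a single -cond()- call can express the same mapping as a chain of -replace- statements. The following is only a minimal sketch. It assumes that the variables org_group_rank and org_group_total (hypothetical names, not yet created) already exist, and the four toy observations are made up for illustration.

    ```stata
    * Toy data: one overlap group of four labels (illustrative values only)
    clear
    input byte org_group_rank byte org_group_total
    1 4
    2 4
    3 4
    4 4
    end

    * Rank 1 -> clock position 10, rank 2 -> 9, rank 3 -> 8, rank 4 -> 7.
    * Everything else (no group, missing rank) keeps the default position 9.
    * inrange() is used because a missing value would pass a plain "> 1" test.
    gen byte mlabvpos = cond(inrange(org_group_total, 2, 4) & ///
        inrange(org_group_rank, 1, 4), 11 - org_group_rank, 9)
    ```

    The resulting variable could then be passed to mlabvposition() instead of the fixed mlabpos(9).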
    And here is a -dataex- example:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    
    input double bio byte year int org byte subsample
     7.6 3   1 1
      .5 3   3 .
     6.6 3   6 .
    35.7 3   8 .
    19.5 3   9 .
     3.3 3  12 .
       8 3  14 1
     7.2 3  15 1
    10.2 3  16 1
     3.8 3  18 .
     1.7 3  21 .
      .3 3  22 .
       4 3  28 .
     1.3 3  29 1
      14 3  30 .
     7.9 3  31 1
    26.8 3  32 .
     7.2 3  35 .
    36.7 3  37 .
    52.5 3  41 .
    11.1 3  43 .
     8.2 3  44 .
    56.9 3  45 .
    17.1 3  47 .
    12.1 3  51 .
     5.9 3  53 .
       1 3  54 .
     3.7 3  55 .
    52.1 3  57 .
     4.5 3  59 .
    23.3 3  61 .
     6.9 3  63 .
       5 3  68 .
     1.4 3  69 .
    19.6 3  71 .
      .4 3  72 .
    18.5 3  73 .
      .3 3  74 .
    42.1 3  75 1
      19 3  77 .
      .5 3  78 .
      .2 3  79 .
      .2 3  80 .
     8.2 3  81 .
    16.4 3  84 .
      .1 3  86 .
    59.7 3  87 .
    43.1 3  88 .
    31.6 3  90 1
     5.6 3  92 .
     9.9 3  94 .
      .1 3  95 1
    12.2 3  97 .
    17.3 3  98 .
     1.8 3  99 .
     6.3 3 100 .
      .3 3 101 1
     4.1 3 102 .
    19.5 3 104 .
      13 3 105 .
      .3 3 111 1
    22.8 3 113 .
    end
    label values year l_year
    label def l_year 3 "2018", modify
    label values org l_org
    label def l_org 1 "A", modify
    label def l_org 3 "Au", modify
    label def l_org 6 "Ba", modify
    label def l_org 8 "Bef", modify
    label def l_org 9 "Beh", modify
    label def l_org 12 "Bet", modify
    label def l_org 14 "Bi", modify
    label def l_org 15 "Bo", modify
    label def l_org 16 "Bou", modify
    label def l_org 18 "Br", modify
    label def l_org 21 "Brj", modify
    label def l_org 22 "Bru", modify
    label def l_org 28 "Da", modify
    label def l_org 29 "Do", modify
    label def l_org 30 "Dr", modify
    label def l_org 31 "Du", modify
    label def l_org 32 "Dü", modify
    label def l_org 35 "Er", modify
    label def l_org 37 "Fr", modify
    label def l_org 41 "Fru", modify
    label def l_org 43 "Gi", modify
    label def l_org 44 "Gr", modify
    label def l_org 45 "Gö", modify
    label def l_org 47 "Ha", modify
    label def l_org 51 "HaU", modify
    label def l_org 53 "HaM", modify
    label def l_org 54 "Hat", modify
    label def l_org 55 "Hau", modify
    label def l_org 57 "He", modify
    label def l_org 59 "Ho", modify
    label def l_org 61 "Je", modify
    label def l_org 63 "Ka", modify
    label def l_org 68 "KaK", modify
    label def l_org 69 "Kas", modify
    label def l_org 71 "Ki", modify
    label def l_org 72 "Ko", modify
    label def l_org 73 "Kon", modify
    label def l_org 74 "Köd", modify
    label def l_org 75 "Kö", modify
    label def l_org 77 "Le", modify
    label def l_org 78 "Lü", modify
    label def l_org 79 "Lün", modify
    label def l_org 80 "Ma", modify
    label def l_org 81 "MaU", modify
    label def l_org 84 "MarU", modify
    label def l_org 86 "MüH", modify
    label def l_org 87 "MüL", modify
    label def l_org 88 "MüT", modify
    label def l_org 90 "Müw", modify
    label def l_org 92 "Ol", modify
    label def l_org 94 "Os", modify
    label def l_org 95 "Pa", modify
    label def l_org 97 "Po", modify
    label def l_org 98 "Re", modify
    label def l_org 99 "Ro", modify
    label def l_org 100 "Sa", modify
    label def l_org 101 "SiU", modify
    label def l_org 102 "St", modify
    label def l_org 104 "Tü", modify
    label def l_org 105 "Ul", modify
    label def l_org 111 "Wu", modify
    label def l_org 113 "Wü", modify
    
    
    twoway scatter bio year || scatter bio year if subsample==1 & bio>2, mlabel(org) mlabcolor(black) mlabpos(9) ||, name(biotest,replace) legend(off)
    Thanks for your suggestions.

  • #2
    See Ulrich Kohler's mlabvpos() function for egen in egenmore from SSC.

    I usually suppress the marker and put the marker label where it would have been, but a glance at your data suggests that won't help much here.



    • #3
      I've checked Ulrich Kohler's mlabvpos() function for egen in egenmore from SSC and I do not think it applies here.

      I suspect that there may be a way to rank the organisations within a range.
      I would first identify whether there are groups of organisations within a range. How can this be done?
      Then I would rank the organisations within each group.

      Maybe something like this:
      Code:
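      * Bin the highlighted organisations into intervals of width 1.5
      egen cut15 = cut(bio) if subsample==1, at(1(1.5)30)
      * Rank within each bin (1 = highest) and count its members; skip unbinned points
      bys cut15: egen rank = rank(bio) if !missing(cut15), field
      bys cut15: gen group_total = _N if !missing(cut15)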
      gen mlabpos = 9
      replace mlabpos = 10 if rank==1 & group_total > 1 & !missing(group_total)
      replace mlabpos = 9 if rank==2 & group_total > 1 & !missing(group_total)
      replace mlabpos = 8 if rank==3 & group_total > 1 & !missing(group_total)
      replace mlabpos = 7 if rank==4 & group_total > 1 & !missing(group_total)
      
      twoway scatter bio year || scatter bio year if subsample==1 & bio>2, mlabel(org) mlabcolor(black) mlabvpos(mlabpos) ||, name(biotest_rank,replace) legend(off)
      This does not look like the best solution, as the third label is now far away from the second, but it gives an idea.
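
      One limitation of fixed bins from -egen, cut()- is that two neighbouring values can land on either side of a cut point and so end up in different groups. A possible alternative, sketched here under assumptions, is to sort the labelled points and start a new group whenever the gap to the next-lower value exceeds a threshold. The threshold of 1 and the variable names grp, rank, group_total and mlabpos are illustrative choices; the values are the labelled points (subsample==1 & bio>2) from the example data.

      ```stata
      * Labelled points from the example data (subsample==1 & bio>2)
      clear
      input double bio
       7.6
         8
       7.2
      10.2
       7.9
      31.6
      42.1
      end

      * Assumed threshold: points closer than this are treated as overlapping
      local gap 1

      * Start a new group whenever the distance to the next-lower value exceeds `gap'
      sort bio
      gen long grp = sum(_n == 1 | bio - bio[_n-1] > `gap')

      * Rank within group (1 = highest; ties broken arbitrarily) and count members
      bysort grp (bio): gen rank = _N - _n + 1
      by grp: gen group_total = _N

      * Clock positions 10, 9, 8, 7 for ranks 1 to 4 in groups of two or more; else 9
      gen byte mlabpos = cond(group_total > 1 & rank <= 4, 11 - rank, 9)
      ```

      With these values, 7.2, 7.6, 7.9 and 8 form one group and get positions 7, 8, 9 and 10, while the isolated points keep position 9. Note that a chain of points, each within the threshold of its neighbour, is merged into a single group, which may or may not be what is wanted.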



      • #4
        In another context, I was advised to look at the R package -ggrepel- (https://ggrepel.slowkow.com/index.html). It really looks like it has the functionality I am looking for. But at the moment I have to stick to Stata alone: manually editing the graph in Stata is faster than setting up and learning R.



        • #5
          Here is another attempt that seems to work reasonably well.
          It would be good to be able to determine the offsets needed for a clean look, but it could serve as a start.
          Code:
          clonevar bio_lab = bio
          replace bio_lab = bio + 0.6 if rank==1 & inrange(group_total,3,4)
          replace bio_lab = bio + 0.1 if rank==2 & inrange(group_total,3,4)
          replace bio_lab = bio - 0.2 if rank==3 & inrange(group_total,3,4)
          replace bio_lab = bio - 0.5 if rank==4 & inrange(group_total,4,4)
          
          twoway scatter bio year || ///
                  scatter bio year if subsample==1 & bio>2 || ///
                  scatter bio_lab year if subsample==1 & bio>2 , mlabel(org) mlabcolor(black) mlabpos(9)  msymbol(none) ||, name(biotest_rank,replace) legend(off)
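
          One way the offsets might be computed rather than typed in by hand is to spread the labels of each group evenly around the group mean. This is only a sketch: the spacing of 0.5 (in units of bio) is an assumed value that would need tuning to the y-axis range and label size, and grp, rank and group_total are assumed to have been created beforehand. The toy data reuse the four crowded values from the example.

          ```stata
          * Toy data: the four crowded values plus one isolated point,
          * already grouped and ranked (1 = highest within group)
          clear
          input double bio byte grp byte rank byte group_total
             8 1 1 4
           7.9 1 2 4
           7.6 1 3 4
           7.2 1 4 4
          10.2 2 1 1
          end

          * Assumed minimum vertical distance between labels, in units of bio
          local step 0.5

          * Place the labels of each group `step' apart, centred on the group mean;
          * isolated points keep their original y-value
          bysort grp: egen double grp_mean = mean(bio)
          gen double bio_lab = cond(group_total > 1, ///
              grp_mean + ((group_total + 1)/2 - rank) * `step', bio)
          ```

          The variable bio_lab could then replace the hand-adjusted version in the third -scatter- call, still with msymbol(none). Because the labels are centred on the group mean, each one may sit slightly above or below its own marker.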
