  • Markers on scatter plot overlapping the labels


    Hi, I'm trying to produce a scatter plot, but unfortunately the markers in the diagram overlap some of the labels of other markers. Changing the position of a label relative to its marker will not help, because there are markers at every angle around the labels.
    Therefore I want to set the labels to be above all the markers (and not only above some of them like can be seen in the example picture below). How can this be done?

    This is my command:

    graph twoway (scatter Etn psychometric, mlabel(labels) mlabv(pos) mlabcolor(grey)) (lfit Etn psychometric), graphregion(color(gs16)) legend(size(small) label(1 "Majors") label(2 "Linear fit")) ///
    ytitle("Share of Etn Applications" " ", size(small)) xtitle(" " "Mean Psychometric of Applicants" " ", size(small)) title("Share of Etn Applications and Mean Psychometric" "For Each Major") ///
    subtitle("First choice, age <= 30")


    [Image: e.PNG]


    Thanks, Ami

  • #2
    Originally posted by Ami pe:
    Therefore I want to set the labels to be above all the markers (and not only above some of them like can be seen in the example picture below).
    What do you mean by "above all the markers"? Please explain in more detail what exactly you would like to do.

    It would also be helpful if you could post an excerpt from your data, preferably with dataex. Please read section 12 in the FAQ, "What should I say about the commands and data I use?"



    • #3
      Just as a side note, after Friedrich's helpful advice: it seems that what is presented "above" the x-axis is just the label of one of the values. I gather it has something to do with -mlabel-, but that is guesswork, since you unfortunately didn't share the full picture.
      Best regards,

      Marcos



      • #4
        I'd note further that if "Electrical engineering" is a typical marker label and you have dozens of data points, some text will predictably occlude something else whatever you do.

        The code you cite uses a variable as argument to mlabvpos(). It is hard to see that putting all labels above markers could improve the problem overall.



        • #5
          hi,

          I'm sorry.
          This is the full code:

          use "C:\....dta", clear

          collapse (mean) Etn psychometric, by(majors_united_appl)

          gen labels = majors_united_appl if inlist(majors_united_appl, 43, 66, 61, 71, 27, 19, 62, 30, 68, 70, 26, 32)
          replace labels = 0 if labels == .

          label define majors_united 0 " ", modify

          label values labels majors_united


          generate pos = 3

          replace pos = 9 if labels == 30
          replace pos = 6 if labels==32 | labels==70 | labels==71
          replace pos = 12 if labels==26


          graph twoway (scatter Etn psychometric, mlabel(labels) mlabv(pos) mlabcolor(grey)) (lfit Etn psychometric), graphregion(color(gs16)) legend(size(small) label(1 "Majors") label(2 "Linear fit")) ///
          ytitle("Share of Etn Applications" " ", size(small)) xtitle(" " "Mean Psychometric of Applicants" " ", size(small)) title("Share of Etn Applications and Mean Psychometric" "For Each Major") ///
          subtitle("First choice, age <= 30")






          And this is the full plot:

          [Image: Graph.png]



          As you can see, some of the markers overlay the labels of other markers. I want the labels to overlay the markers wherever they meet, so that they can be read. How can this be done?

          Thank you



          • #6
            In this graph, it is not clear to which marker certain labels belong. Where are the markers for occupational therapy, computer science, and electrical engineering? Placing the words "Electrical Engineering" somewhere where they don't overlap with markers doesn't solve this problem. You could consider leader lines or drawing labeled markers with a different color or shape.
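
One way to read the second suggestion (drawing the labeled markers with a different color or shape) is sketched below. This is only an illustration under stated assumptions: the poster's data are not available, so Stata's shipped auto.dta stands in for them, and the variable `tolabel` and the rule `price > 10000` are hypothetical choices made for the example.

```stata
* Stand-in data: auto.dta ships with Stata and replaces the poster's file.
sysuse auto, clear

* Hypothetical rule for which points receive a label: the most expensive cars.
generate byte tolabel = price > 10000

* Unlabeled points: hollow gray circles. Labeled points: navy diamonds,
* with the label text in the same color as its marker.
graph twoway ///
    (scatter price mpg if !tolabel, msymbol(Oh) mcolor(gs10)) ///
    (scatter price mpg if tolabel, msymbol(D) mcolor(navy) ///
        mlabel(make) mlabcolor(navy)) ///
    (lfit price mpg), ///
    legend(order(1 "Unlabeled" 2 "Labeled" 3 "Linear fit"))
```

Because each label shares a color and shape with exactly one group of markers, a reader can at least tell which markers carry labels, even where text and symbols sit close together.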

            Please use CODE delimiters next time, it makes Stata commands easier to read. The use of CODE delimiters is explained in the FAQ.



            • #7
              I think I understand your request. You want the labels to cover the markers and not the other way around. To do that, you have to draw the labels last and not first. Instead of this:
              Code:
              graph twoway (scatter Etn psychometric, mlabel(labels) mlabv(pos) mlabcolor(grey)) ///
              (lfit Etn psychometric)
              try something similar to this:
              Code:
              graph twoway (scatter Etn psychometric) (lfit Etn psychometric) ///
              (scatter Etn psychometric, msymbol(none) mlabel(labels) mlabv(pos) mlabcolor(grey))
              You still have to consider how readers are supposed to know which label is associated with which marker.



              • #8
                Thank you very much.

                Do you think this is better?
                [Image: Graph.png]



                • #9
                  The revised graph looks better. The labels are no longer ambiguous.



                  • #10
                    The linear summary isn't very convincing. It may be irrelevant to your purposes, but I would consider plotting the response on a transformed scale (logarithm if all values are positive, otherwise square root or cube root).
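
A minimal sketch of plotting the response on a transformed scale follows. It assumes Stata's shipped auto.dta as stand-in data (with `price` playing the role of the response); the variable names `ln_price` and `cr_price` are hypothetical. Transforming the variable explicitly, rather than only relabeling the axis, means that lfit is fitted in the transformed units.

```stata
* Stand-in data: auto.dta replaces the poster's file.
sysuse auto, clear

* Logarithm: appropriate only if all values are positive.
generate double ln_price = ln(price)

* Cube root: also defined for zero and negative values.
generate double cr_price = sign(price) * abs(price)^(1/3)

* The linear fit is now linear on the log scale.
graph twoway (scatter ln_price mpg) (lfit ln_price mpg), ///
    ytitle("log(price)")
```

Swapping `ln_price` for `cr_price` in the graph command gives the cube-root version.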



                    • #11
                      Is there a way to add "leader lines" to a graph like this? I'm imagining these are lines from the labels to the dots. Friedrich Huebler suggested them earlier but didn't give any hints as to how to implement them.
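
One possible approach, offered as a sketch rather than as the method Friedrich Huebler had in mind: store a separate anchor coordinate for each label, connect marker and anchor with twoway pcspike, and draw the label text at the anchor using an invisible marker. The example assumes Stata's shipped auto.dta as stand-in data; `tolabel`, `lab_x`, `lab_y`, and the offsets of 3 and 1500 are hypothetical and would need tuning by hand for real data.

```stata
* Stand-in data: auto.dta replaces the poster's file.
sysuse auto, clear

* Hypothetical rule for which points receive a label.
generate byte tolabel = price > 12000

* Hypothetical label anchors: shifted right and up from each marker.
generate lab_x = mpg + 3 if tolabel
generate lab_y = price + 1500 if tolabel

* pcspike takes y1 x1 y2 x2 and draws a segment from marker to anchor;
* the last scatter draws only the label text at the anchor.
graph twoway ///
    (scatter price mpg, mcolor(gs8)) ///
    (pcspike price mpg lab_y lab_x if tolabel, lcolor(gs6)) ///
    (scatter lab_y lab_x if tolabel, msymbol(none) ///
        mlabel(make) mlabposition(3)), ///
    legend(off)
```

Individual anchors can be moved with further `replace lab_x = ...` and `replace lab_y = ...` statements until the labels no longer collide.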



                      • #12
                        Somewhat related to this: in my Stata/SE 15.0, the jitter() option is ignored when mlabel() is specified. This can result in overlapping points and labels.
                        In the following example, I am trying to emphasize, and apply labels only to, the salient points of var3 (i.e., those meeting [some other conditions]). The non-salient points are jittered and readable; the salient ones line up on the categorical values and, more importantly, have overlapping labels as a consequence. The overuse of jitter mentioned in https://www.statalist.org/forums/for...6-scatter-plot does not concern this particular graph, because the x variable (t) is categorical time, so whether a data point is plotted at, say, week 12, 11.5, or 13 does not affect the interpretation.
                        Is there a way to circumvent this?
                        I considered the egenmore function discussed in https://www.statalist.org/forums/for...label-disperse
                        but the labels I want come from another explanatory variable, not either of the two plotted in twoway.

                        foreach i of local id {
                            twoway (line var1 t if ID=="`i'", lcolor(navy)) ///
                                (line var2 t if ID=="`i'", lcolor(maroon)) ///
                                (scatter var3 t if ID=="`i'" & [some conditions], jitter(5) msize(tiny) mcolor(dkgreen%20) yaxis(2)) ///
                                (scatter var3 t if ID=="`i'" & [some other conditions], jitter(3) msize(tiny) mcolor(dkgreen) mlabel(var4) mlabsize(vsmall) yaxis(2)), ///
                                graph options
                        }

                        Thank you very much for any suggestions
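
A possible workaround, sketched under assumptions: jitter the categorical x variable by hand before plotting, so that the displacement lives in the data and no longer depends on the jitter() option. The example uses Stata's shipped auto.dta as stand-in data, with `rep78` playing the role of the categorical time variable t; `rep78_j`, the half-width of 0.25, and the rule `price > 10000` for "salient" points are hypothetical.

```stata
* Stand-in data: auto.dta replaces the poster's file.
sysuse auto, clear

* Fix the seed so the jittered positions are reproducible.
set seed 12345

* Manual jitter: shift each point by a uniform draw in (-0.25, 0.25).
generate double rep78_j = rep78 + runiform(-0.25, 0.25)

* Both layers use the jittered x, so labeled points are displaced too.
graph twoway ///
    (scatter price rep78_j if price <= 10000, ///
        msize(tiny) mcolor(dkgreen%20)) ///
    (scatter price rep78_j if price > 10000 & !missing(rep78), ///
        msize(tiny) mcolor(dkgreen) mlabel(make) mlabsize(vsmall)), ///
    xtitle("Repair record 1978 (manually jittered)") legend(off)
```

The label variable can be any variable in the dataset, so this does not require the labels to come from one of the two plotted variables.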

