  • Editing graph

    [Image: Plot.PNG]

    Dear Statalists,

    I have data that I want to plot, but two of the values are very close to each other, so their marker labels overlap and become unreadable. I have tried editing the graph by changing the label position to 9 o'clock, etc., but the labels are still unreadable. Any ideas are appreciated.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(shar_noimm imm_shar_dis) str8 share_2
    .7408003   0 ""        o
    .9431347 1.3 "Norte"  
    .6581098  16 " Algarve"
    .8896656 2.7 "Centro"  
    .7551286 7.9 "Lisboa"  
    .8753697 3.5 "Alentejo"
    .9461963 1.4 "Açores"
    .9048818 2.5 "Madeira"
    
    twoway (scatter shar_noimm imm_shar_dis, mlabel(share_2) mcolor(gs10) msize(small)) (lfit shar_noimm imm_shar_dis , lc(black))
    
    end
    Last edited by Paris Rira; 12 Jan 2023, 07:32.

  • #2
    You don't show the code you tried. I see the code to make the graph, but what about the code to change the positioning of the marker labels?

    EDIT: And since these values are so close, is it worth it to include the names, anyways? I don't think there are many ways to make it look pretty with this specification, but I'm open to being proven wrong.


    • #3
      Your code needs editing.

      1. The end statement is in the wrong place, so the code will fail.

      2. There is no district name for the first observation.

      3. There is a stray character in the first observation, which Stata ignores.

      Otherwise, after editing the code and adding a new variable, the problem can be lessened.

      I have a strong preference for putting all detail in a script. The Graph Editor can be essential for some tweaks, but I still want to minimise its use.


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(shar_noimm imm_shar_dis) str8 share_2 float pos
      .7408003   0 ""         3
      .9431347 1.3 "Norte"    9
      .6581098  16 "Algarve"  9
      .8896656 2.7 "Centro"   3
      .7551286 7.9 "Lisboa"   3
      .8753697 3.5 "Alentejo" 3
      .9461963 1.4 "Açores"  3
      .9048818 2.5 "Madeira"  3
      end
      
      set scheme s1color
      
      twoway lfit shar_noimm imm_shar_dis, lc(black) ///
      || scatter shar_noimm imm_shar_dis, ms(Oh) msize(medlarge) ///
          mlabel(share_2) mcolor(gs10) msize(small) mlabvpos(pos) ///
          ytitle(Better title needed here) yla(, ang(h)) ///
          xtitle(and here too) legend(off)

      [Image: portuguese.png]



      EDIT: Chemists were way ahead of most fields in agreeing early on one- or two-letter abbreviations for elements, such as H or Zn, that people could learn. (I still remember many from school chemistry decades ago.) Two- or three-letter abbreviations are also widely used for countries and areas within countries. I looked quickly for such abbreviations for areas of Portugal but was probably looking in the wrong place.

      Last edited by Nick Cox; 12 Jan 2023, 08:43.


      • #4
        Originally posted by Jared Greathouse:
        You don't show the code you tried. I see the code to make the graph, but what about the code to change the positioning of the marker labels?
        Thank you, Jared, for getting back to me.
        Actually, I have not used any syntax for the editing. I just right-click --> Start Graph Editor --> Marker label properties (the easy way).


        • #5
          Nick Cox -

          I just want to emphasize a point in your post that might be overlooked by those like me who beat on the Graph Editor occasionally.

          When I used Graph Editor to approach this problem, I learned from the dialog box about mlabpos() which takes a constant argument to specify the position of the marker label. Because I didn't carefully read the fine material in the documentation subsequently, I didn't learn about mlabvpos() which takes a variable name argument and uses the value for each observation to provide the label position for that observation.
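
          To make the distinction concrete, here is a minimal sketch. It assumes the data from #3, including the variable pos, are in memory; the choice of 9 o'clock in the first command is only illustrative.

          ```stata
          * mlabposition() takes one clock position and applies it to every label
          twoway scatter shar_noimm imm_shar_dis, mlabel(share_2) mlabposition(9)

          * mlabvposition() takes a variable holding a clock position for each observation
          twoway scatter shar_noimm imm_shar_dis, mlabel(share_2) mlabvposition(pos)
          ```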

          This is a great example of an elegant solution to the problem posed by labeling points in close juxtaposition. I'm glad you got this up before I had a chance to post my embarrassing hack. I already embarrassed myself yesterday, no need to try for a streak.


          • #6
            Originally posted by Nick Cox:
            Your code needs editing.


            2. There is no district name for the first observation.
            Hi Nick. Thank you for the reply.
            In the data there are a few values that do not belong to any region. I am still undecided whether to drop them before plotting or to label them 'Unknown' in the graph. Either way, their absence or presence does not cause any trouble.
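
            For reference, the two choices could be sketched as follows. This is only an illustration using the variable names from #1; whether the unnamed observations should also be excluded from the fitted line is a separate decision that this sketch does not settle.

            ```stata
            * Option 1: leave the unnamed observations out of the plot (and, here, out of the fit)
            twoway (scatter shar_noimm imm_shar_dis if share_2 != "", mlabel(share_2)) ///
                   (lfit shar_noimm imm_shar_dis if share_2 != "")

            * Option 2: keep them and label them explicitly
            replace share_2 = "Unknown" if share_2 == ""
            ```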

            Your code has 'pos'. What is that?


            • #7
              Your code has 'pos', what's that?
              It is the name of the variable added to the data to specify, for each observation, the position of the marker label. It is the argument to the mlabvpos() option. Because the two overlapping labels are placed at different positions, they no longer overlap.
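
              If you would rather not type the positions into the data by hand, the variable can also be built in code. A minimal sketch, assuming the region names from #3 and the same choice of which labels move to 9 o'clock:

              ```stata
              * Start from a default of 3 o'clock, then move selected labels to 9 o'clock
              capture drop pos
              generate byte pos = 3
              replace pos = 9 if inlist(share_2, "Norte", "Algarve")

              twoway scatter shar_noimm imm_shar_dis, mlabel(share_2) mlabvposition(pos)
              ```

              The values are clock positions, so any other hour from 1 to 12 can be used in the same way.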


              • #8

                Ah, that's perfect. I appreciate your precise explanation, dear William.


                • #9
                  William Lisowski We all embarrass ourselves quite often. One would hope that people would be judging us by some kind of batting average but the evidence on social media often runs otherwise.

                  We presumably share the criticism of one academic not known to have helped anyone ever in any related forum:

                  Stata forums are filled with people who seem to hoard their expertise.


                  • #10
                    Nick Cox's code is great. And I second his remarks about country/region/city abbreviations. An even cleaner chart can be derived with the following

                    Code:
                    twoway ///
                        lfit shar_noimm imm_shar_dis , lc(black) ///
                        || scatter shar_noimm imm_shar_dis ///
                        , ms(i) mlabel(share_2)  mlabvposition(pos)  ///
                            ytitle(Better title needed here) yla(, ang(h)) ///
                            xtitle(and here too) legend(off)

                    Note the invisible marker symbol trick: ms(i) suppresses the markers, so only the labels are drawn.


                    • #11
                      See also https://www.stata-journal.com/articl...article=gr0023 if more examples are of interest. The last figure there uses single letters such as m for marine and f for forest for some ecological data. The example is helped because there is some simple clustering in the data. If not we may need other devices.

                      There can be a clash between clarity and convention. Thus it is standard that lower case letters are more variable visually than upper case letters. But anyone who used that as an argument for showing oh wy and ut rather than OH WY and UT in a plot of US states would probably be regarded as puzzling if not perverse.
