  • Missing lines in graph dot, over() nofill

    When graph dot is used in combination with over() and nofill options, no line is drawn when one or more (but not all) variables have missing values for one of the over() categories. As an example, here is a graph created with the auto data in which the variables mpg and turn are either jointly present or missing for all values of rep78.
    Code:
    sysuse auto, clear
    replace mpg = . if rep78==2
    replace turn = . if rep78==2
    graph dot mpg turn, over(rep78) nofill
    [Image: g1.png]

    If turn is missing for all cases where rep78 = 4, no line is drawn for rep78 = 4.
    Code:
    replace turn = . if rep78==4
    graph dot mpg turn, over(rep78) nofill
    [Image: g2.png]

    The only solution I found is to assign some value to observations with missing values so that the marker and line are drawn; subsequently I have to remove the marker with the Graph Editor. In the example below, the value 0 is assigned to all missing values of turn where rep78 = 4.
    Code:
    replace turn = 0 if rep78==4
    graph dot mpg turn, over(rep78) nofill
    [Image: g3.png]

    Notice the marker at turn = 0 and rep78 = 4. This marker can now be removed with the Graph Editor or alternatively with the undocumented command gr_edit.
    Code:
    gr_edit .plotregion1.points[3].Delete
    [Image: g4.png]

    Is there a solution to this problem that does not involve the Graph Editor?

  • #2
    Friedrich: Your problem highlights that Stata does the plots sequentially, so -nofill- omits empty categories for the last variable plotted. The solution is to plot the variable with no empty categories last. Compare


    Code:
    replace turn = . if rep78==4
    graph dot mpg turn, over(rep78) nofill
    and

    Code:
    replace turn = . if rep78==4
    graph dot turn mpg, over(rep78) nofill
    Last edited by Andrew Musau; 29 May 2017, 13:35.


    • #3
      Originally posted by Andrew Musau
      Your problem highlights that Stata does the plots sequentially, so -nofill- omits empty categories for the last variable plotted. The solution is to plot the variable with no empty categories last.
      Thanks, that is an interesting observation. Unfortunately, it doesn't help with my real data, which has missing observations in all variables. This means there is no variable without missing data that I could plot last.


      • #4
        What about just omitting -nofill-? Do you have an example with several variables and missing categories to illustrate the problem? Maybe someone can suggest a workaround.


        • #5
          Below is another example. I need nofill because the graph is too crowded otherwise.
          Code:
          sysuse auto, clear
          replace mpg = . if rep78==2
          replace turn = . if rep78==2 | rep78==4
          graph dot mpg turn, over(rep78) over(foreign) nofill
          [Image: graph.png]


          • #6
            Code:
            sysuse auto, clear
            replace mpg = . if rep78==2
            replace turn = . if rep78==2 | rep78==4
            graph dot mpg turn, over(rep78) over(foreign) nofill
            With this example again, reversing the order does the trick because the missing values all belong to the variable turn.

            Code:
            graph dot turn mpg, over(rep78) over(foreign) nofill
            A more relevant example illustrating Friedrich's problem is the following for anyone who wants to attempt a solution. At the moment I cannot give time to it due to other commitments.

            Code:
            sysuse auto, clear
            replace mpg = . if rep78==2
            replace turn = . if rep78==2
            replace turn = . if rep78==4 & foreign==1
            replace mpg = . if rep78==4 & foreign==0
            graph dot mpg turn, over(rep78) over(foreign) nofill

            Last edited by Andrew Musau; 30 May 2017, 02:16.


            • #7
              Thank you for the additional example.

              Meanwhile I found that I reported this problem to Stata tech support in 2013 and was told that the nofill option appears to have a bug. If that is the case, the bug has not been fixed.


              • #8
                I agree, Friedrich, that this is a bug in nofill. Sequential plotting assumes that the plots are independent of one another, which is not the case if one specifies -over()-. You should re-report the problem to technical services. Meanwhile, you can consider the following workaround based on the example in #6:


                Code:
                sysuse auto, clear
                replace mpg = . if rep78==2
                replace turn = . if rep78==2
                replace turn = . if rep78==4 & foreign==1
                replace mpg = . if rep78==4 & foreign==0
                graph dot mpg turn, over(rep78) over(foreign) nofill
                
                *// Group all over categories
                egen group = group(foreign rep78), label
                *// Generate indicator for observations for which all variables are missing
                gen allmiss = cond(mpg==. & turn==., 1, 0)
                *// Preserve and drop these observations
                preserve
                drop if allmiss
                *// Plot without -nofill- option
                graph dot mpg turn, over(group)
                restore

                [Image: gr_dot.png]


                • #9
                  Thank you for the creative workaround. I work with national data and group countries by region, as in:
                  Code:
                  graph dot var1 var2 var3, over(country) over(region)
                  Labels that combine region and country names would not be practical. The over() option also adds gaps between the groups that make it easy to see which countries belong to which region (see the example in post #2).

                  I will report the problem to Stata tech support again. Until there is a fix for the nofill bug, I can create the graphs I need with the Graph Editor, as described in the first post in this thread.


                  • #10
                    To save time, you can take advantage of the fact that graph dot lets you specify the color of each marker and suppress part of the legend. Create a variable that is zero for every observation and place it last in the variable list. Then give its marker a color that matches the background and exclude it from the legend.

                    Code:
                    sysuse auto, clear
                    replace mpg = . if rep78==2
                    replace turn = . if rep78==2
                    replace turn = . if rep78==4 & foreign==1
                    replace mpg = . if rep78==4 & foreign==0
                    *// Generate indicator for observations for which all variables are missing
                    gen allmiss = cond(mpg==. & turn==., 1, 0)
                    *// Preserve and drop these observations
                    preserve
                    drop if allmiss
                    gen zero = 0
                    graph dot mpg turn zero, over(rep78) over(foreign) nofill marker(3, mcolor(white)) leg(on order(1 2))
                    restore

                    [Image: gr2_dot.png]


                    • #11
                      Thank you, that is an interesting and convenient solution. I would only change the color of the marker for the zero variable to "none". If you look closely at the graph in the last post you can see the white markers along the vertical axis.
                      Code:
                      graph dot mpg turn zero, over(rep78) over(foreign) nofill marker(3, mcolor(none)) leg(on order(1 2))
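
                      For anyone who wants the whole workaround in one place, here is a minimal sketch that combines the steps from posts #10 and #11 with the example data from #6. It assumes that egen's rownonmiss() is an acceptable substitute for the cond() indicator when there are more than two plotted variables; the variable names nonmiss and zero are only illustrative.

                      ```stata
                      sysuse auto, clear
                      replace mpg = . if rep78==2
                      replace turn = . if rep78==2
                      replace turn = . if rep78==4 & foreign==1
                      replace mpg = . if rep78==4 & foreign==0
                      * Count the nonmissing plotted variables in each observation
                      * (nonmiss is an illustrative name)
                      egen nonmiss = rownonmiss(mpg turn)
                      preserve
                      * Drop observations in which every plotted variable is missing
                      drop if nonmiss == 0
                      * Helper variable plotted last so that nofill keeps every remaining category
                      gen zero = 0
                      * Hide the helper marker and leave it out of the legend
                      graph dot mpg turn zero, over(rep78) over(foreign) nofill marker(3, mcolor(none)) legend(on order(1 2))
                      restore
                      ```

                      With more variables, only the list inside rownonmiss(), the variable list of graph dot, the marker number, and the legend order should need to change.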


                      • #12
                        Thanks. I was not aware that you can specify no color for a marker. This will certainly come in handy in the future!
