
  • Two way bubble plots using weights


    Hi All,

    This was posted a while back without a helpful solution, so I wanted to repost it to the list to see if we could figure out a workaround.

    -scatter- supports weights. When a scatterplot is drawn, all weighted markers have a distinct size. This size changes when the data are divided into two or more groups and then drawn as overlaid scatterplots. I am looking for a way to retain the size of the original markers. The problem is best understood with an example.

    clear all
    input x y weight group
    1 1 1 1
    2 1 10 1
    1 2 100 2
    2 2 1000 2
    end
    scatter y x [w=weight], name(A)
    twoway (scatter y x if group==1 [w=weight]) (scatter y x if group==2 [w=weight]), name(B)

    Compare graphs A and B. In graph A all four markers have a different size. In graph B there are two pairs of markers with the same size. I would like to have the same marker sizes in graph B as in graph A while keeping the colors that identify the two different groups. Is there a reasonable solution?

  • #2
    It may not be quite what you were hoping for, but you could do something like this relatively painlessly with D3js. If you have the Mata HTML and D3 Mata libraries there is an example function d3scatterTip.mata that can do this with a slight modification. On line 260 of the function where you see the text ".attr("r", 5)" you'd want to do something like:

    Code:
    .attr("r", "obj_function(d) { return d[groupvariable]; }")
    This sets the radius of the circle elements based on the value of the callback which would return a value on the variable groupvariable for each observation. The example already implements coloring points by different values of a third variable, so it seems like it would do what you would want and would have the added bonus of alpha layer transparency in case the points overlap too much.



    • #3
      Please use CODE tags so that commands can be copied more easily.
      Code:
      clear all
      input x y weight group
      1 1 1 1 
      2 1 10 1 
      1 2 100 2 
      2 2 1000 2 
      end 
      scatter y x [w=weight], name(A) 
      twoway (scatter y x if group==1 [w=weight]) (scatter y x if group==2 [w=weight]), name(B)
      It looks like you copied the example from a Statalist post from 2008 at http://www.stata.com/statalist/archi.../msg00987.html. That post contained the solution to the problem you described. Quote:

      ...here is a solution to my problem that was provided by Stata tech support. One has to duplicate the observations and recode the group variable in the duplicates so that the weights are distributed across all groups.
      Why didn't you find this solution helpful?
      Code:
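      clear all
      input x y weight group
      1 1 1 1
      2 1 2 1
      1 2 4 2
      2 2 8 2
      end
      * A: a single plot, so all four weighted markers get distinct sizes
      scatter y x [w=weight], name(A)
      * B: overlaid plots by group, so the marker sizes no longer match A
      twoway (scatter y x if group==1 [w=weight]) (scatter y x if group==2 [w=weight]), name(B) legend(off)
      * Duplicate every observation. The copies get a missing x (so they are not drawn)
      * and the other group's code, so each plot's sample carries all four weights.
      expand 2
      replace x = . if _n>(_N/2)
      recode group (1=2) (2=1) if x==.
      * C: overlaid plots by group again, now with the weights shared across groups
      twoway (scatter y x if group==1 [w=weight]) (scatter y x if group==2 [w=weight]), name(C) legend(off)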
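
      The duplication trick above is written for exactly two groups. A minimal sketch of how it might be extended to any number of groups follows. It is not part of the tech support solution quoted above: the variables id and origgroup, the locals groups, G and plots, and the fifth data row (a made-up third group) are all assumptions introduced for illustration.

      ```stata
      clear all
      * Same data as above plus one made-up row for a hypothetical third group
      input x y weight group
      1 1 1 1
      2 1 2 1
      1 2 4 2
      2 2 8 2
      1 3 16 3
      end

      * Collect the distinct group codes and count them
      levelsof group, local(groups)
      local G : word count `groups'

      * Make one copy of every observation per group code
      gen long id = _n
      gen origgroup = group
      expand `G'
      bysort id: replace group = real(word("`groups'", _n))

      * Only the copy that sits in its original group keeps x and is drawn
      replace x = . if group != origgroup

      * Build one weighted scatter plot per group and overlay them
      local plots
      foreach g of local groups {
          local plots `plots' (scatter y x if group==`g' [w=weight])
      }
      twoway `plots', name(C, replace) legend(off)
      ```

      As in the two-group version, every plot's sample then contains the full set of weights, with the extra copies hidden by their missing x. The sketch assumes that group is numeric.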



      • #4
        Hi Friedrich Huebler, thanks for providing a solution. Following up, my question is how to preserve the bubble size when adding marker labels. For example, adding the string variable "lab" in your example data,

        Code:
        clear all
        input x y weight group str5 lab
        1 1 1 1 gr001
        2 1 2 1 gr001
        1 2 4 2 gr002
        2 2 8 2 gr002
        end
        
        scatter y x [w=weight], name(A) ti(Graph A)
        scatter y x [w=weight], name(D) mlab(lab) ti(Graph D)
        The bubble size that we obtain in graph A is lost in graph D when adding the marker labels. Any hint is appreciated. Thank you.

        [Attached images: A.png, D.png]


        • #5
          Better to have hollow weighted markers. You need two scatter calls.

          Code:
          clear all
          input x y weight group str5 lab
          1 1 1 1 gr001
          2 1 2 1 gr001
          1 2 4 2 gr002
          2 2 8 2 gr002
          end
          
          scatter y x [w=weight], ms(oh) name(A, replace) ti(Graph A) || ///
          scatter y x, msy(none) mlab(lab) mlabgap(2) leg(off) xsc(r(1 2.1))
          [Attached image: A.png]



          A more efficient approach to resize the markers allowing for between-group comparisons is discussed in https://journals.sagepub.com/doi/10....36867X20931008.
          Last edited by Andrew Musau; 30 Apr 2024, 16:15.



          • #6
            Thank you very much, Andrew Musau! I just got some very good insights from your Stata Tip 136 for improving the hollow weighted markers in my actual graph. I'll probably go over some ideas in a separate post. Thanks!
