  • stripplot specifications

    Dear colleagues

    I have a variable named TotalCompensation1 and another called Affiliationindex1, from which I created AffliationIndex1q using xtile to split it into four quartile groups. I want to plot the trend between the two using the command stripplot.

    I ran the following code:

    Code:
    stripplot TotalCompensation1, over(AffliationIndex1q)  cumul cumprob box centre vertical refline  yla( , ang(h)) xla(, noticks) xsc(titlegap(*5))
    and generated this:


    [Figure: trend.png]

    Although the trend is somewhat clear (if one looks at the means, shown as grey lines), I wonder if any of you have an idea about changing the options of stripplot in order to zoom in and show the trend clearly, perhaps by limiting the vertical axis so that the maximum is 50k. I tried something like this:

    Code:
    stripplot TotalCompensation1, over(AffliationIndex1q)  cumul cumprob box centre vertical refline  yla(0 10000 20000 30000 40000 50000 , ang(h)) xla(, noticks) xsc(titlegap(*5))
    but it did not work. I also tried the ceiling and floor options, but I suspect I did not understand how they work.

    Any ideas?

    Thanks

  • #2
    In general, Stata graphics commands will plot all the points in the data, adjusting or ignoring options like ylabel in order to do so.

    Something like
    Code:
    stripplot TotalCompensation1 if TotalCompensation1<=50000, over(AffliationIndex1q) ...
    may accomplish what you want.



    • #3
      stripplot is from SSC.

      If you omit observations, the summary statistics shown will all be for the subset chosen.

      It seems to me that there is a much more natural choice: use a logarithmic scale. Show means too if you like, but geometric means seem to me a more appropriate summary.

      ceiling and floor options tune the details of binning and have nothing to do with which observations are used.
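
      A minimal sketch of the summaries discussed above, using only official Stata commands and the variable names from #1. It assumes TotalCompensation1 is strictly positive, since logarithms and geometric means are undefined otherwise. The names logTC, meanlogTC and gmeanTC are made up for illustration.

```stata
* Arithmetic, geometric and harmonic means within each quartile group
bysort AffliationIndex1q: ameans TotalCompensation1

* The same geometric means by hand: exponentiate the mean of the logs
generate double logTC = ln(TotalCompensation1)
egen double meanlogTC = mean(logTC), by(AffliationIndex1q)
generate double gmeanTC = exp(meanlogTC)

* Restricting with -if- changes every summary, not just the range shown
tabstat TotalCompensation1, by(AffliationIndex1q) statistics(n mean median)
tabstat TotalCompensation1 if TotalCompensation1 <= 50000, by(AffliationIndex1q) statistics(n mean median)
```

      Comparing the two tabstat tables shows how far the means and medians shift once observations above 50000 are dropped.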



      • #4
        Thank you William and Nick.

        Based on Nick's comment, I ran the following code:

        Code:
         
        egen gmean = gmean(TotalCompensation1), by(AffliationIndex1q) 
        
        stripplot logTotalCompensation1, over(AffliationIndex1q)  cumul cumprob box centre vertical refline reflevel(gmean) yla( , ang(h)) xla(, noticks) xsc(titlegap(*5))
        Does the gray line now represent the geometric mean, and the black line (inside the box) the mean? It is my understanding that the line inside the box is the median, not the mean.

        [Figure: trend.png]



        • #5
          Correct. I would use ysc(log) myself. You now need to think about the outlier(s) revealed.
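
          A sketch of that suggestion, untested here: keep the original variable and ask for a logarithmic axis through ysc(log), rather than plotting a logged variable. It reuses the options and the gmean variable from #4. The tick values in yla() are placeholders to adapt to the data, and it assumes that reflevel() picks up gmean as it did in #4 and that all values are positive.

```stata
* stripplot is user-written: ssc install stripplot
* The /// line continuations work in a do-file, not in the Command window
stripplot TotalCompensation1, over(AffliationIndex1q) cumul cumprob box centre vertical ///
    refline reflevel(gmean) ysc(log) yla(1000 10000 100000, ang(h)) ///
    xla(, noticks) xsc(titlegap(*5))
```

          With a log axis the labels stay in the original units, so the reference lines and the box can be read directly as compensation amounts.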



          • #6
            Thank you very much Nick
