At this point, I think I've googled this enough to know that there is no answer aside from doing something with -stripplot- (user written by Nick Cox) or laying multiple rbars / rspikes / rcaps over each other (written up by Nick Cox in a Stata Journal article on customizing boxplots), but I'm going to send this out into the world anyway and maybe someone will hear my cry for help. Or maybe simply say "I hear your pain."
graph box has a fatal flaw--the 1.5x IQR whiskers. It's canonical, but it's also not intuitive, and I am not sure anyone uses it as a measure of variation, etc. It would make way more sense to replace it with the 10th and 90th percentiles, or even just to hide the whiskers entirely. I don't pretend to be a programmer, but how hard could it be to have that as an option?
I spent a full day doing nothing but trying to make a box plot with four variables across four different conditions. I failed. I can get most of the way with:
Code:
graph box var1 var2 var3 var4, over(condition, sort(num_condition)) nooutsides
If it wasn't for the IQR x1.5, this gives me exactly what I want. Maybe I want to add the mean on top of that, but that's trivial.
Instead, here are the options I can find if I want to escape Tukey's deranged decision of an IQR x1.5, all of which I think were written by Nick Cox in some form or another:
- Use -stripplot-, hide the data points, swap in the percentiles, mess around with the margins or something so that it looks good. I've done this in the past, and it looks great, but there is one problem: you can't do multiple variables combined with over(). (Ref: https://www.statalist.org/forums/for...olling-margins; see also https://www.statalist.org/forums/for...updated-on-ssc)
- Stack a zillion different plots on top of each other, hoping you don't make an error (Ref: https://journals.sagepub.com/doi/pdf...867X0900900309)
- Collapse the dataset into only three variables (p25, p50, p75). This doesn't give me the 10th and 90th percentiles, but it does at least hide the IQR x1.5 (Ref: https://journals.sagepub.com/doi/pdf...36867X19893643)
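For what it's worth, the "stack plots" and "collapse" ideas above can probably be combined into something fairly short, because collapse itself accepts (p10) and (p90). What follows is only a minimal sketch under assumptions, not a tested solution: it assumes the variables really are named var1 to var4, that num_condition is a numeric code running from 1 to 4, and that obs_id, which, and xpos are free to use as helper names (they are made up here). The axis labels are placeholders too.

```stata
* Sketch only: var1-var4 and num_condition are taken from the post above;
* obs_id, which, and xpos are hypothetical helper names.
preserve

* One row per observation per variable: var1..var4 become var, indexed by which
generate long obs_id = _n
reshape long var, i(obs_id) j(which)

* Percentiles and mean for each condition-by-variable cell
collapse (p10) p10=var (p25) p25=var (p50) p50=var ///
         (p75) p75=var (p90) p90=var (mean) mean=var, ///
         by(num_condition which)

* Horizontal position: four boxes per condition, with a gap between conditions
* (assumes num_condition takes the values 1, 2, 3, 4)
generate xpos = (num_condition - 1)*5 + which

* Whiskers at p10/p90, box at p25/p75, a zero-height bar as the median line,
* and a marker for the mean
twoway (rspike p10 p90 xpos)                              ///
       (rbar p25 p75 xpos, barwidth(0.8) fcolor(none))    ///
       (rbar p50 p50 xpos, barwidth(0.8))                 ///
       (scatter mean xpos, msymbol(D)),                   ///
       xlabel(2.5 "Condition 1" 7.5 "Condition 2"         ///
              12.5 "Condition 3" 17.5 "Condition 4")      ///
       xtitle("") ytitle("")                              ///
       legend(order(1 "p10 to p90" 2 "p25 to p75" 3 "Median" 4 "Mean"))

restore
```

This draws every variable in the same style. Telling var1 to var4 apart by color would mean one set of plots per value of which (for example with if which == 1), which is where the "zillion different plots" problem comes back.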
Thanks for listening to me rant,
Jonathan