I am visualizing the distribution of quality reports across sites using bar plots. In this example, each factory reports the proportion of its product that is of 'bad', 'ok', and 'good' quality, as well as the total amount of product produced. I want to show how this distribution varies across factories and also show the variability in factory size. This was simple with uniform bar widths, but varying the bar widths by the total amount produced made it much more complicated. I have functional code, but I am wondering whether there is an easier or cleaner way to accomplish this. I'd appreciate any advice on how to clean it up.
Here is a simple example dataset:
Code:
clear
input id bad ok good n
1 .3 .5 .2 100
2 .2 .7 .1 200
3 .4 .1 .5 150
4 .1 .8 .1 400
5 .15 .55 .3 50
end
Here is the unweighted graph:
Code:
graph bar bad ok good, over(id, sort(bad) descending) stack ///
    legend(label(1 bad) label(2 ok) label(3 good)) ///
    ytitle("Proportion of Product") title("Product Quality by Factory")
Here is the variable bar width code. I plot three overlaid bar graphs, so I must first convert the category proportions into cumulative proportions. Then I create an x index based on the cumulative relative size of each factory. Why do I need to add an extra row of data with a final x position to keep the last group from being omitted?
Code:
* transform y variables into cumulative variables
egen c_bad = rowtotal(bad)
egen c_ok = rowtotal(c_bad ok)
egen c_good = rowtotal(c_ok good)
* sort the data by the focus variable
gsort -c_bad
* create width weights and a cumulative x-axis index
sum n
gen w = n/(`r(mean)')
capture drop x   // x does not exist yet on a fresh run, so a plain drop would error
gen x = 0 if _n == 1
replace x = w[_n-1] + x[_n-1] if _n != 1
* add in an extra row as the last x value doesn't get shown
set obs `=_N + 1'
replace x = x[_n-1] + w[_n-1] if _n == _N
graph twoway ///
    bar c_good x, bartype(spanning) || ///
    bar c_ok x, bartype(spanning) || ///
    bar c_bad x, bartype(spanning) ||, ///
    legend(label(3 bad) label(2 ok) label(1 good)) ///
    ytitle("Proportion of Product") title("Product Quality Distribution") ///
    xtitle("factory size weighted index")
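
One possible simplification of the index step, offered as a sketch rather than a definitive answer: Stata's running `sum()` function can build the left edge of each bar in one line, and the closing x value is then the previous left edge plus the previous width. The variable names follow the code above, except `x_alt`, which is a hypothetical name used here so that the original `x` is left untouched. As far as I understand `bartype(spanning)`, each bar is drawn from its own x value to the next observation's x value, which would explain why the last factory needs one more row to mark its right edge; that reading is an assumption worth checking against the `twoway bar` help file.

```stata
* self-contained sketch: rebuild the example data, then the x index
clear
input id bad ok good n
1 .3 .5 .2 100
2 .2 .7 .1 200
3 .4 .1 .5 150
4 .1 .8 .1 400
5 .15 .55 .3 50
end

* sort by the focus variable, largest share of 'bad' first
gsort -bad

* width of each bar relative to the mean factory size
summarize n, meanonly
generate double w = n / r(mean)

* running sum of the widths minus the bar's own width = left edge of the bar
generate double x_alt = sum(w) - w

* one extra row that holds the right edge of the last bar
set obs `=_N + 1'
replace x_alt = x_alt[_n-1] + w[_n-1] in L

list id n w x_alt, noobs
```

With the example data the mean of `n` is 180, so the factories sort as 3, 1, 2, 5, 4 with left edges of about 0, 0.833, 1.389, 2.5, and 2.778, and the extra row closes the axis at 5. The cumulative y variables and the `graph twoway` call would stay as they are, with `x_alt` in place of `x`.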