Help with Bar graph of three categorical variables and rates

Alex Anders

Join Date: Mar 2019
Posts: 6

21 Mar 2019, 08:50

Dear Stata Community,

I am working with a dataset that has three categorical variables: whether the product is defective, the type of product, and the factory in which the product was produced. I want to draw two graphs:

i) The percentage defective rate for each product type and factory. For this chart I'll have a total of eight groups and their corresponding defective rate.
ii) The number of defective pieces per 1000 for each product type and factory, based on the percentage defective rate by product and factory. Here again, I'll have a total of eight groups.

Would appreciate any help on the above.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(defective_ind product factory)
0   4 1
0   1 1
0   3 1
0   2 1
0   2 1
1   1 2
0   2 2
0   1 2
0   3 1
0   3 1
0   2 2
0   4 1
0   3 1
0   1 2
0   2 1
0   3 2
0   2 1
0   2 1
1   3 1
0   2 1
0   2 2
0   3 1
0   2 2
1   2 2
0   2 1
1   4 1
0   3 1
0   1 2
1   1 2
0   2 2
0   2 2
0   3 1
0   3 1
1   2 2
0   1 2
0   2 1
0   1 1
0   4 1
1   2 1
0   3 1
0   2 1
0   2 2
0   2 1
1   3 1
0   3 1
0   3 1
0   2 2
0   3 1
1   3 1
0   4 1
1   3 1
0   4 1
0   2 1
0   2 2
1   3 2
0   2 2
0   4 1
0   3 1
1   2 1
0   2 1
0   2 1
0   2 1
0   2 2
1   2 1
0   3 2
0   3 1
0   2 1
0   4 2
0   3 1
1   2 1
0   2 1
0   2 2
1   2 2
0   1 1
0   3 1
1   2 2
0   2 1
0   3 1
0   2 2
1   3 1
0   2 1
0   3 1
0   1 1
0   2 1
1   3 2
0   3 1
0   2 1
1   2 1
0   2 2
0   1 2
0   3 1
0   2 1
0   3 2
0   2 2
0   2 1
1   2 2
0   3 1
0   2 2
1   2 1
end
label values defective_ind defec
label def defec 0 "Not Defective", modify
label def defec 1 "Defective", modify
label values product prod
label def prod 1 "Product 1", modify
label def prod 2 "Product 2", modify
label def prod 3 "Product 3", modify
label def prod 4 "Product 4", modify
label values factory fact
label def fact 1 "Factory 1", modify
label def fact 2 "Factory 2", modify

Best Regards

Alex

Andrew Musau

Join Date: Oct 2014
Posts: 10291

21 Mar 2019, 09:57

Thanks for the data example. For the first:

Code:

graph hbar (percent) defective_ind, over(factory) over(product) ytitle("Percent") ///
title("Defective products by factory") scheme(s1color) bar(1, color(none)) blabel(bar, format(%9.2f))

I do not get what you need for the second.

[Attached image: Graph.png]

Alex Anders

Join Date: Mar 2019

Posts: 6
#3

21 Mar 2019, 11:26

Hi Andrew,

Thanks a lot for responding to my first question.

For the second part, I meant: given the defect rates I am seeing, how many defective parts can I expect in a population of 1000 pieces? For example, if the chart shows a defect rate of ~4% for Product 1 produced in Factory 1, then on a base of 1000 pieces I would expect to see 40 defective pieces on average. Similarly, for Product 1 in Factory 2 I would expect to see 70 defective pieces. I want to create a chart showing how many pieces per 1000 I can expect to be defective on average.
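
A minimal sketch of that arithmetic in Stata, taking the two rates (4% and 7%) as illustrative values read off the chart rather than computed results. The local macro names are hypothetical:

```stata
* Illustrative rates in percent (assumed values, read off the chart)
local rate_f1 = 4
local rate_f2 = 7

* Expected defective pieces per 1000 = (rate / 100) * 1000 = rate * 10
local per1000_f1 = `rate_f1' * 10
local per1000_f2 = `rate_f2' * 10

display "Factory 1: `per1000_f1' expected defective pieces per 1000"
display "Factory 2: `per1000_f2' expected defective pieces per 1000"
```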

I checked the first graph against the sample data, and it shows the distribution of the population across the eight groups rather than the rate of defective products within each group, which is what I was looking for. Do you know how I could modify the chart to show that?

For example, there are 99 observations in total, and four of them fall in the Product 1, Factory 1 subgroup, so that group holds 4.04% of the population. However, there are zero defective products in this group, so I should have seen a defect rate of 0% for it.
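
The difference between a group's share of all observations and the defect rate within that group can be checked directly. A minimal sketch on a small made-up dataset (not the data posted above) that reuses the same variable names; the local macro names are hypothetical:

```stata
* Toy data, made up for illustration only
clear
input byte(defective_ind product factory)
0 1 1
0 1 1
1 1 2
0 1 2
1 2 1
0 2 1
0 2 2
0 2 2
end

* Share of all observations that fall in the Product 1 / Factory 1 cell
count if product == 1 & factory == 1
local share = 100 * r(N) / _N

* Defect rate within that cell: mean of the 0/1 indicator, times 100
summarize defective_ind if product == 1 & factory == 1, meanonly
local rate = 100 * r(mean)

display "Share of total: `share'%   Defect rate within cell: `rate'%"
```

In this toy example the cell holds 25% of the observations but has a defect rate of 0%, which is the same kind of mismatch as the 4.04% versus 0% described above.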

Thanks

Alex

Last edited by Alex Anders; 21 Mar 2019, 11:41.

Andrew Musau

Join Date: Oct 2014
Posts: 10291

21 Mar 2019, 12:13

Yes, sorry... #2 gives you the percentage of total. The following will give you the percentage defective rate per product-factory.

Code:

* Number of defective items in each product-factory cell
bys product factory: egen defective= total(defective_ind)
* Convert the count to a percentage of the cell size (_N is the by-group size here)
bys product factory: replace defective= (defective/_N)*100
* Expected defective pieces per 1000 = percentage * 10
gen def1000= defective*10

*GRAPH 1
graph hbar defective, over(factory) over(product) ytitle("Percent") title("Defective products by factory") ///
scheme(s1color) bar(1, color(none)) blabel(bar, format(%9.2f)) ylab(0(10)45)

*GRAPH 2
graph hbar def1000, over(factory) over(product) ytitle("Number of failures per 1000") title("Defective products by factory") ///
scheme(s1color) bar(1, color(none)) blabel(bar, format(%9.0f)) ylab(0(100)450)
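
As an alternative sketch (not the approach posted above): because -graph hbar- plots the mean of its variable by default, rescaling the 0/1 indicator to 0/100 or 0/1000 should give the same bars without the -egen- step. The toy data and the names pct_defective, per1000, g_pct and g_1000 are made up for illustration:

```stata
* Toy data, made up for illustration only (not the thread's dataset)
clear
input byte(defective_ind product factory)
0 1 1
0 1 1
1 1 2
0 1 2
1 2 1
0 2 1
0 2 2
0 2 2
end

* The mean of a 0/100 variable within a cell is the percentage defective;
* the mean of a 0/1000 variable is the expected defects per 1000
generate pct_defective = 100 * defective_ind
generate per1000 = 1000 * defective_ind

* Each bar is the within-cell mean, i.e. the defect rate for that cell
graph hbar (mean) pct_defective, over(factory) over(product) ///
    ytitle("Percent defective") blabel(bar, format(%9.2f)) name(g_pct, replace)

graph hbar (mean) per1000, over(factory) over(product) ///
    ytitle("Defective pieces per 1000") blabel(bar, format(%9.0f)) name(g_1000, replace)
```

The name() option only keeps both graphs in memory at once; it can be dropped if one graph at a time is enough.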

[Attached images: Graph.png, Graph1.png]

Alex Anders

Join Date: Mar 2019

Posts: 6
#5

21 Mar 2019, 12:31

Thanks a lot Andrew, this is exactly what I wanted.

Best Regards

Alex

Last edited by Alex Anders; 21 Mar 2019, 12:37.