Dear Stata-listers
This is my first time posting to this list, so I hope I'm able to do this "the right way".
What I would like to do is create a number of graphs using - graph hbar - and have these graphs show the number of respondents (N) for each variable. Some graphs contain up to 7 different variables, for which my current code calculates the mean values. I would like the graph to show N (respondents) for each variable on the left and the mean value of the variable on the right (outside the particular bar) (see the example graph provided below).
As the example graph illustrates, I am currently able to have my graphs show N when I use the user-written program - catplot - (written by Nick Cox and available from SSC; see - findit catplot - ), which plots categorical data as frequencies, fractions or percents in Stata. I hope the code further down in this post lets you generate the same graph using auto.dta, which ships with Stata. This code gives me what I need for graphs showing percentages for each value of a variable, separately for a number of groups, along with N for each group of respondents that answered the particular question. (My own data are student survey data, so the variables have Likert scales and the categories are typically master's and bachelor's students, or students attending different study programmes, such as nursing, engineering, law, etc.)
However, I'm not able to do this for mean values of the variables when using the command - graph hbar - .
(By the way, these particular graphs are sometimes drawn separately for different groups and sometimes not)
Could anyone please advise me on how I can create a graph showing mean values (using - graph hbar - ) instead of percentages, and have N included in the same manner as in the graphs I have made using - catplot - ? (See the example graph, which should be attached below.)
************************************************** **********************************
// An example using auto.dta and - catplot - :
//First: making variable(s) containing origin of cars
sysuse auto, clear
ta foreign
ta foreign, nol
set more off
foreach var of varlist rep78 { //My own loop contains several variables, but for simplicity there's only 1 variable now
ge N3`var' = 0 if foreign==0 & `var' <=5 //Domestic
replace N3`var' = 1 if foreign==1 & `var' <=5 //Foreign
}
capture la drop origin2
la def origin2 0"Domestic" 1"Foreign"
la val N3rep78 origin2
label list origin2
capture la drop N3rep78
foreach var of varlist N3rep78 {
la var `var' "Prep for N for origin in graphs"
}
ta rep78
ta N3rep78
ta rep78 foreign
//Second: drawing the - catplot - graph:
capture drop clone
capture drop origin2
foreach val of varlist N3rep78 {
clonevar clone = `val'
decode `val', gen(origin2)
bysort `val': replace origin2=origin2 + " (n="+string(_N)+")"
labmask clone, values(origin2)
#delimit ;
catplot rep78 clone, stack asyvars percent(clone)
bar(1, fcolor(gs15)) bar(2, fcolor(gs14)) bar(3, fcolor(gs13)) bar(4, fcolor(gs12)) bar(5, fcolor(gs11))
blabel(bar, pos(center) format(%2.0f)) legend (pos(bottom) col(5))
ysize(3) yla(0(20)100)
plotregion(lcolor(none))
scheme(s1mono)
title("Repairs for foreign and domestic cars")
ytitle("(Percentages calculated for each value of rep78)")
note(" " " Note: Mean of rep78 for foreign =__ and for domestic=__", span) //<--this I would ideally have liked to automate (i.e. include mean values), but I have not been able to
legend(keygap(0.5) symxsize(9))
name(rep78_origin_N1 , replace) ;
drop clone origin2 ;
#delimit cr
}
************************************************** ***************************************
// Here follows my code for the graphs in which I would like to include N, but have not been able to:
// graph hbar over two categories
//here I would like to have N to say 48 and 21, since this is the number of domestic and foreign cars in rep78:
ta rep78 foreign
#delimit ;
graph hbar (mean)rep78 ,
over(foreign, relabel(1"DOMESTIC" 2"FOREIGN") gap(*2.5) label(labcolor(gs1)))
showyvars
yvaroptions(relabel(1 "Repair records")
gap(*1.5) label(labcolor(black) labsize(small)))
bar(1, fcolor(gs9))
blabel(bar, pos(outside) format(%12.1f))
ysize(3) yla(1(1)5)
exclude0
legend(off)
plotregion(lcolor(none))
scheme(s1mono)
title("Mean repair record for domestic and foreign cars" " ", size(large) span)
ytitle(" ""(Scale: Number of repairs)" "", size(small))
name(rep78_origin_N2 , replace) ;
graph save rep78_origin_N2, replace;
#delimit cr
// graph hbar - not separately for any categories
//here I would like N to show that N = 69:
ta rep78
#delimit ;
graph hbar (mean)rep78,
asyvars
showyvars
yvaroptions(relabel(1 "Repair record")
gap(*1.5) label(labcolor(black) labsize(small)))
bar(1, fcolor(gs9))
blabel(bar, pos(outside) format(%12.1f))
ysize(3) yla(1(1)5)
exclude0
legend(off)
plotregion(lcolor(none))
scheme(s1mono)
title("Mean repair record for cars, regardless of origin" " ", size(large) span)
ytitle(" ""(Scale: Number of repairs)" "", size(small))
name(rep78_origin_N3 , replace) ;
graph save rep78_origin_N3, replace;
#delimit cr
I guess some of the problem is caused by my use of the - relabel - option in - graph hbar - .
However, I would rather not have the graphs show just the variable names, since these are not descriptive enough for my audience.
Any help on this matter is greatly appreciated. Thank you all so much in advance.
Best wishes,
Hilde Johanne