/*** Dear Statalisters ***/
/*First of all: any help on this matter is greatly appreciated. I feel I should be able to solve this problem myself, since the error seems "obvious", but I cannot see it.
To make it easier for anyone willing to help, I have written this post so that everything can be copied and pasted into a Stata do-file editor (I hope this helps rather than causing extra trouble).*/
/*My problem is this: I am unable to figure out how to mend my code so that I
get the correct number of observations reported in parentheses (on the left hand side of the graph bars).*/
//******* MY CODE (re-written to be illustrated by use of Stata's auto.dta, since my own data is from a Norwegian student survey on quality in higher education) ******//
sysuse auto, clear
/*In the next two lines of code I use tabstat to show the numbers that I want to include in the graph.
More specifically, for each value (1 - 5) of rep78 I need the mean values of the variables mpg and turn.
In addition, I need the number of observations for each value of rep78, separately for each of the two variables (mpg and turn).
N is to be reported in parentheses to the left of the bars, and the mean values are to be reported to the right of (outside) the bars.*/
set more off
bys rep78: tabstat mpg turn, stat(mean N) col(stat) format(%12.1f)
/*As tabstat shows, when rep78 = 5 the variable mpg has a mean of 27.4 and N = 11, and turn has a mean of 35.6 and N = 11
(in my own data N differs between the two variables, so I need a separate N for each).
Similarly, when rep78 = 4 the variable mpg has a mean of 21.7 and N = 18, and turn has a mean of 38.5 and N = 18.
I want these numbers reported in the same way for rep78 = 1, rep78 = 2 and rep78 = 3.*/
//***** my problem ****//
/*My code seems to go wrong in which values of r(N) I manage to keep in memory (return list).
As written, the code has Stata report the correct means for each variable in each group of rep78,
but the wrong N for rep78 values 1, 2, 3 and 4. Somehow Stata "remembers" only the N for rep78 = 5,
which shows up in the graph as n=11 for every value of rep78 and for both variables (mpg and turn). What I want is the correct N for each value of rep78, separately for mpg and turn.*/
//THE CODE:
set more off
foreach var of varlist mpg turn {
    forvalues i = 1(1)5 {
        su `var' if rep78 == `i'
        return list
        loc f`var' `"`r(N)'"'
    }
}
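/*A minimal sketch of one possible fix for the loop above. It rests on my reading of the cause and has not been checked against the survey data:
the local name (f plus the variable name) does not include the loop index i, so every pass through forvalues overwrites the previous value and only the N for rep78 == 5 survives.
Putting the index into the local name keeps one N per variable and per value of rep78. The names n_mpg_1 ... n_turn_5 are made up for this illustration.*/
foreach var of varlist mpg turn {
    forvalues i = 1/5 {
        quietly summarize `var' if rep78 == `i'
        local n_`var'_`i' = r(N)
    }
}
* One local per variable and group now exists; list them to check against tabstat
display "mpg,  rep78 = 1 to 5: `n_mpg_1' `n_mpg_2' `n_mpg_3' `n_mpg_4' `n_mpg_5'"
display "turn, rep78 = 1 to 5: `n_turn_1' `n_turn_2' `n_turn_3' `n_turn_4' `n_turn_5'"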
#delimit ;
graph hbar (mean) mpg turn,
over(rep78, relabel(1"1 repair" 2"2 repairs" 3"3 repairs" 4"4 repairs" 5"5 repairs" )
gap(*2.5) label(labcolor(gs1)))
showyvars
yvaroptions(relabel(1 "mpg(n=`fmpg')" 2 "turn(n=`fturn')" 3 "mpg(n=`fnoe12')"
4 "turn(n=`fnoe13')" 5 "mpg(n=`fnoe12')" 6 "turn(n=`fnoe13')" 7 "mpg(n=`fnoe12')" 8"turn(n=`fnoe13')" 9"mpg(n=`fnoe12')")
gap(*1.5) label(labcolor(black) labsize(small)))
bar(1, fcolor(eggshell)) bar(2, fcolor(ltkhaki)) bar(3, fcolor(olive_teal)) bar(4, fcolor(bluishgray)) bar(5, fcolor(ltblue))
bar(6, fcolor(emidblue)) bar(7, fcolor(erose)) bar(8, fcolor(sandb)) bar(9, fcolor(dkorange))
blabel(bar, pos(outside) format(%12.1f))
ysize(3) yla(1(5)45)
exclude0
legend(off)
plotregion(lcolor(none))
scheme(s1mono)
title("mpg and turn by repair records..."" ", size(medlarge) span)
ytitle(" " "Scale: ....." " ",
size(small))
note("Note: .....", size(small) span)
name( test_auto, replace);
graph save test_auto, replace;
#delimit cr
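/*A hedged sketch of one way to get a separate n for every bar. It assumes that yvaroptions(relabel()) only addresses the two yvars (mpg and turn), so positions 3 to 9 have nothing to relabel and the same two labels repeat in every rep78 group.
The idea: collapse to one row per value of rep78, reshape long so that each bar is one observation, build the label text (including n) in a string variable, and plot that variable with over().
The names mean_, n_, which, barlab and test_auto_n are made up for this illustration; styling options from the graph above can be added back as needed.*/
preserve
drop if missing(rep78)
* Means and non-missing counts per rep78, separately for mpg and turn
collapse (mean) mean_mpg=mpg mean_turn=turn (count) n_mpg=mpg n_turn=turn, by(rep78)
* One row per rep78 and variable; which holds "mpg" or "turn"
reshape long mean_ n_, i(rep78) j(which) string
generate barlab = which + " (n=" + string(n_) + ")"
* nofill drops the label/group combinations that do not occur
graph hbar (asis) mean_, over(barlab, label(labsize(small))) over(rep78) nofill blabel(bar, pos(outside) format(%12.1f)) legend(off) name(test_auto_n, replace)
restore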
/*Any help on my problem is very much appreciated*/
/*Best wishes, Johanne*/