Store results of a statsby for use in a graph

Addie Sutton

Join Date: Mar 2023
Posts: 2

05 Mar 2023, 20:37

Hi, I am trying to create a graph that shows the average age at which a child gets their first phone, by a demographic variable (in this example, race). I am using Stata 17 and have used statsby to collect the mean age at first phone, the lower and upper confidence bounds, and the number of observations in each race category.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float id long race_cat float floor_age_first_sp
  1 3  .
  2 3 12
  3 3 12
  4 4  6
  5 3 12
  6 3  .
  7 1  7
  8 4 13
  9 1 11
 10 1 10
 11 1 13
 12 1  7
 13 1 12
 14 1 15
 15 3 12
 16 1 13
 17 1 12
 18 3  7
 19 3  .
 20 . 11
 21 3 11
 22 1 13
 23 3 12
 24 3  .
 25 1 12
 26 1 12
 27 1  8
 28 3  .
 29 1  9
 30 4  .
 31 3  .
 32 1 10
 33 . 12
 34 3  .
 35 3  .
 36 3  .
 37 3  9
 38 1  6
 39 1 11
 40 3 11
 41 3 12
 42 3  .
 43 1  9
 44 3  .
 45 3  .
 46 3  9
 47 1  .
 48 4  .
 49 1 13
 50 3  .
 51 3 14
 52 3 12
 53 1  4
 54 4  .
 55 1  .
 56 3 12
 57 1 12
 58 4 11
 59 . 12
 60 1  8
 61 3  .
 62 3 13
 63 1  .
 64 1  .
 65 4  9
 66 3  .
 67 1 11
 68 1 10
 69 3 12
 70 4 13
 71 3 13
 72 4 12
 73 3 11
 74 4  .
 75 3  8
 76 3 10
 77 4 10
 78 3 13
 79 4 12
 80 1 12
 81 1  .
 82 1 13
 83 3 10
 84 1 11
 85 3  .
 86 1 12
 87 1 13
 88 4 11
 89 1 15
 90 3  .
 91 1 13
 92 .  .
 93 3  .
 94 3  9
 95 3  .
 96 3  .
 97 3  .
 98 4 11
 99 4 12
100 3 11
end
label values race_cat enc_race
label def enc_race 1 "Black", modify
label def enc_race 3 "Hispanic", modify
label def enc_race 4 "White", modify

statsby , by(race_cat) clear : ci means floor_age_first_sp
list

twoway rspike lb ub race_cat , || scatter mean race_cat ///
,title("Age child received first smartphone" ,size(medium)) subtitle("{it:by race}" ,size(small))  legend(off) ///
xla(1 2 3 ,val) xsc(r(.8 3.2)) xtitle("race of child" ,size(small)) ///
ysc(r(9 12)) ytitle("age child received first smartphone" ,size(small)) ///
text(9.92038 1 "N = 31" ,place(s)) /// # of observations in group 1 (Black) below lowerbound
text(10.39561 2 "N = 24" ,place(s)) ///
text(9.55029 3 "N = 11" ,place(s))

di `r(N)' //nothing happens

I noticed that running the statsby line produces:
(running ci on estimation sample)

Command: ci means floor_age_first_sp
N: r(N)
mean: r(mean)
se: r(se)
lb: r(lb)
ub: r(ub)
level: r(level)
By: race_cat

I was hoping to be able to reference the lower bounds and Ns so that I do not have to enter them by hand each time. Something like this:

Code:

twoway rspike lb ub race_cat , || scatter mean race_cat ///
,title("Age child received first smartphone" ,size(medium)) subtitle("{it:by race}" ,size(small))  legend(off) ///
xla(1 2 3 ,val) xsc(r(.8 3.2)) xtitle("race of child" ,size(small)) ///
ysc(r(9 12)) ytitle("age child received first smartphone" ,size(small)) ///
text(`r(lb) in 1' 1 "N = `r(N) in 1'" ,place(s)) /// # of observations in group 1 (Black) below lowerbound
text(`r(lb) in 2' 2 "N = `r(N) in 2'" ,place(s)) ///
text(`r(lb) in 3' 3 "N = `r(N) in 3'" ,place(s))

Doing it by hand is not the end of the world, but it would make my life so much easier to have it done automatically! Essentially, I want the N for each group placed just below the lower bound for that group.

This is my first post here, so I apologize for any syntax errors or faux pas! Please point them out if I have committed any.

Thank you!

Last edited by Addie Sutton; 05 Mar 2023, 20:39.

Nick Cox

Join Date: Mar 2014
Posts: 36058

06 Mar 2023, 01:09

Thanks for using dataex on your first post.

r(N) is produced by ci in this case, but the problem with your idea is that the only r(N) visible after statsby is the last one calculated, and it needs to be accessed immediately. However, the information you need is in the variable N, which gives a route to using it.
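A minimal sketch of that point, using a small made-up dataset (the variables g and y are hypothetical, not the poster's data): after statsby with the clear option, each group's results sit in one observation, so they can be read by subscripting the new variables rather than through r(). The loop is one possible way to assemble text() options from those values, not the only one.

```stata
* Hypothetical toy data: two groups of three values each
clear
input byte g float y
1 4
1 6
1 8
2 10
2 12
2 14
end

* One observation per group; N, mean, se, lb, ub, level become variables
statsby, by(g) clear : ci means y

* Read group-specific results by subscripting the variables
display "group " g[1] ": N = " N[1] ", lower bound = " lb[1]
display "group " g[2] ": N = " N[2] ", lower bound = " lb[2]

* Build one text() option per group from the stored values
local textopts
forvalues i = 1/`=_N' {
    local textopts `textopts' text(`=lb[`i']' `=g[`i']' "N = `=N[`i']'", place(s))
}

twoway rspike lb ub g || scatter mean g, legend(off) `textopts'
```

The loop reproduces the kind of text() calls typed by hand in the question, with the coordinates and counts taken from the data in memory.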

With your data example, I would do something like this:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float id long race_cat float floor_age_first_sp
  1 3  .
  2 3 12
  3 3 12
  4 4  6
  5 3 12
  6 3  .
  7 1  7
  8 4 13
  9 1 11
 10 1 10
 11 1 13
 12 1  7
 13 1 12
 14 1 15
 15 3 12
 16 1 13
 17 1 12
 18 3  7
 19 3  .
 20 . 11
 21 3 11
 22 1 13
 23 3 12
 24 3  .
 25 1 12
 26 1 12
 27 1  8
 28 3  .
 29 1  9
 30 4  .
 31 3  .
 32 1 10
 33 . 12
 34 3  .
 35 3  .
 36 3  .
 37 3  9
 38 1  6
 39 1 11
 40 3 11
 41 3 12
 42 3  .
 43 1  9
 44 3  .
 45 3  .
 46 3  9
 47 1  .
 48 4  .
 49 1 13
 50 3  .
 51 3 14
 52 3 12
 53 1  4
 54 4  .
 55 1  .
 56 3 12
 57 1 12
 58 4 11
 59 . 12
 60 1  8
 61 3  .
 62 3 13
 63 1  .
 64 1  .
 65 4  9
 66 3  .
 67 1 11
 68 1 10
 69 3 12
 70 4 13
 71 3 13
 72 4 12
 73 3 11
 74 4  .
 75 3  8
 76 3 10
 77 4 10
 78 3 13
 79 4 12
 80 1 12
 81 1  .
 82 1 13
 83 3 10
 84 1 11
 85 3  .
 86 1 12
 87 1 13
 88 4 11
 89 1 15
 90 3  .
 91 1 13
 92 .  .
 93 3  .
 94 3  9
 95 3  .
 96 3  .
 97 3  .
 98 4 11
 99 4 12
100 3 11
end
label values race_cat enc_race
label def enc_race 1 "Black", modify
label def enc_race 3 "Hispanic", modify
label def enc_race 4 "White", modify

statsby , by(race_cat) clear : ci means floor_age_first_sp
list

gen where = 9.2 
gen toshow = "{it:N} = " + strofreal(N)

twoway rspike lb ub race_cat , || scatter mean race_cat ///
,title("Age child received first smartphone" ,size(medium)) subtitle("{it:by race}" ,size(small))  legend(off) ///
xla(1 2 3 4,val) yla(, ang(h)) xsc(r(.8 4.2)) xtitle("race of child" ,size(small)) ///
ytitle("age child received first smartphone" ,size(small)) ///
|| scatter where race_cat, ms(none) mla(toshow) mlabpos(0) mlabsize(medium)

[Image: smartphone.png]

The non-appearance of race_cat 2 is just a side-effect of your data example. You have scope to vary the vertical position of the extra text by making it depend on lb. My wild guess is that most readers would prefer my choice!
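As a sketch of the lb-dependent alternative, the label position can be computed per group instead of being fixed at one height. The summary values below are invented stand-ins for statsby output, and the offset of 0.3 is an arbitrary illustrative choice.

```stata
* Hypothetical summary results standing in for statsby output
clear
input byte race_cat int N float(mean lb ub)
1 31 10.9  9.9 11.9
3 24 11.2 10.4 12.0
4 11 10.9  9.6 12.2
end

* Place each label a fixed distance below its own lower bound
gen where = lb - 0.3
gen toshow = "{it:N} = " + strofreal(N)

* Invisible markers carry the labels; spikes and means are drawn as before
twoway rspike lb ub race_cat || scatter mean race_cat ///
    || scatter where race_cat, ms(none) mla(toshow) mlabpos(0) ///
    legend(off) xla(1 3 4) xsc(r(.8 4.2)) yla(, ang(h))
```

With this variant each count sits just under its own interval, at the cost of the labels no longer lining up on a common baseline.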

Nick Cox

Join Date: Mar 2014

Posts: 36058
#3

06 Mar 2023, 03:42

The idea of putting what you want to show in a string variable used as a marker label is also discussed in https://journals.sagepub.com/doi/pdf...6867X211063413

Detail: I would use lower case "{it:n}" but the habits of your tribe may vary.
Addie Sutton

Join Date: Mar 2023

Posts: 2
#4

06 Mar 2023, 09:42

Hi Nick, thanks so much for your help, very cool solution! I was able to follow your suggestion by changing the definition of where from 9.2 to lb-.01, but I do prefer the version you created. Thanks for linking to that article; I will give it a look!
Thanks again!