tabplot for likert scale questions

Sonnen Blume

Join Date: Aug 2018
Posts: 342

21 Oct 2018, 17:30

Using Stata 14, I have data on the frequency (0 = Never, 1 = Often, 2 = Sometimes) of consuming 4 types of tea (Green, Black, Oolong, Puerh) across Region (10 types).

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(Age GreenTea  BlackTea  OolongTea  PuerTea  Grade) float s byte Region
44 0 0 0 0 6 2  2
31 0 0 0 0 4 2  1
38 0 0 0 0 5 2  1
30 0 2 0 0 4 2  1
19 2 2 2 2 1 2  1
31 0 0 0 0 4 2  1
38 1 2 0 1 5 2  1
47 2 2 0 0 7 2  6
36 0 0 0 0 5 2  6
39 0 0 0 0 5 2 10
43 0 0 0 0 6 2 10
44 0 0 0 0 6 2 10
45 2 2 2 2 7 2 11
38 0 0 0 0 5 2 10
27 0 0 0 0 3 2 10
24 0 0 0 0 2 2  9
39 0 0 0 0 5 2  9
27 0 0 0 0 3 2  1
21 1 2 2 2 2 2  1
27 0 0 0 0 3 2  1
41 2 0 0 0 6 2  1
40 0 0 0 0 6 2  3
21 0 0 0 0 2 2  3
41 0 0 0 0 6 2  3
33 0 0 0 0 4 2  3
25 0 0 0 0 3 2  3
20 0 2 0 0 2 2 13
23 0 0 0 0 2 2 13
28 0 0 0 0 3 2  4
30 2 2 2 2 4 2  4
37 0 2 0 0 5 2  4
33 0 0 0 0 4 2 12
41 0 0 0 0 6 2 12
28 0 0 0 0 3 2 12
28 0 0 0 0 3 2  2
39 2 2 1 2 5 2  2
43 2 2 2 2 6 2  2
38 0 0 0 0 5 2  2
38 0 0 0 0 5 2 13
28 0 0 0 0 3 2 13
24 0 0 0 0 2 2 13
34 0 0 0 0 4 2  6
37 0 0 0 0 5 2  6
24 0 0 0 0 2 2  6
26 0 0 0 0 3 2  3
47 1 1 0 2 7 2  3
34 1 1 1 1 4 2  3
29 0 0 0 0 3 2  9
21 0 0 0 0 2 2  9
27 0 0 0 0 3 2  9
38 0 0 0 0 5 2  4
26 0 0 0 0 3 2  4
43 0 0 0 0 6 2  7
22 0 0 0 0 2 2  7
37 0 2 2 0 5 2 13
33 0 0 0 0 4 2 13
35 2 0 2 2 5 2 13
32 2 2 0 0 4 2 13
17 0 0 0 0 1 2 13
36 0 2 0 0 5 2 13
31 0 0 0 0 4 2  2
43 0 0 0 0 6 2 10
37 0 0 0 0 5 2 10
41 0 0 0 0 6 2 10
41 0 0 2 2 6 2 10
48 0 0 0 0 7 2 10
31 0 0 0 0 4 2 11
26 0 0 0 0 3 2 11
20 0 0 0 0 2 2  5
31 0 2 0 0 4 2  5
42 0 2 2 2 6 2  5
24 0 2 2 2 2 2  5
40 0 0 0 0 6 2  5
33 0 2 2 0 4 2  5
31 0 0 0 0 4 2  2
34 0 0 0 0 4 2  2
34 0 0 0 0 4 2  2
41 0 0 0 0 6 2  2
32 0 0 0 0 4 2  6
30 0 0 0 0 4 2  6
34 0 0 0 0 4 2  6
46 0 0 0 0 7 2  6
29 0 0 0 0 3 2  6
25 0 0 0 0 3 2 13
39 0 0 0 0 5 2 13
31 0 0 0 0 4 2 13
28 0 0 0 0 3 2  6
49 0 0 0 0 7 2  6
40 0 0 0 0 6 2  6
34 0 0 0 0 4 2  6
30 0 0 0 0 4 2 10
36 0 0 0 0 5 2 10
28 0 0 0 0 3 2  3
48 0 0 0 0 7 2  3
40 0 0 0 0 6 2  3
29 0 0 0 0 3 2  3
25 0 0 0 0 3 2  3
36 0 0 0 0 5 2  3
19 0 0 0 0 1 2  3
40 2 0 0 0 6 2 12
end

I want to create a graph like the one below to show the percentage never consuming each of the 4 types of tea, by Region. In the graph below, the types of care correspond to my types of tea (x-axis) and the disease conditions correspond to my Region (y-axis).

[Attached image: stt.png]

Chen Samulsion

Join Date: Jan 2018
Posts: 924

21 Oct 2018, 19:07

-tabplot- is from SSC.
If I understand your question correctly, you can use code like that below:
[1] first, reshape your data so that a single variable tea is created;
[2] second, contract your data to get the frequency of never consuming each type of tea in each region;
[3] then, with the percent of never consuming in each region as the weight, use -tabplot- to get what you want.

Code:

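* Lowercase all variable names, then rename the four *tea variables to tea1-tea4
rename _all, lower
rename *tea tea#, addnumber
gen id=_n

* Reshape to long: one row per respondent and tea type
reshape long tea, i(id) j(type)
label define type 1 GreenTea 2 BlackTea 3 OolongTea 4 PuerTea
label values type type
label define tea 0 Never 1 Often 2 Sometimes
label values tea tea
label var type "types of tea"
label var tea "consuming: likert 3"

* Count observations in each tea x type x region cell (stored in _freq)
contract tea type region
* N = total count within each region, over all tea types and responses
bysort region: egen N=sum(_freq)
* perc = each cell's share of its region total, in percent
bysort region: gen perc=_freq/N*100
tabplot region type if tea==0 [aweight=perc], bfcolor(none) horizontal barw(1) showval(_freq) ///
 subtitle(never consuming % at each region) xsc(r(0.8)) scheme(s1color)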


Nick Cox

Join Date: Mar 2014
Posts: 35720

21 Oct 2018, 19:12

tabplot is from the Stata Journal (as you are asked to explain: Stata FAQ Advice #12)

You should give a source for your example. Your data seem a long way from the layout you need.

Code:

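* Move the Tea stub to the front (GreenTea -> TeaGreen) so reshape can use it
rename (*Tea) (Tea*)
gen long obs = _n
drop Age s
* Long layout: one row per respondent and tea type
reshape long Tea, i(obs) j(type) string
* Percent answering Never (Tea == 0) within each type x Region cell
egen percent = mean(100 * (Tea == 0)), by(type Region)
collapse percent, by(type Region)
tabplot type Region [iw=percent], scheme(s1color) bfcolor(green*0.1) blcolor(green) showval(format(%3.0f))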


Sonnen Blume

Join Date: Aug 2018

Posts: 342
#4

22 Oct 2018, 13:24

Originally posted by Chen Samulsion


Thanks so much Chen. The steps are really informative, learnt some new functions in addition to getting the graph, much appreciated!
All the steps are comprehensible, except the bysort commands.

Code:

bysort region: egen N=sum(_freq)
bysort region: gen perc=_freq/N*100

If you do not mind, please give a little clue to what these steps are doing. Thanks.
Sonnen Blume

Join Date: Aug 2018

Posts: 342
#5

22 Oct 2018, 13:45

Originally posted by Nick Cox


Thanks so much Professor, mind blowing codes as always. Attached is the graph I got by adding some of your existing recipes:

Code:

tabplot Region type [iw=Percent], scheme(s1color) bfcolor(green*0.1) blcolor(green) ///
 showval(offset(.4) format(%3.0f)) horiz separate(type) ///
 bcolor(" 27 151 119" "217 95 2" "117 112 179" "117 112 179")

If you do not mind, I have an additional, related question about an earlier post (https://www.statalist.org/forums/for...lot-or-tabplot). There you created 20 questions, each with 5-item responses, which I see as a kind of questions-within-a-question, or variables-within-a-variable.

Code:

clear
set obs 4000
egen question = seq(), to(10) block(100)
egen setting = seq(), to(4) block(1000)
set seed 2803

Is it possible to do the same with my data? I mean merging the 4 types of tea into one variable and then graphing all the categories of tea consumption (never, often, sometimes), which would let me make the graph you showed in the other post:

Code:

catplot answer, by(question)
Chen Samulsion

Join Date: Jan 2018

Posts: 924
#6

22 Oct 2018, 16:50

Sonnen Blume, bysort varlist: repeats the command for each group of observations for which the values of the variables in varlist are the same. When you type

Code:

bysort region: egen N=sum(_freq)

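you get the total count (i.e. the frequency) over all consumption patterns within each region. Please also note that Nick and I calculate the percentages differently: I divide each count by the region total across all four tea types, whereas Nick's code in #3 takes the percent of Never within each combination of type and region.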
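
To see what the two bysort lines do, here is a minimal sketch on made-up data. The variable freq stands in for _freq and the counts are invented purely for illustration; total() is the documented egen function that does the same job as sum() here.

```stata
* Toy data: three regions, a few counts each (invented values)
clear
input byte region int freq
1 10
1  5
2  8
2  2
2  6
3  4
end

* Within each region, add up freq; every row of a region gets the same total
bysort region: egen N = total(freq)

* Each row's share of its region total, in percent
bysort region: gen perc = freq / N * 100

list, sepby(region) noobs
```

In region 1 both rows get N = 15, so perc is about 66.7 and 33.3: the percentages always add to 100 within a region.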
Nick Cox

Join Date: Mar 2014

Posts: 35720
#7

23 Oct 2018, 02:53

In #5 the values go from 79 to 95 and most are very similar. But the complementary Yes values would go from 5 to 21 and perhaps show a more interesting graph.
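
A minimal sketch of that complementary calculation, assuming the layout and the reshape from #3. The small dataset is made up so that the block runs on its own, the name pct_any is hypothetical, and tabplot needs to be installed first (for example with ssc install tabplot).

```stata
* Made-up wide data in the same layout as #1 (illustrative values only)
clear
input byte(GreenTea BlackTea OolongTea PuerTea Region)
0 0 0 0 1
2 2 0 0 1
1 2 2 1 1
0 0 0 0 2
0 2 0 0 2
2 2 2 2 2
end

* Reshape to long, as in #3
rename (*Tea) (Tea*)
gen long obs = _n
reshape long Tea, i(obs) j(type) string

* Percent reporting any consumption (Often or Sometimes);
* missing answers are excluded so they are not counted as Tea > 0
egen pct_any = mean(100 * (Tea > 0)) if !missing(Tea), by(type Region)
collapse pct_any, by(type Region)

tabplot type Region [iw=pct_any], scheme(s1color) showval(format(%3.0f))
```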

I am not sure what you're asking. You are mixing in references to your own data, somebody else's data and random data. Best to focus on your own. You can do things like

Code:

* the j() variable created by the reshape in #3 is named type
tabplot Tea type, by(Region) percent(Region type)
tabplot Tea type, by(Region) percent(Region type) yasis

after the reshape in #3. Or just ignore what you want to.
Sonnen Blume

Join Date: Aug 2018

Posts: 342
#8

24 Oct 2018, 11:23

Originally posted by Nick Cox


Thanks a lot, Professor. I am still trying to find the best design for my graph, which is why I am browsing past threads. I found

Code:

catplot answer, by(question)

the most efficient way to graph my data, and I am working towards that. I will keep updating the thread.
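
For the tea data, a call of that form might look like the sketch below. It assumes the long layout from the reshape in #3, with Tea in the role of answer and type in the role of question. The small dataset is made up so that the block runs on its own, the label name freq is hypothetical, and catplot has to be installed first (for example with ssc install catplot).

```stata
* Made-up wide data in the same layout as #1 (illustrative values only)
clear
input byte(GreenTea BlackTea OolongTea PuerTea Region)
0 0 0 0 1
2 2 0 0 1
1 2 2 1 1
0 0 0 0 2
0 2 0 0 2
2 2 2 2 2
end

* Reshape to long, as in #3: one row per respondent and tea type
rename (*Tea) (Tea*)
gen long obs = _n
reshape long Tea, i(obs) j(type) string

* Value labels following the coding given in #1
label define freq 0 "Never" 1 "Often" 2 "Sometimes"
label values Tea freq

* One panel per tea type, one bar per response category (frequencies)
catplot Tea, by(type)
```

This plots frequencies. If percentages within each tea type are wanted instead, the help for catplot describes its percent options.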