  • tabplot for likert scale questions

    Using Stata 14, I have data on the frequency (0 = Never, 1 = Often, 2 = Sometimes) of consuming 4 types of tea (Green, Black, Oolong, Puerh) across Region (10 types).

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(Age GreenTea  BlackTea  OolongTea  PuerTea  Grade) float s byte Region
    44 0 0 0 0 6 2  2
    31 0 0 0 0 4 2  1
    38 0 0 0 0 5 2  1
    30 0 2 0 0 4 2  1
    19 2 2 2 2 1 2  1
    31 0 0 0 0 4 2  1
    38 1 2 0 1 5 2  1
    47 2 2 0 0 7 2  6
    36 0 0 0 0 5 2  6
    39 0 0 0 0 5 2 10
    43 0 0 0 0 6 2 10
    44 0 0 0 0 6 2 10
    45 2 2 2 2 7 2 11
    38 0 0 0 0 5 2 10
    27 0 0 0 0 3 2 10
    24 0 0 0 0 2 2  9
    39 0 0 0 0 5 2  9
    27 0 0 0 0 3 2  1
    21 1 2 2 2 2 2  1
    27 0 0 0 0 3 2  1
    41 2 0 0 0 6 2  1
    40 0 0 0 0 6 2  3
    21 0 0 0 0 2 2  3
    41 0 0 0 0 6 2  3
    33 0 0 0 0 4 2  3
    25 0 0 0 0 3 2  3
    20 0 2 0 0 2 2 13
    23 0 0 0 0 2 2 13
    28 0 0 0 0 3 2  4
    30 2 2 2 2 4 2  4
    37 0 2 0 0 5 2  4
    33 0 0 0 0 4 2 12
    41 0 0 0 0 6 2 12
    28 0 0 0 0 3 2 12
    28 0 0 0 0 3 2  2
    39 2 2 1 2 5 2  2
    43 2 2 2 2 6 2  2
    38 0 0 0 0 5 2  2
    38 0 0 0 0 5 2 13
    28 0 0 0 0 3 2 13
    24 0 0 0 0 2 2 13
    34 0 0 0 0 4 2  6
    37 0 0 0 0 5 2  6
    24 0 0 0 0 2 2  6
    26 0 0 0 0 3 2  3
    47 1 1 0 2 7 2  3
    34 1 1 1 1 4 2  3
    29 0 0 0 0 3 2  9
    21 0 0 0 0 2 2  9
    27 0 0 0 0 3 2  9
    38 0 0 0 0 5 2  4
    26 0 0 0 0 3 2  4
    43 0 0 0 0 6 2  7
    22 0 0 0 0 2 2  7
    37 0 2 2 0 5 2 13
    33 0 0 0 0 4 2 13
    35 2 0 2 2 5 2 13
    32 2 2 0 0 4 2 13
    17 0 0 0 0 1 2 13
    36 0 2 0 0 5 2 13
    31 0 0 0 0 4 2  2
    43 0 0 0 0 6 2 10
    37 0 0 0 0 5 2 10
    41 0 0 0 0 6 2 10
    41 0 0 2 2 6 2 10
    48 0 0 0 0 7 2 10
    31 0 0 0 0 4 2 11
    26 0 0 0 0 3 2 11
    20 0 0 0 0 2 2  5
    31 0 2 0 0 4 2  5
    42 0 2 2 2 6 2  5
    24 0 2 2 2 2 2  5
    40 0 0 0 0 6 2  5
    33 0 2 2 0 4 2  5
    31 0 0 0 0 4 2  2
    34 0 0 0 0 4 2  2
    34 0 0 0 0 4 2  2
    41 0 0 0 0 6 2  2
    32 0 0 0 0 4 2  6
    30 0 0 0 0 4 2  6
    34 0 0 0 0 4 2  6
    46 0 0 0 0 7 2  6
    29 0 0 0 0 3 2  6
    25 0 0 0 0 3 2 13
    39 0 0 0 0 5 2 13
    31 0 0 0 0 4 2 13
    28 0 0 0 0 3 2  6
    49 0 0 0 0 7 2  6
    40 0 0 0 0 6 2  6
    34 0 0 0 0 4 2  6
    30 0 0 0 0 4 2 10
    36 0 0 0 0 5 2 10
    28 0 0 0 0 3 2  3
    48 0 0 0 0 7 2  3
    40 0 0 0 0 6 2  3
    29 0 0 0 0 3 2  3
    25 0 0 0 0 3 2  3
    36 0 0 0 0 5 2  3
    19 0 0 0 0 1 2  3
    40 2 0 0 0 6 2 12
    end

    I want to create a graph like the one below to show the percentage never consuming each of the 4 types of tea, by Region. In that example graph, the types of care on the x-axis would correspond to my types of tea, and the disease conditions on the y-axis to Region.

    [Attached image: stt.png]

  • #2
    -tabplot- is from SSC.
    If I understand your question correctly, you can use the code below:
    [1] first, reshape your data to long layout so that a single variable tea holds the answers for all four types;
    [2] second, contract your data to get the frequency of each answer, including never consuming, for each type of tea in each region;
    [3] then, with the percent of never consuming in each region as the weight, use -tabplot- to draw the graph.
    Code:
    * lower-case all names, then rename the four *tea variables to tea1-tea4
    rename _all, lower
    rename *tea tea#, addnumber
    gen id=_n
    
    * long layout: one observation per respondent and type of tea
    reshape long tea, i(id) j(type)
    label define type 1 GreenTea 2 BlackTea 3 OolongTea 4 PuerTea
    label values type type
    label define tea 0 Never 1 Often 2 Sometimes
    label values tea tea
    label var type "types of tea"
    label var tea "consuming: likert 3"
    
    * one observation per answer, type and region, with its count in _freq
    contract tea type region
    * N pools the answers for all four types of tea within a region
    bysort region: egen N=sum(_freq)
    bysort region: gen perc=_freq/N*100
    tabplot region type if tea==0 [aweight=perc], bfcolor(none) horizontal barw(1) showval(_freq) ///
     subtitle(never consuming % at each region) xsc(r(0.8)) scheme(s1color)


    • #3
      tabplot is from the Stata Journal (as you are asked to explain: Stata FAQ Advice #12)

      You should give a source for your example. Your data seem a long way from the layout you need.

      Code:
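      * put the stub first: GreenTea -> TeaGreen, and so on
      rename (*Tea) (Tea*)
      gen long obs = _n
      drop Age s
      * long layout: string variable type holds "Green", "Black", "Oolong", "Puer"
      reshape long Tea, i(obs) j(type) string
      * percent answering Never (0) within each combination of type and Region
      egen percent = mean(100 * (Tea == 0)), by(type Region)
      collapse percent, by(type Region)
      tabplot type Region [iw=percent], scheme(s1color) bfcolor(green*0.1) blcolor(green) showval(format(%3.0f))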


      • #4
        Originally posted by Chen Samulsion (#2):
        Thanks so much, Chen. The steps are really informative; I learnt some new commands in addition to getting the graph. Much appreciated!
        All the steps are comprehensible except the bysort commands.
        Code:
        bysort region: egen N=sum(_freq)
        bysort region: gen perc=_freq/N*100
        If you do not mind, could you briefly explain what these two steps are doing? Thanks.


        • #5
          Originally posted by Nick Cox (#3):
          Thanks so much, Professor; mind-blowing code as always. Attached is the graph I got by adding some of your existing recipes:
          Code:
          tabplot   Region type [iw=Percent], scheme(s1color) bfcolor(green*0.1) blcolor(green) showval(offset(.4) format(%3.0f)) horiz  separate( type )  bcolor(" 27 151 119" "217 95 2" "117 112 179" "117 112 179")
          [Attached image: St1.JPG]


          I also have a related question, if you do not mind, about an earlier post (https://www.statalist.org/forums/for...lot-or-tabplot). There you created 20 questions, each with 5 response items, which I see as a kind of questions-within-a-question, or variables-within-a-variable.
          Code:
           
           clear set obs 4000 egen question = seq(), to(10) block(100) egen setting = seq(), to(4) block(1000) set seed 2803
          Is it possible to do the same with my data? I mean merging the 4 types of tea into one variable and then graphing all the categories of tea consumption (never, sometimes, often), which would allow me to make the graph you showed in the other post:
          Code:
          catplot answer, by(question)
          [Attached image: st2.JPG]


          • #6
            Sonnen Blume, the prefix bysort varlist: sorts the data and repeats the command separately for each group of observations that share the same values of the variables in varlist. When you type
            Code:
            bysort region: egen N=sum(_freq)
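            you get, for each region, the total count (i.e. the sum of the frequencies _freq) over all consumption patterns in that region; the second command divides each _freq by that total to give a percent. Please note that Nick and I calculate these percents differently: my denominator pools all four types of tea within a region, whereas Nick's code in #3 works within each combination of type and Region.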
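
            A small sketch of how the choice of by-group changes the denominator, using made-up counts rather than the tea data. The names N_region, N_cell, perc_region and perc_cell are hypothetical, and egen's total() is used here as the documented name for the older sum() function:

            ```stata
            * Toy counts (hypothetical, not from the thread), laid out as -contract- leaves them
            clear
            input byte(region type tea) int _freq
            1 1 0 3
            1 1 2 1
            1 2 0 2
            1 2 2 2
            2 1 0 5
            2 2 0 4
            2 2 1 1
            end

            * Denominator pooled over all types within a region
            bysort region: egen N_region = total(_freq)
            gen perc_region = 100 * _freq / N_region

            * Denominator within each combination of region and type
            bysort region type: egen N_cell = total(_freq)
            gen perc_cell = 100 * _freq / N_cell

            list, sepby(region type) noobs
            ```

            In region 1 of this toy example, the Never count for type 1 is 3 out of 8 answers in the region (37.5%) but 3 out of 4 answers for that type (75%), so the two percents answer different questions.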


            • #7
              In #5 the values go from 79 to 95 and most are very similar. But the complementary Yes values would go from 5 to 21 and perhaps show a more interesting graph.
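
              A minimal sketch of that complementary calculation, assuming that codes 1 and 2 both count as Yes (any consumption). It uses six illustrative rows with only two of the tea variables, and the name percent_any is hypothetical:

              ```stata
              * Illustrative rows only; the real data have four tea variables
              clear
              input byte(GreenTea BlackTea Region)
              0 0 1
              0 2 1
              2 2 1
              1 2 1
              0 0 6
              2 2 6
              end

              rename (*Tea) (Tea*)
              gen long obs = _n
              reshape long Tea, i(obs) j(type) string

              * Percent reporting any consumption (1 or 2); missing answers are left out
              egen percent_any = mean(100 * inlist(Tea, 1, 2)) if !missing(Tea), by(type Region)
              collapse percent_any, by(type Region)

              list, sepby(type) noobs
              ```

              The collapsed variable could then replace percent as the weight in the tabplot call from #3.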

              I am not sure what you're asking. You are mixing in references to your own data, somebody else's data and random data. Best to focus on your own. You can do things like

              Code:
              tabplot Tea type, by(Region) percent(Region type)
              
              tabplot Tea type, by(Region) percent(Region type) yasis
              after the reshape in #3, where the tea variable is named type. Or just ignore what you want to.


              • #8
                Originally posted by Nick Cox (#7):
                Thanks a lot, Professor. I am still trying to find the best design for my graph, and that is why I am browsing past threads. I found
                Code:
                catplot answer, by(question)
                the most efficient way to graph my data, and I am working towards that. I will keep updating the thread.
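
                A minimal sketch of how that call might carry over to the tea data, assuming the reshape from #3 so that Tea plays the role of answer and type the role of question. The six rows are copied from the dataex listing in #1, the label name freq is hypothetical, and catplot must be installed from SSC:

                ```stata
                * Requires catplot: ssc install catplot
                clear
                input byte(GreenTea BlackTea OolongTea PuerTea Region)
                0 0 0 0 2
                0 2 0 0 1
                2 2 2 2 1
                1 2 0 1 1
                2 2 0 0 6
                0 0 0 0 6
                end

                * Same reshape as in #3: one observation per respondent and type of tea
                rename (*Tea) (Tea*)
                gen long obs = _n
                reshape long Tea, i(obs) j(type) string

                * Value labels as given in #1
                label define freq 0 "Never" 1 "Often" 2 "Sometimes"
                label values Tea freq

                * One panel per type of tea, one bar per answer category
                catplot Tea, by(type)
                ```

                This shows all three answer categories at once, at the cost of one panel per type of tea; adding Region would need a further by() or over() choice.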
