  • One bar graph with multiple yvars, by variable

    Hello,

    I am attempting to graph the percentage of "yes" answers to four survey questions by male and female.

    Here is my current code:

    Code:
    label define colleagues 1 "Y" 0 "N"
    encode Colleagues, gen(colleagues)
    label define famfriends 1 "Y" 0 "N"
    encode FamFriends, gen(famfriends)
    label define onmyown 1 "Y" 0 "N"
    encode Onmyown, gen(onmyown)
    label define dontplan 1 "Y" 0 "N"
    encode Dontplan, gen(dontplan)
    graph bar (mean) colleagues famfriends onmyown dontplan, by(gender, note("") iytitle noytitle title("I will Apply what I learned"))
    Which is producing this graph:

    [Image: willapply.png]

    Instead of two separate graphs for female and male, I would like to graph the percentage of "yes" answers by gender next to each other on one graph.

    I've attached data of the relevant variables for the first 100 observations.

    Thank you.

  • #2
    Does the following get you closer to a solution, using over, instead of by?
    Code:
    sysuse auto, clear
    graph bar length displacement, over(foreign)


    • #3
      Thanks for the suggestion. Using over instead of by does get me one graph instead of two. See the following image.

      [Image: mean.png]

      Looking back on my post, I realize that this is not different from what I stated I was looking for. However (and this means I need to post more clearly), instead of Female and Male on the x-axis, with a separate bar for each of the means, I was hoping to have each of the 4 means on the x-axis, with 2 separate bars for each gender. Hopefully this is possible.
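
      A minimal sketch of one way to get that layout with official commands only (not something proposed in this thread): reshape the indicators to long form, then use two over() groups together with asyvars, so that the first group is drawn as differently coloured bars inside each category of the second. The auto data, the cut-offs, and the names id, q1-q3 and question are stand-ins chosen for illustration, not the survey variables.

      ```stata
      sysuse auto, clear

      * three 0/1 indicators standing in for the yes/no survey questions
      generate byte q1 = weight > 3000
      generate byte q2 = length > 190
      generate byte q3 = mpg > 20

      * one row per car per question
      generate id = _n
      keep id foreign q1 q2 q3
      reshape long q, i(id) j(question)

      label define question 1 "Heavy" 2 "Long" 3 "Economical"
      label values question question

      * questions on the axis, one bar per level of foreign within each question
      graph bar (mean) q, over(foreign) over(question) asyvars ytitle("proportion yes")
      ```

      With the survey data, the same idea would mean first renaming the four 0/1 indicators to a common stub, for example rename (colleagues famfriends onmyown dontplan) (q1 q2 q3 q4), and using gender in place of foreign.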

      On another note, I really appreciate the videos that you post on YouTube. They're great and have helped me out on numerous occasions.


      • #4
        Search the forum for mentions of statplot (SSC).

        Please note our longstanding advice to post Stata data examples, not spreadsheet attachments. FAQ Advice #12 explains why.
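
        For readers who have not used it, a minimal sketch of a dataex call, run here on the auto data shipped with Stata rather than on the survey data. The variables listed and the in 1/5 range are chosen only for illustration.

        ```stata
        * if dataex is not found, it can be installed with: ssc install dataex
        sysuse auto, clear

        * show three variables for the first five observations
        dataex make price foreign in 1/5
        ```

        The output is a block of input commands that can be copied and pasted into a forum post, so that others can recreate the variables exactly, including storage types and value labels.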


        • #5
          To show willing, I used a computer supporting MS Excel. (Believe it or not, many of the most active users here do not use spreadsheet applications.) This is the data posted in #1.

          Code:
           
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str12 A str1(B C D) str6 E
          "Y"            "N" "Y" "N" "Male"  
          "Y"            "N" "N" "N" "Male"  
          "N"            "N" "Y" "N" "Null"  
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "N" "N" "Male"  
          "Y"            "Y" "Y" "N" "Male"  
          "Y"            "N" "N" "N" "Female"
          "N"            "N" "Y" "N" "Male"  
          "N"            "N" "Y" "N" "Female"
          "Y (students)" "N" "Y" "N" "Null"  
          "Y"            "N" "N" "N" "Null"  
          "Y"            "N" "N" "N" "Other" 
          "N"            "N" "Y" "N" "Female"
          "Y"            "Y" "Y" "N" "Male"  
          "Y"            "Y" "Y" "N" "Female"
          "N"            "N" "N" "Y" "Female"
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Male"  
          "N"            "N" "Y" "N" "Male"  
          "Y"            "Y" "Y" "N" "Female"
          "Y"            "Y" "Y" "N" "Male"  
          "Y"            "N" "Y" "N" "Male"  
          "Y"            "N" "N" "N" "Male"  
          "N"            "N" "Y" "N" "Null"  
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "N" "N" "Male"  
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Null"  
          "Y"            "N" "Y" "N" "Male"  
          "Y"            "Y" "N" "N" "Null"  
          "N"            "Y" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Null"  
          "Y"            "N" "N" "N" "Female"
          "Y"            "Y" "Y" "N" "Female"
          "Y"            "N" "N" "N" "Female"
          "N"            "Y" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "Y"            "Y" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Male"  
          "N"            "N" "N" "N" "Female"
          "N"            "Y" "Y" "N" "Female"
          "Y"            "Y" "Y" "N" "Null"  
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Male"  
          "Y"            "Y" "Y" "N" "Male"  
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Male"  
          "Y"            "N" "Y" "N" "Null"  
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "Y"            "N" "N" "N" "Male"  
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "N" "N" "Male"  
          "N"            "Y" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Male"  
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "N"            "Y" "Y" "N" "Null"  
          "N"            "N" "Y" "N" "Male"  
          "Y"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Male"  
          "N"            "N" "N" "Y" "Male"  
          "Y"            "Y" "Y" "N" "Female"
          "N"            "Y" "Y" "N" "Null"  
          "N"            "N" "Y" "N" "Male"  
          "Y"            "N" "N" "N" "Male"  
          "N"            "N" "Y" "N" "Male"  
          "Y"            "Y" "Y" "N" "Male"  
          "Y"            "Y" "Y" "N" "Male"  
          "Y"            "N" "N" "N" "Female"
          "Y"            "Y" "Y" "N" "Null"  
          "Y"            "Y" "Y" "N" "Male"  
          "N"            "N" "Y" "N" "Null"  
          "N"            "N" "N" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Null"  
          "N"            "N" "N" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Null"  
          "Y"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Male"  
          "Y"            "Y" "Y" "N" "Null"  
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Null"  
          "N"            "N" "N" "N" "Female"
          "Y"            "N" "N" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          "Y"            "N" "Y" "N" "Male"  
          "Y"            "N" "N" "N" "Female"
          "Y"            "N" "Y" "N" "Female"
          "N"            "N" "Y" "N" "Female"
          end

          At this point I give up. It's too much of a puzzle to relate these data to your variables. And what about that errant value in the first variable?

          As we request in FAQ Advice #12, it is much, much better to use dataex from within Stata to show your variables.

          More positively, you can get much nicer graphs than #1. statplot is, I believe, an answer to your main question. But the default s2color scheme is not usually what anyone wants; you should use informative variable labels, not semi-cryptic variable names; and you can lose the legend.

          Showing the results of

          Code:
          contract colleagues famfriends onmyown dontplan gender
          would allow detailed, constructive advice.


          • #6
            Thank you. The results of

            Code:
            contract colleagues famfriends onmyown dontplan gender
            are

            Code:
            input long(gender colleagues famfriends onmyown dontplan) byte _freq
            1 0 0 0 0  6
            2 0 0 0 0  2
            4 0 0 0 0  1
            1 0 0 0 1  1
            2 0 0 0 1  1
            1 0 0 1 0 32
            2 0 0 1 0 11
            3 0 0 1 0  7
            1 0 1 1 0  4
            3 0 1 1 0  2
            1 1 0 0 0 12
            2 1 0 0 0 12
            3 1 0 0 0  3
            4 1 0 0 0  1
            1 1 0 1 0 15
            2 1 0 1 0  7
            3 1 0 1 0  5
            3 1 1 0 0  1
            1 1 1 1 0  8
            2 1 1 1 0 10
            3 1 1 1 0  7
            3 2 0 1 0  1


            • #7
              gender has values 1 2 3 4 there but in #1 you have just males and females. Please explain. Should we be ignoring 3 and 4?


              • #8
                Yes. My apologies. I am dropping values 3 and 4 which are, respectively, no answer and other.


                • #9
                  OK. Here I may well have male and female the wrong way round but you can reverse if so.

                  I'd continue to encourage informative variable labels; cryptic names on axes are not a good idea. If you did that, you may find that horizontal bars are a better idea and/or that variable labels need to go on two lines.

                  The order of variables is up for grabs too: perhaps onmyown colleagues famfriends dontplan


                  Code:
                  clear
                  input long(gender colleagues famfriends onmyown dontplan) byte _freq
                  1 0 0 0 0  6
                  2 0 0 0 0  2
                  4 0 0 0 0  1
                  1 0 0 0 1  1
                  2 0 0 0 1  1
                  1 0 0 1 0 32
                  2 0 0 1 0 11
                  3 0 0 1 0  7
                  1 0 1 1 0  4
                  3 0 1 1 0  2
                  1 1 0 0 0 12
                  2 1 0 0 0 12
                  3 1 0 0 0  3
                  4 1 0 0 0  1
                  1 1 0 1 0 15
                  2 1 0 1 0  7
                  3 1 0 1 0  5
                  3 1 1 0 0  1
                  1 1 1 1 0  8
                  2 1 1 1 0 10
                  3 1 1 1 0  7
                  3 2 0 1 0  1
                  end
                  
                  keep if inrange(gender, 1, 2)
                  
                  expand _freq
                  
                  set scheme s1color
                  
                  label define gender 1 male 2 female
                  label val gender gender
                  
                  * need --     ssc install statplot    --   before you can use it
                  statplot colleagues-dontplan, bar(1, bfcolor(green*0.4)) over(gender) recast(bar) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(percent yes)
                  Note: the extra option asyvars may seem a good idea. Then you can give males and females different colours.

                  [Image: saidutte.png]

                  Last edited by Nick Cox; 15 Oct 2019, 09:20.


                  • #10
                    Thank you. That solves my problem.

                    I'm also attempting to create a similar graph over an education variable that has three possibilities (graduate, bachelor's, associate's or less). In this case, I'm not using the recast option, as it would cause the axis-labels to overlap. My code is:

                    Code:
                    statplot colleagues-dontplan, bar(1, bfcolor(green*0.4)) over(education, label(labsize(vsmall))) ///
                    yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(percent) title("I will apply what I learned") ///
                    varopts(label(labsize(vsmall)))
                    Which produces a graph (attachment not shown) with variable labels that do not fit. I find this very puzzling, considering that this didn't happen the first time I ran the code. Any instructions on how to compress the graph or make the text fit are appreciated. Using a smaller text size makes the variables unreadable.


                    • #11
                      statplot is here just a wrapper for graph hbar, and you’re seeing how it behaves with too much text on the axes. I warned you in #9 that you might need to write axis labels on two lines and the problem is compounded with two levels of predictors.

                      Again if you post the data I can have a go at writing different code, although the next two days are busy for me.


                      • #12
                        Thank you for the nighttime response. Please see the data:

                        Code:
                        input long(education colleagues famfriends onmyown dontplan) byte _freq
                        1 0 0 0 0  1
                        2 0 0 0 0  2
                        3 0 0 0 0  6
                        2 0 0 0 1  2
                        1 0 0 1 0  5
                        2 0 0 1 0 15
                        3 0 0 1 0 30
                        1 0 1 1 0  1
                        3 0 1 1 0  5
                        1 1 0 0 0  2
                        2 1 0 0 0  8
                        3 1 0 0 0 17
                        1 1 0 1 0  3
                        2 1 0 1 0  8
                        3 1 0 1 0 16
                        2 1 1 0 0  1
                        1 1 1 1 0  3
                        2 1 1 1 0  9
                        3 1 1 1 0 13
                        2 2 0 1 0  1


                        • #13
                          Thanks. Note that the last category looks spurious as there is a value of 2.
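
                          One plausible source of that 2, sketched under assumptions: the data in #5 contain the string "Y (students)", and the code in #1 runs encode against a value label that defines only "Y" and "N". When a named value label already exists, encode uses it and adds to it as necessary, so an unlisted string would be expected to receive a new code of its own. The three-row dataset and the variable yes below are hypothetical.

                          ```stata
                          clear
                          input str12 Colleagues
                          "Y"
                          "N"
                          "Y (students)"
                          end

                          label define colleagues 1 "Y" 0 "N"
                          encode Colleagues, gen(colleagues)

                          * "Y (students)" is not in the label, so encode has to add a code for it
                          list, nolabel

                          * hypothetical alternative: a 0/1 indicator based on the first character only
                          generate byte yes = substr(Colleagues, 1, 1) == "Y"
                          list
                          ```

                          Whether "Y (students)" should count as a yes is a substantive decision for the data owner; the indicator above simply assumes it should.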

                          The main trick here is just as previously advised, splitting text in two lines. I corrected your punctuation and added "degree" for symmetry. There is surely enough space to improve on the variable names.

                          Code:
                          clear
                          input long(education colleagues famfriends onmyown dontplan) byte _freq
                          1 0 0 0 0  1
                          2 0 0 0 0  2
                          3 0 0 0 0  6
                          2 0 0 0 1  2
                          1 0 0 1 0  5
                          2 0 0 1 0 15
                          3 0 0 1 0 30
                          1 0 1 1 0  1
                          3 0 1 1 0  5
                          1 1 0 0 0  2
                          2 1 0 0 0  8
                          3 1 0 0 0 17
                          1 1 0 1 0  3
                          2 1 0 1 0  8
                          3 1 0 1 0 16
                          2 1 1 0 0  1
                          1 1 1 1 0  3
                          2 1 1 1 0  9
                          3 1 1 1 0 13
                          2 2 0 1 0  1
                          end
                          
                          label define education 1 `" "Associates" "or Less" "' 2 `" "Bachelor's" "degree" "' 3 `" "Graduate/" "Professional" "'
                          label val education education
                          
                          egen tokeep = rowmax(colleagues-dontplan)
                          
                          statplot colleagues-dontplan if inrange(tokeep, 0, 1) [fw=_freq], bar(1, bfcolor(green*0.4)) over(education, label(labsize(small))) ///
                          yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ytitle(percent) title("I will apply what I learned") ///
                          varopts(label(labsize(small))) ysize(7)

                          [Image: saidutte2.png]


                          • #14
                            Thank you again. I've improved the variable names. While I've been unsuccessful in splitting the value labels for education (probably a mistake on my end), I don't think that's the core problem. I'm producing 6 graphs total. The text fits for some of the graphs, while for others it runs off of the page. This seems to be completely random. I'll post the first graph, for which the text runs off the page, and the last graph, for which the text fits, despite the fact that it involves more text.

                            Code:
                            *encode Colleagues
                            label define colleagues 1 "Y" 0 "N", replace
                            encode Colleagues, gen(colleagues)
                            label define famfriends 1 "Y" 0 "N"
                            encode FamilyorFriends, gen(famfriends)
                            label define onmyown 1 "Y" 0 "N"
                            encode Onmyown, gen(onmyown)
                            label define dontplan 1 "Y" 0 "N"
                            encode Dontplantouse, gen(dontplan)
                            
                            *label colleagues
                            label variable colleagues     `" "w" "Colleagues" "'
                            label variable famfriends     `" "w Family," "Friends" "'
                            label variable onmyown        `" "On My" "Own" "'
                            label variable dontplan        `" "Don't Plan" "to Use" "'        
                            
                            *label education
                            label define education 1 `" "Associates" "or Less" "' 2 `" "Bachelors" "degree" "' 3 `" "Graduate/" "Professional" "'
                            
                            *encode Education
                            encode Education, gen(education)
                            label val education education
                            
                            preserve
                            
                            contract colleagues famfriends onmyown dontplan gender
                            
                            expand _freq
                            
                            keep if inrange(gender, 1, 2)
                            
                            egen tokeep = rowmax(colleagues-dontplan)
                            
                            statplot colleagues-dontplan if inrange(tokeep, 0, 1) [fw=_freq], bar(1, bfcolor(green*0.4)) ///
                            over(gender, label(labsize(vsmall))) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ///
                            ytitle(percent, size(small)) title("I will apply what I learned", size(small)) ///
                            varopts(label(labsize(vsmall))) ysize(7)
                            
                            restore

                            [Image: apply_gender.png]

                            Code:
                            *generate effective labels
                            label define effective 5 "Very confident" 4 "Confident" 3 "Somewhat confident" ///
                            2 "Somewhat inconfident" 1 "Not at all confident" 0 "No Answer"
                            
                            *encode Effective
                            encode ColleaguesStaffconsideryouan, gen(effective)
                            
                            *gen effective1-5
                            gen effective1=0
                            gen effective2=0
                            gen effective3=0
                            gen effective4=0
                            gen effective5=0
                            
                            replace effective1=1 if ColleaguesStaffconsideryouan=="1"
                            replace effective2=1 if ColleaguesStaffconsideryouan=="2"
                            replace effective3=1 if ColleaguesStaffconsideryouan=="3"
                            replace effective4=1 if ColleaguesStaffconsideryouan=="4"
                            replace effective5=1 if ColleaguesStaffconsideryouan=="5"
                            
                            *label effective1-effective5
                            label variable effective1    "Not at all confident"
                            label variable effective2    "Somewhat inconfident"
                            label variable effective3    "Somewhat confident"
                            label variable effective4    "Confident"
                            label variable effective5    "Very confident"
                            
                            preserve
                            
                            drop if Education=="No Answer"
                            
                            contract effective1 effective2 effective3 effective4 effective5 education
                            
                            expand _freq
                            
                            egen tokeep = rowmax(effective1-effective5)
                            
                            statplot effective1-effective5 if inrange(tokeep, 0, 1) [fw=_freq], bar(1, bfcolor(green*0.4)) ///
                            over(education, label(labsize(vsmall))) yla(0 0.25 "25" 0.5 "50" 0.75 "75", ang(h)) ///
                            ytitle(percent, size(small)) ///
                            title("How Confident Do You Feel That Your" "Colleagues and Staff Would" "Consider You an Effective Leader?", size(small)) ///
                            varopts(label(labsize(vsmall))) ysize(7)
                            
                            restore
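
                            One thing worth checking in both blocks above, separate from the text-fitting problem: they combine expand _freq with [fw=_freq]. The code in #9 used expand alone and the code in #13 used frequency weights alone. Doing both weights each cell by the square of its frequency. A small sketch with made-up numbers (the variables g and y and the counts are hypothetical):

                            ```stata
                            clear
                            input byte(g y) int _freq
                            1 0 3
                            1 1 1
                            end

                            * frequency weights alone: mean of y is 1/4 = 0.25
                            summarize y [fw=_freq]

                            * expanding and ALSO weighting: each cell counts _freq^2 times,
                            * so the mean becomes 1/(9 + 1) = 0.10
                            expand _freq
                            summarize y [fw=_freq]
                            ```

                            Dropping either the expand line or the [fw=_freq] qualifier should restore the intended percentages.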


                            • #15
                              Sorry, but I can't really add anything very helpful, as to me you're asking the same question with some details different. At some point there is just too much text for a graph to work well. I have flagged the main small tricks I know.

                              Note that when I tried to replicate #10 the graph was not nearly as bad as what you posted. It's possible that there is some sensitivity to operating system, although Stata tries not to let that bite. FWIW, I am using Windows, although not as a matter of taste.

                              I had to Google "Inconfident". Evidently I am British, and not cool, but I knew that. It sounds uncomfortably close to "incontinent". Unless the University of Chicago has banned apostrophes "Bachelor's" is still a possessive.
