  • Looping with foreach and forval in order to get separate N (observations) and mean values reported in a bar graph

    /*** Dear statalisters ***/

    /*First of all: Any help on this matter is greatly appreciated. I feel I should solve this problem myself since the error feels kind of "obvious", but sadly my logical thinking is limited.
    In order to make it easier for anyone willing to help me, I have written this post so that everything can be copied and pasted into a stata do-file editor (at least I hope I am helping, but maybe it is more trouble than anything else)*/

    /*My problem is this: I am unable to figure out how to mend my code so that I
    get the correct number of observations reported in parentheses (on the left hand side of the graph bars).*/

    //******* MY CODE (re-written to be illustrated by use of Stata's auto.dta, since my own data is from a Norwegian student survey on quality in higher education) ******//

    sysuse auto, clear

    /*In the second and third line of code I am using tabstat to show you the numbers that I want to include in the graph.
    More specifically, for each value (1 - 5) of rep78 I need the mean values for the variables mpg and turn.
    In addition, I also need the number of observations for each value of rep78 for each of the two variables (mpg and turn)
    N is to be reported in the parentheses to the left side of the bars, and mean values are to be reported to the right side (outside) of the bars*/

    set more off
    bys rep78: tabstat mpg turn, stat(mean N) col(stat) format(%12.1f)

    /*As tabstat shows, when rep78 = 5 the variable mpg has a mean of 27.4 and N =11, and turn has a mean of 35.6 and N=11
    (by the way, in my own data N differs between the two variables, so I need to ask for separate Ns)
    Similarly, when rep78 = 4 the variable mpg has a mean of 21.7 and N=18, and turn has a mean of 38.5 and N=18.
    Of course, in a similar fashion I also want these numbers reported separately for rep78 = 1, rep78=2 and rep78=3*/


    //***** my problem ****//

    /*It seems where my code goes wrong has to do with which r(N)'s I succeed in asking Stata to keep in memory (return list).
    The way I have written the code has Stata report the correct means for each variable in each group of rep78,
    but the wrong N for rep78-values 1, 2, 3 and 4. Somehow (because of my lack of coding-experience) I have Stata "remembering" only N for rep78=5,
    which shows up in the graph as N=11 for all values of rep78 for each of the two variables (mpg and turn). What I want is the correct N's for each value of rep78 (for mpg and turn).*/

    //THE CODE:

    set more off
    foreach var of varlist mpg turn {
    forvalues i = 1(1)5 {
    su `var' if rep78 == `i'
    return list
    loc f`var' `"`r(N)'"'
    }
    }
    #delimit ;
    graph hbar (mean) mpg turn,
    over(rep78, relabel(1"1 repair" 2"2 repairs" 3"3 repairs" 4"4 repairs" 5"5 repairs" )
    gap(*2.5) label(labcolor(gs1)))
    showyvars
    yvaroptions(relabel(1 "mpg(n=`fmpg')" 2 "turn(n=`fturn')" 3 "mpg(n=`fnoe12')"
    4 "turn(n=`fnoe13')" 5 "mpg(n=`fnoe12')" 6 "turn(n=`fnoe13')" 7 "mpg(n=`fnoe12')" 8"turn(n=`fnoe13')" 9"mpg(n=`fnoe12')")
    gap(*1.5) label(labcolor(black) labsize(small)))
    bar(1, fcolor(eggshell)) bar(2, fcolor(ltkhaki)) bar(3, fcolor(olive_teal)) bar(4, fcolor(bluishgray)) bar(5, fcolor(ltblue))
    bar(6, fcolor(emidblue)) bar(7, fcolor(erose)) bar(8, fcolor(sandb)) bar(9, fcolor(dkorange))
    blabel(bar, pos(outside) format(%12.1f))
    ysize(3) yla(1(5)45)
    exclude0
    legend(off)
    plotregion(lcolor(none))
    scheme(s1mono)
    title("mpg and turn by repair records..."" ", size(medlarge) span)
    ytitle(" " "Scale: ....." " ",
    size(small))
    note("Note: .....", size(small) span)
    name( test_auto, replace);
    graph save test_auto, replace;
    #delimit cr




    /*Any help on my problem is very much appreciated*/

    /*Best wishes, Johanne*/



  • #2
    The problem is that your inner loop overwrites the same local macro on every pass, so it ends up containing only the count for the last category.
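
A minimal sketch of that point, assuming the loop-and-locals approach is kept: putting the category value into the macro name gives each variable-category pair its own local, so nothing is overwritten. The macro names (n_mpg_1 and so on) are illustrative and do not appear in the original code.

```stata
sysuse auto, clear

* One local per variable-category pair, e.g. n_mpg_1, ..., n_turn_5
* (these macro names are illustrative assumptions)
foreach var of varlist mpg turn {
    forvalues i = 1/5 {
        quietly summarize `var' if rep78 == `i'
        local n_`var'_`i' = r(N)
    }
}

* Each count is now kept separately instead of being overwritten
display "mpg, rep78 == 1: n = `n_mpg_1'"
display "mpg, rep78 == 5: n = `n_mpg_5'"
```

This only cures the overwriting. Getting a different label onto each bar is a separate matter.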

    That aside, I wouldn't do it this way. Your real problem has different numbers of categories, etc., and you shouldn't want to commit to lots of ad hoc programming with locals.

    I'd first get a dataset that is just what you want to plot and then and only then work on the appearance of the graph. This code ignores your carefully chosen appearance options and just shows how to get the basic results, after which you know how to tweak the graph.

    Code:
    set scheme s1color
    
    sysuse auto, clear
    replace mpg = . if inlist(_n, 42, 66)
    keep make rep78 mpg turn
    rename mpg Ympg
    rename turn Yturn
    reshape long Y, i(make) j(which) string
    egen count = count(Y), by(which rep78)
    gen toshow = which + " (n = " + string(count) + ")"
    collapse Y, by(rep78 toshow)
    
    list, sepby(rep78)
    
         +---------------------------------+
         | rep78          toshow         Y |
         |---------------------------------|
      1. |     1     mpg (n = 2)        21 |
      2. |     1    turn (n = 2)        41 |
         |---------------------------------|
      3. |     2     mpg (n = 8)    19.125 |
      4. |     2    turn (n = 8)    43.375 |
         |---------------------------------|
      5. |     3    mpg (n = 29)   19.1379 |
      6. |     3   turn (n = 30)   41.0667 |
         |---------------------------------|
      7. |     4    mpg (n = 18)   21.6667 |
      8. |     4   turn (n = 18)      38.5 |
         |---------------------------------|
      9. |     5    mpg (n = 10)      26.6 |
     10. |     5   turn (n = 11)   35.6364 |
         |---------------------------------|
     11. |     .     mpg (n = 5)      21.4 |
     12. |     .    turn (n = 5)      37.6 |
         +---------------------------------+
    
    
    
    graph hbar (asis) Y, over(toshow) over(rep78) nofill ytitle("")
    [Image: johanne.png]




    • #3
      Thank you so much for your quick reply, Nick Cox. I will try to do what you are doing with my own data, and if I get stuck I hope it is okay if I return to ask a follow-up question. Clearly, your way of doing it is much more to the point than my messy code. I have a lot to learn when it comes to writing efficient code rather than patching up on problems as I go.

      Thank you for helping me!

      Regards,
      Hilde



      • #4
        I think you have a challenging problem and should not feel embarrassed by your code at all. There was only one error in it.

        I was using an ancient version of Stata on a different computer and should underline that

        Code:
        rename mpg Ympg
        rename turn Yturn
        can in any recent Stata be replaced with

        Code:
        rename (mpg turn) (Y=)
        Further, this code puts n in italic:

        Code:
        set scheme s1color
        
        sysuse auto, clear
        replace mpg = . if inlist(_n, 42, 66)
        keep make rep78 mpg turn
        rename (mpg turn) (Y=) 
        reshape long Y, i(make) j(which) string
        egen count = count(Y), by(which rep78)
        gen toshow = which + " ({it:n} = " + string(count) + ")"
        collapse Y, by(rep78 toshow)
        
        list, sepby(rep78)
        
        graph hbar (asis) Y, over(toshow) over(rep78) nofill ytitle("")
        Last edited by Nick Cox; 23 Mar 2017, 06:42.
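
A small sketch for checking the grouped rename on its own, assuming a Stata version recent enough to support it. In the new-name pattern, = stands for each original variable name.

```stata
sysuse auto, clear

* In the new-name pattern, = stands for each original name,
* so mpg becomes Ympg and turn becomes Yturn
rename (mpg turn) (Y=)
describe Ympg Yturn
```

The common stub Y is what lets reshape long treat the two variables as one stacked variable.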



        • #5
          Hi again!

          Thank you for your words of kindness, Nick Cox, - it really means a lot to me. I must admit I posted my reply to your help before noticing that you had posted another reply, so what I am writing here might seem a bit "out of sync" with our correspondence (I now edited my post, but my "old" reply continues below). I have to leave work now, but tomorrow morning I will dive into the code you are suggesting in your last post. I'm sorry I have to write this in such a hurry, but I did not want to give the impression that I am ungrateful or anything like that. You helping me really means a lot, and I learn so much from it.



          Here is my reply, before editing:

          I rewrote your code to fit with my variable names and data, Nick Cox, and it worked well! Thank you so much for helping me!

          The first part of your code I had to do some reading in order to understand, but I learned a lot from it.

          Unfortunately though, I have a couple of "stupid" questions regarding how to get the graph to look the way I (or rather, my supervisor) need it to look.

          I have a feeling the reason I'm messing this up is that I am unfamiliar with how to treat the look of bars and labels when I am using - (asis) - and the -nofill- option. I looked it up in the Stata manual, but I am still having a hard time getting it right. (In my other graphs showing mean values, I have mostly used - (mean) - and - asyvars -)

          My problems are these:

          1) How do I get the "turn"-bar to appear before the "mpg"-bar for each value of rep78?

          2) How do I give the bars of "turn" and "mpg" different colors?

          3) How do I replace the "bar-names" turn and mpg with a legend (with different bar-colors) for turn and mpg? That is, the colors of the bars and the legend are together supposed to illustrate which bar is referring to which variable.

          Can you please help me..? I worry that questions like these are too "stupid" to deserve an answer. However, I figured I'd rather swallow my pride and get a "no", than not ask at all (which is usually what I do).

          Here is my code:

          First, I need value labels for rep78, so I'm adding that just to illustrate better:

          la def rep78 1"1 repair" 2"2 repairs" 3"3 repairs" 4"4 repairs" 5"5 repairs"
          la val rep78 rep78
          ta rep78

          Then I add some stuff (mean values, titles,notes) to the graph, but as you can see, the other things that I need to fix are not working properly.

          #delimit ;
          graph hbar (asis) Y, over(toshow) over(rep78) /*How do I get "turn" to appear before "mpg" (i.e. have the bars show in "reverse" order)?*/
          nofill
          bar(1, fcolor(eggshell) lcolor(black)) bar(2, fcolor(khaki) lcolor(black)) /*How do I get every other bar to have the color -khaki- ? */
          blabel(bar, pos(outside) format(%12.1f))
          legend(lab(1 "turn") lab(2 "mpg")size(small) /*How do I replace the "bar-names" turn and mpg with a legend for turn and mpg ?*/
          keygap(0.5) symxsize(5) col(2) pos(-2) ring(-1) span)
          ysize(3) yla(1(5)45)
          exclude0
          legend(off)
          plotregion(lcolor(none))
          scheme(s1mono)
          title("Some title..."" ", size(medlarge) span)
          ytitle(" " "Some scale" " ",
          size(small))
          note("Some note"
          "Some other note", size(small) span)
          name(name, replace);
          graph save name, replace;
          #delimit cr



          Regards,
          Hilde
          Last edited by Johanne Karlsen; 23 Mar 2017, 09:52.



          • #6
            I think this does almost everything you ask for.

            I can't approve of excluding zero on any bar chart, at least not with this example.

            I don't understand what you want with pos(-2) ring(-1)

            Please use CODE delimiters

            Code:
            like this
            as is explained in FAQ Advice #12.

            Code:
            set scheme s1color
            sysuse auto, clear
            replace mpg = . if inlist(_n, 42, 66)
            keep make rep78 mpg turn
            
            * !!! save variable labels because we need them later
            local label1 "`: var label turn'"
            local label2 "`: var label mpg'"
            
            rename (mpg turn) (Y=)
            reshape long Y, i(make) j(which) string
            egen count = count(Y), by(which rep78)
            gen toshow = which + " ({it:n} = " + string(count) + ")"
            
            collapse Y, by(rep78 toshow)
            list, sepby(rep78)
            
            graph hbar (asis) Y, over(toshow) over(rep78) nofill ytitle("")
            
            la def rep78 1"1 repair" 2"2 repairs" 3"3 repairs" 4"4 repairs" 5"5 repairs"
            la val rep78 rep78
            ta rep78
            
            * !!! two variables => two bar colours
            separate Y, by(strpos(toshow, "mpg") > 0)
            list Y*
            
            #delimit ;
            graph hbar (asis) Y?, over(toshow, descending) over(rep78) nofill
            bar(1, fcolor(eggshell) lcolor(black)) bar(2, fcolor(khaki) lcolor(black))
            legend(order(1 "`label1'" 2 "`label2'") size(small)
            keygap(0.5) symxsize(5) col(2) span)
            ysize(3) yla(0(5)45)
            plotregion(lcolor(none))
            scheme(s1mono)
            title("Some title..."" ", size(medlarge) span)
            ytitle(" " "Some scale" " ",
            size(small))
            note("Some note"
            "Some other note", size(small) span)
            name(name, replace);
            graph save name, replace;
            #delimit cr
            Last edited by Nick Cox; 23 Mar 2017, 11:19.
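
As a rough illustration of why separate yields two bar colours, here is a toy dataset with made-up values and hypothetical variable names (g, which, y). separate splits one variable into one new variable per group, and graph hbar then draws each variable with its own bar style.

```stata
clear
input byte g str4 which y
1 "mpg"  21
1 "turn" 41
2 "mpg"  19
2 "turn" 43
end

* by() with a true/false expression creates y0 (false) and y1 (true)
separate y, by(which == "mpg")
list

* Two plotted variables, hence two bar styles
graph hbar (asis) y0 y1, over(which) over(g) nofill
```

Each observation has a value in only one of y0 and y1, which is why nofill is needed to suppress the empty bars.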



            • #7
              Wow, I really cannot thank you enough, Nick Cox ! As you say, your code does almost everything I need, and the fine details I hope I can figure out by myself. I learn so much simply from reading your code, since I have to refer to the Stata manual in order to understand commands that I have not previously been aware of. This enables me to draw on a broader range of commands when working with data. Thank you also for notifying me on the code delimiter. For some reason I have not noticed that you all are using this to separate code from comments.

              I will now continue adapting your code, so that it works with my own data.

              And again, thank you so much for helping me, for being so patient and for making available this fantastic forum. It is really a marvelous resource to all Stata-users.

              Best wishes,
              Hilde



              • #8
                Thanks for the appreciation. StataCorp made the forum available and keep it going.



                • #9
                  Hi again,

                  I'm not sure if I am breaking any rule by doing this, but I have a question which has to do with the code in this post, and the title of a new post would basically be the same as for this one. Please advise me if I still need to ask this as a separate question in another post.

                  My problem is this: I have been trying to tweak it so that I can make the code in this post - i.e. the code from the auto.dta which Nick Cox helped me with in entry number #6 - run for more than two variables and still show N for each variable. However, regardless of what I do, I end up losing some of the information in the graph, most importantly, N, bar colors and/or legend with description of each bar.

                  Please believe me; I am not simply trying to have someone else do this work for me so that I don't have to. I would much have preferred to be able to understand this by myself, but my skills are, unfortunately, not sufficient.

                  If anyone can help me on this matter I will highly appreciate it. As of now I am simply generating several graphs and combining them into one image, but this is not ideal, particularly not when there are 3 or 5 variables.


                  Best wishes,
                  Hilde



                  • #10
                    There are no rules here against revisiting an old thread if the title still applies. On the contrary, it is a good idea.

                    The problem is different. You should show us your code and a reproducible example. Otherwise this is in essence "I changed the code and it no longer works".



                    • #11
                      Thank you, Nick. Then I will do that. I'll be back when I conjure up the right piece of code (it got to be quite many since I tried several different approaches).



                      • #12
                        Okay, so I hope I have correctly inserted code delimiters this time. What I have tried to do in the example below, is to include two more variables in the graph, so that there are four variables in total. I would like to be able to include more variables. For each variable I need the graph to show N and mean value, and I need the bars to have different colors and a legend which denotes the colors of the bars and which states the variable labels.

                        What I do not succeed in is giving each bar a different color. Also, the variables do not appear in the correct order in the final graph. I need the bars to appear in this order: mpg turn trunk headroom. In other words, mpg should be the first bar, then turn, trunk and headroom, and the legend too should reflect this (right now the legend shows the correct order but not colors for all the bars, and the bars themselves are jumbled)

                        Clearly, the problem with the bar colors (and the order of the bars?) is related at least to these lines of code, but I have not been able to find another command which does for 4 (or more) variables what this code does for 2 variables:

                        Code:
                        * !!! two variables => two bar colours
                        separate Y, by(strpos(toshow, "Mileage") > 0)
                        list Y*


                        //*** The example begins here ***//

                        Code:
                         sysuse auto, clear
                        
                        keep make rep78 mpg turn  trunk  headroom
                        replace mpg = . if inlist(_n, 42, 66)
                        
                        rename mpg      Mileage
                        rename turn     TurnCircle
                        rename trunk      TrunkSpace
                        rename headroom Headroom
                        
                        la var Mileage         "Mileage"
                        la var TurnCircle     "TurnCircle"
                        la var TrunkSpace     "TrunkSpace"
                        la var Headroom     "Headroom"
                        
                        * !!! save variable labels because we need them later
                        local label1 "`: var label Mileage'"
                        local label2 "`: var label TurnCircle'"
                        local label3 "`: var label TrunkSpace'"
                        local label4 "`: var label Headroom'"
                        
                        
                        rename (Mileage TurnCircle  TrunkSpace  Headroom) (Y=)
                        reshape long Y, i(make) j(which) string
                        egen count = count(Y), by(which rep78)
                        gen toshow = which + " (n = " + string(count) + ")"
                        
                        collapse Y, by(rep78 toshow)
                        
                        list, sepby(rep78)
                        
                        graph hbar (asis) Y, over(toshow) over(rep78) nofill ytitle("")  
                        
                        la def rep78 1"1 repair" 2"2 repairs" 3"3 repairs" 4"4 repairs" 5"5 repairs"
                        la val rep78 rep78
                        ta rep78
                        
                        * !!! two variables => two bar colours
                        separate Y, by(strpos(toshow, "Mileage") > 0)
                        list Y*        
                                
                        
                        // Must insert the 1-5 scale
                        capture foreach x in  Y? {                
                        g `x' = int(runiform()*44)+1
                        }
                        #delimit ;
                        graph hbar (asis) Y?, over(toshow, descending)stack  over(rep78) nofill
                        bar(1, fcolor(eggshell) lcolor(black)) bar(2, fcolor(khaki) lcolor(black))
                        blabel(bar, pos(outside) format(%12.1f))
                        legend(order(1 "`label1'" 2 "`label2'" 3 "`label3'" 4 "`label4'" 5 "`label5'" 6 "`label6'") size(vsmall)
                        keygap(0.5) symxsize(5) col(6)pos(-2) ring(-1) span)
                        ysize(3)
                            plotregion(lcolor(none))
                            scheme(s1mono)
                            title("Some title:", size(medlarge) span)
                            ytitle(" " "Some scale: 1 = low score, 5 = high score", size(small))
                            ylab(1(5)45) exclude0 ///
                            note("Note: Some note", size(small) span)
                            name(fig11b_autodta, replace);
                            graph save fig11b_autodta, replace;
                        #delimit cr
                        Final result (which needs some improvement):
                        [Image: fig11b_autodta.png]




                        I guess there is a really simple solution to this matter, but my limited experience with coding stands in the way of me seeing the solution.

                        Any advice is highly appreciated.



                        • #13
                          My suggestion is that the legend is redundant if you have the same explanatory text elsewhere. Different colours you can have without too much effort. You may find this code easier to use as a template:

                          Code:
                          sysuse auto, clear 
                          
                          collapse  ///
                          (count) countmpg=mpg  ///
                          (count) countturn=turn  ///
                          (count) counttrunk=trunk  ///
                          (count) countheadroom=headroom  ///
                          (mean) meanmpg=mpg    /// 
                          (mean) meanturn=turn    ///
                          (mean) meantrunk=trunk    ///
                          (mean) meanheadroom=headroom    ///
                          , by(rep78)
                          
                          reshape long count mean, i(rep78) j(which) string  
                          
                          label define order 1 mpg 2 turn 3 trunk 4 headroom 
                          encode which, gen(order) label(order) 
                          
                          label def order 1 "text on mpg", modify 
                          label def order 2 "text on turn", modify 
                          label def order 3 "text on trunk", modify 
                          label def order 4 "text on headroom", modify 
                          
                          gen detail = " ({it:n} = " + string(count) + ")"
                          
                          egen group = group(order detail), label 
                          
                          separate mean, by(order) 
                          
                          graph hbar (asis) mean?, over(group) over(rep78) nofill legend(off)

                          [Image: karlsen.png]



                          • #14
                            Thank you so much Nick! I have not yet run your code, but I will test it now. From what I can see, your code simplifies my whole process a lot, which is really helpful.

                            Regarding the legend I completely agree with you. However, I am just a minion in this matter. My supervisor wants the graphs to look a certain way and I am trying my best to give him that. He takes some advice from me, but he has his reasons for wanting it a certain way. What he really wants is a legend but no label to the left of the bars (i.e. only "N"). The way I have solved this so far is by manually deleting the labels (using the Stata Graph Editor) and leaving only the N in parentheses.

                            Is there a way to drop the labels to the left of N, and rather have a legend show the labels and the colors of the bars?

                            I am really sorry to bother you with such minor details.
                            Last edited by Johanne Karlsen; 09 May 2017, 02:27.



                            • #15
                              You can get closer to that by

                              1. making the value labels such as "text on mpg" just spaces

                              2. switching the legend on

                              3. supplying variable labels for the separate variables.

                              Try something like

                              Code:
                              sysuse auto, clear 
                              
                              local j = 0 
                              foreach v in mpg turn trunk headroom { 
                                  local ++j 
                                  local lbl`j' `"`: var label `v''"'
                                    if `"`lbl`j''"' == "" local lbl`j' "`v'" 
                              }
                              
                              collapse  ///
                              (count) countmpg=mpg  ///
                              (count) countturn=turn  ///
                              (count) counttrunk=trunk  ///
                              (count) countheadroom=headroom  ///
                              (mean) meanmpg=mpg    /// 
                              (mean) meanturn=turn    ///
                              (mean) meantrunk=trunk    ///
                              (mean) meanheadroom=headroom    ///
                              , by(rep78)
                              
                              reshape long count mean, i(rep78) j(which) string  
                              
                              label define order 1 mpg 2 turn 3 trunk 4 headroom 
                              encode which, gen(order) label(order) 
                              
                              label def order 1 " ", modify 
                              label def order 2 " ", modify 
                              label def order 3 " ", modify 
                              label def order 4 " ", modify 
                              
                              gen detail = "({it:n} = " + string(count) + ")"
                              
                              egen group = group(order detail), label 
                              
                              separate mean, by(order) 
                              
                              forval j = 1/4 { 
                                  label var mean`j' `"`lbl`j''"'
                              }
                              
                              graph hbar (asis) mean?, over(group) over(rep78) nofill
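
For 3, 5 or more variables, the same steps might be driven from a single variable list, so that only one line changes. This is a sketch under that assumption. The local macro names (vars, counts, means, define, nvars) are illustrative, and the logic is otherwise the code above with the repeated lines built in a loop.

```stata
sysuse auto, clear

* Assumption: the variables to plot, in display order; edit as needed
local vars mpg turn trunk headroom

local counts
local means
local define
local j = 0
foreach v of local vars {
    local ++j
    * Save variable labels now, before collapse replaces them
    local lbl`j' `"`: var label `v''"'
    if `"`lbl`j''"' == "" local lbl`j' "`v'"
    * Accumulate the pieces that were typed out by hand above
    local counts `counts' count`v'=`v'
    local means `means' mean`v'=`v'
    local define `define' `j' `v'
}
local nvars = `j'

collapse (count) `counts' (mean) `means', by(rep78)
reshape long count mean, i(rep78) j(which) string

* Numeric codes follow the order of the variable list
label define order `define'
encode which, gen(order) label(order)

* Blank out the text left of the counts, as in the code above
forvalues j = 1/`nvars' {
    label define order `j' " ", modify
}

gen detail = "({it:n} = " + string(count) + ")"
egen group = group(order detail), label
separate mean, by(order)

* The legend text comes from the variable labels of mean1, mean2, ...
forvalues j = 1/`nvars' {
    label var mean`j' `"`lbl`j''"'
}

graph hbar (asis) mean1-mean`nvars', over(group) over(rep78) nofill
```

The variable range mean1-mean`nvars' is used instead of mean? so that the graph command also works when there are ten or more variables.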
