  • How to add the average of a variable as a dot for each country on a graph bar?

    Hi all,

    I would like to add the average of a second variable to the graph below, shown as a dot for each country.

    Code:
    foreach var of varlist rule_law gov_effec gov_exp_health_gdp infant_mortality {
        local var_title: variable label `var'
        graph bar (mean) `var', over(com_exp, label(labcolor(maroon) labsize(small))) ///
            over(country, gap(*.4) label(labcolor(midblue) angle(90) labgap(*1) labsize(tiny)) sort(`var')) asyvars ///
            scheme(vg_brite) nofill ytitle("{bf: Average `var_title'}", size(vsmall)) ///
            legend(size(small) nobox region(lcolor(white))) ///
            title("{bf: Average `var_title' by Commodity Category over 2000-2016}", size(vsmall))
        * graph export is a separate command, so the title() line must not end in ///
        graph export "C:\Users\Perron\Downloads\Clairant\avga_`var'.png", as(png) width(1200) replace
    }
    [Attached image: Graph.PNG]

  • #2
    Hi Firmin,

    It would help a lot if you shared a snippet of data. The FAQs, linked at the top left of this page, give a good justification for doing so (it makes a helpful response more likely). That said, if you want to overlay plots on top of each other, you likely want to use twoway (not graph bar).

    I've made up some fake data to demonstrate the skeleton code I've written below, and I've had to guess entirely how your data are structured.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 country float(infantmortality measure mean) byte country2
    "Argentina"     15 96.6 41.58 2
    "Argentina"     15 18.5 41.58 2
    "Argentina"     15 63.2 41.58 2
    "Argentina"     15   27 41.58 2
    "Argentina"     15  2.6 41.58 2
    "Chile"         10 17.1 43.94 1
    "Chile"         10 82.3 43.94 1
    "Chile"         10 77.6 43.94 1
    "Chile"         10  7.9 43.94 1
    "Chile"         10 34.8 43.94 1
    "Columbia"      25 61.8  47.1 3
    "Columbia"      25 13.3  47.1 3
    "Columbia"      25 89.1  47.1 3
    "Columbia"      25 39.4  47.1 3
    "Columbia"      25 31.9  47.1 3
    "Liberia"      100 69.6 63.32 6
    "Liberia"      100 99.1 63.32 6
    "Liberia"      100 76.2 63.32 6
    "Liberia"      100 41.4 63.32 6
    "Liberia"      100 30.3 63.32 6
    "South Africa"  60 60.2 51.82 4
    "South Africa"  60 42.8 51.82 4
    "South Africa"  60 68.8 51.82 4
    "South Africa"  60 19.1 51.82 4
    "South Africa"  60 68.2 51.82 4
    "Sudan"         80 19.3 21.32 5
    "Sudan"         80 17.8 21.32 5
    "Sudan"         80 24.8 21.32 5
    "Sudan"         80 26.9 21.32 5
    "Sudan"         80 17.8 21.32 5
    end
    label values country2 country2
    label def country2 1 "Chile", modify
    label def country2 2 "Argentina", modify
    label def country2 3 "Columbia", modify
    label def country2 4 "South Africa", modify
    label def country2 5 "Sudan", modify
    label def country2 6 "Liberia", modify
    Code:
    * The example data above already contain -mean-; on your own data, create it like this:
    capture drop mean
    bysort country: egen mean = mean(measure) // generate the mean by country
    
    **Skeleton code for a graph (graph attached below):
    twoway (bar infantmortality country2, barw(0.4)) ///
           (scatter mean country2, mlcol(white) msize(medlarge)), ///
           xlab(, valuelabel) graphregion(col(white)) ylab(, angle(0)) xtitle("") ///
           legend(lab(1 "Infant mortality") lab(2 "Mean of some other var") region(c(none)))
    Another option is to collapse the data by country, extracting the mean of infant mortality and this other variable you want to plot.
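
The collapse route mentioned here could look roughly like the sketch below. It assumes the same layout as the fake data (several rows per country) and uses a hypothetical name, othermean, for the collapsed mean; the three countries and their values are made up for illustration.

```stata
* Made-up data: two rows per country, as in the fake data above
clear
input byte country2 float(infantmortality measure)
1 10 17.1
1 10 82.3
2 15 96.6
2 15 18.5
3 80 19.3
3 80 17.8
end
label define country2 1 "Chile" 2 "Argentina" 3 "Sudan"
label values country2 country2

* One row per country: mean infant mortality and mean of the other variable
collapse (mean) infantmortality othermean = measure, by(country2)

* Bars for the first mean, dots for the second
twoway (bar infantmortality country2, barwidth(0.4)) ///
       (scatter othermean country2, msize(medlarge)), ///
       xlabel(1/3, valuelabel) xtitle("") ///
       legend(order(1 "Infant mortality" 2 "Mean of other variable"))
```

After collapse the dataset holds exactly one observation per country, so each bar and each dot is drawn once rather than once per original row.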

    A larger problem though is that the scale for the new variable may be significantly different from the scale of the existing plot. If so, you may want to consider adding a second y-axis on the right-hand side.
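
A second y-axis of the kind suggested here can be sketched as follows. The data are invented and already reduced to one row per country; othermean is a hypothetical variable on a much smaller scale than infantmortality.

```stata
* Made-up, already-collapsed data: one row per country
clear
input byte country2 float(infantmortality othermean)
1 10 0.44
2 15 0.42
3 80 0.21
end
label define country2 1 "Chile" 2 "Argentina" 3 "Sudan"
label values country2 country2

* Bars use the left axis, dots use a second axis on the right
twoway (bar infantmortality country2, barwidth(0.4) yaxis(1)) ///
       (scatter othermean country2, msize(medlarge) yaxis(2)), ///
       xlabel(1/3, valuelabel) xtitle("") ///
       ytitle("Infant mortality", axis(1)) ///
       ytitle("Mean of other variable", axis(2)) ///
       legend(order(1 "Infant mortality" 2 "Mean of other variable"))
```

With two axes the dots are no longer squashed near zero, at the cost of the reader having to check which axis applies to which series.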
    [Attached image: Graph.png]

    Last edited by Chris Larkin; 22 Jun 2018, 22:27.



    • #3
      Chris: Good answer, except that the good folks of Colombia might want to underline that they are some way from DC.

      Firmin: I would make that graph horizontal bars, not vertical.
      Last edited by Nick Cox; 23 Jun 2018, 01:05.



      • #4
        Dear Chris and Nick,

        Many thanks for your suggestions. Unfortunately, I have not obtained the desired graph. I used collapse to calculate the averages by country, but the output is not what I wanted.

        Code:
        twoway (bar av_rule_law countrys, barw(0.1) sort(av_rule_law )) (scatter av_rule_law_2000 countrys, mlcol(white) msize(medlarge)), ///
        xlab(,valuelabel) graphregion(col(white)) ylab(,angle(0)) xtitle("") legend(lab(1 "Infant mortality") lab(2 "Mean of some other var") region(c(none)))
        Please find attached an example of my data.

        With Chris's code, I get the following graph.
        [Attached image: Chris_proposal.PNG]

        Not all countries appear on the axis (I have about 78 countries). In addition, I would like to distinguish the bars by commodity category (variable com_exp in my dataset), as I did in the chart below, and to sort the countries from the lowest value to the highest.
        [Attached image: Graph.PNG]


        Thanks,

        Best regards,




        • #5
          You got what you asked for. With xla(, valuelabel) you get default choices for tick positions. With something more like xla(1/78, valuelabel ang(v)) you will see more labels, but as already implied by #1 you will struggle to make such a graph readable.
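
A small sketch of the difference described here, on invented data with 12 labelled categories instead of 78. The variable names id and y and the "Country #" labels are made up; the upper limit of the tick list is read from the data rather than typed in.

```stata
* Made-up data: 12 categories with value labels
clear
set seed 2018
set obs 12
generate byte id = _n
generate y = runiform()
forvalues i = 1/12 {
    label define id `i' "Country `i'", modify
}
label values id id

* Find the highest category code so the tick list covers every bar
summarize id, meanonly
local n = r(max)

* One labelled tick per category, with vertical labels
twoway bar y id, barwidth(0.7) ///
       xlabel(1/`n', valuelabel angle(vertical) labsize(small)) xtitle("")
```

With xlabel(, valuelabel) alone, Stata picks a handful of round tick positions, which is why only some countries were labelled.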



          • #6
            Originally posted by Nick Cox
            Chris: Good answer, except that the good folks of Colombia might want to underline that they are some way from DC.
            Haha, I actually think it was the university in the City of New York that was front of mind!

            Firmin, you still haven't shared any data. You say you have, but you definitely haven't; you need to use dataex (SSC) as described in the FAQs. I think with Nick's comment you should have all you need now anyway. I completely agree that a horizontal display would be much better in this instance. Making up the fake data for #2, I almost got a crick in my neck from trying to read your first graph.

            If you're still not getting what you want, let me know (and share some data).





            • #7
              Hi Firmin,

              Re-reading #4, I see that your question about how to give some bars a different color, depending on that country's value in some other variable, remains unanswered. There may be a smarter way of doing this, but I've just overlaid two bar graphs (one where othercat==0 and another where othercat==1) and made the bars different colors. I've also taken the liberty of showing what the graph would look like if you make it horizontal and add another x-axis.

              Code:
              clear
              input str12 country float(infantmortality measure mean) byte country2 float othercat
              "Argentina"     15 96.6 41.58 2 0
              "Argentina"     15 18.5 41.58 2 0
              "Argentina"     15 63.2 41.58 2 0
              "Argentina"     15   27 41.58 2 0
              "Argentina"     15  2.6 41.58 2 0
              "Chile"         10 17.1 43.94 1 1
              "Chile"         10 82.3 43.94 1 1
              "Chile"         10 77.6 43.94 1 1
              "Chile"         10  7.9 43.94 1 1
              "Chile"         10 34.8 43.94 1 1
              "Colombia"      25 61.8  47.1 3 0
              "Colombia"      25 13.3  47.1 3 0
              "Colombia"      25 89.1  47.1 3 0
              "Colombia"      25 39.4  47.1 3 0
              "Colombia"      25 31.9  47.1 3 0
              "Liberia"      100 69.6 63.32 6 0
              "Liberia"      100 99.1 63.32 6 0
              "Liberia"      100 76.2 63.32 6 0
              "Liberia"      100 41.4 63.32 6 0
              "Liberia"      100 30.3 63.32 6 0
              "South Africa"  60 60.2 51.82 4 0
              "South Africa"  60 42.8 51.82 4 0
              "South Africa"  60 68.8 51.82 4 0
              "South Africa"  60 19.1 51.82 4 0
              "South Africa"  60 68.2 51.82 4 0
              "Sudan"         80 19.3 21.32 5 1
              "Sudan"         80 17.8 21.32 5 1
              "Sudan"         80 24.8 21.32 5 1
              "Sudan"         80 26.9 21.32 5 1
              "Sudan"         80 17.8 21.32 5 1
              end
              label values country2 country2
              label def country2 1 "Chile", modify
              label def country2 2 "Argentina", modify
              label def country2 3 "Colombia", modify
              label def country2 4 "South Africa", modify
              label def country2 5 "Sudan", modify
              label def country2 6 "Liberia", modify
              Code:
              **Skeleton code for a graph (graph attached below):
              twoway (bar infantmortality country2 if othercat == 0, hori col(orange*0.5) barw(0.7) yaxis(1) xaxis(1)) ///
                     (bar infantmortality country2 if othercat == 1, hori col(magenta*0.4) barw(0.7) yaxis(1) xaxis(1)) ///
                     (scatter country2 mean, mlcol(white) msize(large) yaxis(1) xaxis(2)), ///
                     graphregion(col(white)) bgcol(white) ytit("") ylab(,valuelabel angle(0)) ///
                     xlab(,angle(0) axis(1)) xsc(titlegap(*8) axis(1)) xtit("Infant mortality", axis(1)) ///
                     xlab(,angle(0) axis(2)) xsc(titlegap(*8) axis(2)) xtit("Average value of other var", axis(2)) ///
                     legend(lab(1 "Not some other cat") lab(2 "Some other cat") order(1 2) region(c(none)))
              Last edited by Chris Larkin; 23 Jun 2018, 11:51.



              • #8
                Many thanks, Nick.

                Is it possible to add over() to graph bar to distinguish the countries by commodity-exporting type, for example exporting countries as blue bars and non-exporters as green bars?

                Is it also possible to sort according to the values of the average infantile_rate?

                Thanks !

                Best.



                • #9
                  Dear Chris,

                  Many thanks, very very useful.



                  • #10
                    Dear Chris,
                    I have not used two axes because the difference in scale between the two quantitative variables was small. Thanks again for your code.
                    Is it possible to sort the countries by one of the averages?

                    Code:
                    twoway (bar av_rule_law countrys if com_exp == 1,  col(orange*0.5) barw(0.7) )  ///
                           (bar av_rule_law countrys if com_exp == 2,  col(magenta*0.4) barw(0.7))  ///
                           (scatter av_rule_law_2000 countrys, mlcol(white) msize(medlarge) ), ///
                           xlab(1/78,valuelabel ang(v) labsize(tiny) labcolor(midblue)) graphregion(col(white)) ylab(,angle(0)) xtitle("") ///
                           legend(lab(1 "Oil exporting countries") lab(2 "Non-oil exporting countries")  region(c(none)))
                    [Attached image: exp_graph.PNG]



                    • #11
                      Sure it is! Really, it would be great if you shared some data; then I'll know how your country variable is formatted.

                      I used sencode (SSC) to do it.
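
For readers without sencode installed, the same idea can be sketched with official commands only: sort the collapsed data by the value of interest, use the row number as the x position, and attach the country names as value labels. The four countries and their values below are invented; the variable names follow the code posted in this thread, and one row per country is assumed.

```stata
* Made-up, already-collapsed data: one row per country
clear
input str12 country float av_rule_law byte com_exp
"Chile"      1.2 1
"Sudan"     -1.3 2
"Argentina" -0.4 1
"Liberia"   -0.9 2
end

* Order countries from lowest to highest value; the row number becomes x
sort av_rule_law
generate byte countrys = _n

* Attach each country name as the value label of its position
forvalues i = 1/`=_N' {
    label define countrys `i' "`=country[`i']'", modify
}
label values countrys countrys

twoway (bar av_rule_law countrys if com_exp == 1, barwidth(0.7)) ///
       (bar av_rule_law countrys if com_exp == 2, barwidth(0.7)), ///
       xlabel(1/`=_N', valuelabel angle(vertical)) xtitle("") ///
       legend(order(1 "com_exp == 1" 2 "com_exp == 2"))
```

Replacing the sort line with sort com_exp av_rule_law would instead group the countries by category first and order them by value within each category.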



                      • #12
                        Dear Chris,

                        Many thanks for your help. Please find the dataset attached:


                        Code:
                        encode country, gen(countrys)
                        
                        twoway (bar av_rule_law_2016 countrys if com_exp == 1,  col(orange*0.5) barw(0.7) )  ///
                               (bar av_rule_law countrys if com_exp == 2,  col(magenta*0.4) barw(0.7))  ///
                               (scatter av_rule_law_2000 countrys, mlcol(white) msize(medlarge) ), ///
                               xlab(1/78,valuelabel ang(v) labsize(tiny) labcolor(midblue)) graphregion(col(white)) ylab(,angle(0)) xtitle("") ///
                               legend(lab(1 "Oil exporting countries") lab(2 "Non-oil exporting countries")  region(c(none)))
                        Finally, I want the bars to show the year 2016 and the scatter to show 2000, with the data sorted by the 2016 values and by commodity-exporting type (com_exp).

                        Best regards,





                        • #13
                          Hi Firmin,

                          Reading the help file for sencode (ssc) will give you what you need.

                          If you'd read the FAQ as I suggested in #2 and #6, you'd know that people on Statalist are unlikely to download data sets from this forum and that you should use dataex (SSC). I also mention dataex explicitly in #6.

                          So, share your data using dataex and I may be able to help, or perhaps someone else wants to download your dataset, or someone else can help without doing so.



                          • #14
                            Dear Chris,

                            Many thanks for suggesting dataex. I will use it soon, as you suggested.

                            I used sencode to convert my string country variable to the numeric variable countrys.

                            But the bars are still not sorted from the smallest to the largest.

                            Code:
                            
                            sencode country, g(countrys)
                            
                            twoway (bar av_rule_law countrys if com_exp == 1,  col(orange*0.5) barw(0.7)  )  ///
                                   (bar av_rule_law countrys if com_exp == 2,  col(magenta*0.4) barw(0.7) )  ///
                                   (scatter av_rule_law_2000 countrys, mlcol(white) msize(medlarge) ), ///
                                   xlab(1/78,valuelabel ang(v) labsize(tiny) labcolor(midblue)) graphregion(col(white)) ylab(,angle(0)) xtitle("") ///
                                   legend(lab(1 "Oil exporting countries") lab(2 "Non-oil exporting countries") order(1 2 3) region(c(none)))
                            Do you know why?

                            Thanks,

                            Best,



                            • #15
                              No I do not. If you shared data using dataex I could see how your variables are encoded and I could probably tell you. That is the necessary first step.

                              Taken from the FAQs:

                              As from Stata 15.1 (and 14.2 from 19 December 2017), dataex is included with the official Stata distribution. Users of Stata 15 (or 14) must update to benefit from this.

                              Users of earlier versions of Stata must install dataex from SSC before they can use it. Type ssc install dataex in your Stata.

                              The merits of dataex are that we see your data as you do in your Stata. We see whether variables are numeric or string, whether you have value labels defined and what is a consequence of a particular display format. This is especially important if you have date variables. We can copy and paste easily into our own Stata to work with your data.
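
As a rough illustration of the dataex call described in the FAQ, assuming a Stata version that ships dataex (or an SSC install), the sketch below lists three variables for the first three observations. The rows are invented; the variable names follow the code posted earlier in this thread.

```stata
* Made-up stand-in for the real dataset
clear
input str12 country float av_rule_law byte com_exp
"Chile"      1.2 1
"Sudan"     -1.3 2
"Argentina" -0.4 1
"Liberia"   -0.9 2
end

* Print a copy-and-paste data example for a subset of variables and rows
dataex country av_rule_law com_exp in 1/3
```

The output in the Results window is what gets pasted into a forum post between code delimiters.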
