  • How to overlay distributions using a pie chart using StataMP 17? (plot distributions on the same graph)

    Dear Statalist,

    I hope you are well.

    Could you please advise on how to overlay distributions in a pie chart using StataMP 17?

    In my paper, I made a pie chart using the syntax below and found that the distribution of ‘i’ is more extreme for micro-businesses. I have therefore been advised to overlay the distributions (the advice was: just plot both distributions on the same graph).

    Code:
    graph pie, over( Int_RATE_01 ) plabel(_all percent) by(, legend(on)) by( Ent_size )
    I used a second command to make a pie chart of the interest rate distribution by bank finance application status (whether finance was received or not):

    Code:
    graph pie, over( Int_RATE_01 ) plabel(_all percent) by(, legend(on)) by( Finance_App_Status )

    The related questionnaire item is: What was the average interest rate you paid for the loan? The variable is categorical, with the following response categories:
    1. 0-3% (51 responses)
    2. 4-5% (7 responses)
    3. 6-7% (13 responses)
    4. 8-9% (12 responses)
    5. Over 10% (0 responses)
    Here is a partial dataset of 83 firms (small and medium enterprises):

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int firm float Ent_size long(Int_RATE_01 Finance_App_Status)
     1 4 2 0
     2 4 1 1
     3 3 1 0
     4 3 4 1
     5 4 4 1
     6 3 1 1
     7 3 4 1
     8 1 4 1
     9 1 3 1
    10 1 1 1
    11 1 1 1
    12 1 2 1
    13 3 1 0
    14 1 1 0
    15 1 2 1
    16 3 1 1
    17 4 3 1
    18 4 1 1
    19 4 1 1
    20 1 1 1
    21 3 1 0
    22 1 1 1
    23 4 1 1
    24 1 1 1
    25 3 4 1
    26 1 1 1
    27 3 3 1
    28 1 4 1
    29 4 2 1
    30 3 3 1
    31 4 1 1
    32 1 1 0
    33 3 2 1
    34 4 3 1
    35 1 3 1
    36 3 3 1
    37 1 1 1
    38 3 4 1
    39 3 1 1
    40 3 1 1
    41 3 1 1
    42 1 1 1
    43 3 1 1
    44 3 1 1
    45 1 1 1
    46 1 4 1
    47 1 4 1
    48 4 4 1
    49 4 2 1
    50 1 1 1
    51 4 1 1
    52 3 3 1
    53 1 1 1
    54 4 1 1
    55 3 3 1
    56 4 1 1
    57 4 1 1
    58 4 1 1
    59 3 3 1
    60 1 4 1
    61 3 1 1
    62 1 1 1
    63 3 1 1
    64 4 3 1
    65 4 1 1
    66 3 1 1
    67 3 1 1
    68 1 1 1
    69 1 1 1
    70 3 1 1
    71 4 2 1
    72 3 1 1
    73 4 3 1
    74 4 1 1
    75 1 4 1
    76 3 1 1
    77 1 1 1
    78 1 1 1
    79 1 1 1
    80 3 3 1
    81 4 1 1
    82 4 1 1
    83 3 1 1
    end
    label values Ent_size Ent_size
    label def Ent_size 1 "Micro", modify
    label def Ent_size 3 "small", modify
    label def Ent_size 4 "Medium", modify
    label values Int_RATE_01 Int_RATE
    label def Int_RATE 1 "0 -3%", modify
    label def Int_RATE 2 "4 -5%", modify
    label def Int_RATE 3 "6 -7%", modify
    label def Int_RATE 4 "8 -9%", modify
    label values Finance_App_Status Needing_Funding
    label def Needing_Funding 0 "Not Recieved", modify
    label def Needing_Funding 1 "Recieved", modify
    I am struggling with how to answer the above question.

    I greatly appreciate your help.

    Best regards,
    Rabab
    Last edited by Rabab Al hasni; 27 Sep 2021, 08:09.

  • #2
    I don't get a clear picture of how you want to modify your pie charts. I would rather start from the position that even simple pie charts are inherently problematic and are not made better by making them more complicated.

    Backing up, it seems that you have a three-way table with interest rate as the outcome to explain and the other variables as predictors. That suggests to me a three-way bar chart using tabplot from the Stata Journal.

    Here is a sample effort. The major paper was in 2016, the latest software update was in 2020 and there's an executive summary at https://www.statalist.org/forums/for...updated-on-ssc

    . search tabplot, sj

    Search of official help files, FAQs, Examples, and Stata Journals

    SJ-20-3 gr0066_2 . . . . . . . . . . . . . . . . Software update for tabplot
    (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
    Q3/20 SJ 20(3):757--758
    added new options frame() and frameopts() allowing framing
    of bars and so-called thermometer plots or charts

    SJ-16-2 gr0066 . . . . . . Speaking Stata: Multiple bar charts in table form
    (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
    Q2/16 SJ 16(2):491--510
    provides multiple bar charts in table form representing
    contingency tables for one, two, or three categorical variables



    Code:
    tabplot Int_ Ent_, by(Fin, t1title(Needing funding) note("")) yreverse bfcolor(green*0.2) percent(Ent_ Fin) showval  xtitle(Entity size) ytitle(Interest rate) subtitle(, fcolor(blue*0.1)) scheme(s1color)
    I made some tweaks to your data example (thanks!) removing spaces and fixing typos. More importantly, there is plenty of scope to modify this plot, e.g. by showing frequencies rather than percents or basing the percents differently.
    [Graph: int_ent_fin.png]
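
    As a rough sketch of the "frequencies rather than percents" variation mentioned above: the command below is a cut-down version of the one in this post with percent() dropped, so each bar shows a count of firms. It assumes tabplot is installed from SSC. To stay self-contained it rebuilds the data from cell counts tallied by hand from the data example in #1; the variable name freq and the label name Received are made up for this illustration.

    ```stata
    * one-off install, if needed:  ssc install tabplot
    clear
    input byte(Ent_size Finance_App_Status Int_RATE_01) int freq
    1 0 1  2
    3 0 1  3
    4 0 2  1
    1 1 1 17
    1 1 2  2
    1 1 3  2
    1 1 4  6
    3 1 1 15
    3 1 2  1
    3 1 3  7
    3 1 4  4
    4 1 1 14
    4 1 2  3
    4 1 3  4
    4 1 4  2
    end
    * turn the cell counts back into one observation per firm
    expand freq
    label define Ent_size 1 "Micro" 3 "Small" 4 "Medium"
    label define Int_RATE 1 "0-3%" 2 "4-5%" 3 "6-7%" 4 "8-9%"
    label define Received 0 "Not received" 1 "Received"
    label values Ent_size Ent_size
    label values Int_RATE_01 Int_RATE
    label values Finance_App_Status Received

    * no percent() option, so bar heights and showval labels are frequencies
    tabplot Int_RATE_01 Ent_size, by(Finance_App_Status, note("")) ///
        yreverse showval xtitle(Entity size) ytitle(Interest rate)
    ```

    Frequencies make it obvious that the "not received" panel rests on only 6 firms, which is worth bearing in mind when reading percents for that group.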




    • #3
      I agree with Nick (again). However, visually the variable needing funding is given a special role (there is a colored box around its values); what you really want is for the two predictors to have equal "visual weight" (if one gets a box, then the other gets a box as well). This is what the twby prefix can do. To install it, type ssc install twby in Stata. You have to do more work with twby, so it is not as convenient as tabplot.

      Code:
      clear
      input int firm float Ent_size long(Int_RATE_01 Finance_App_Status)
       1 4 2 0
       2 4 1 1
       3 3 1 0
       4 3 4 1
       5 4 4 1
       6 3 1 1
       7 3 4 1
       8 1 4 1
       9 1 3 1
      10 1 1 1
      11 1 1 1
      12 1 2 1
      13 3 1 0
      14 1 1 0
      15 1 2 1
      16 3 1 1
      17 4 3 1
      18 4 1 1
      19 4 1 1
      20 1 1 1
      21 3 1 0
      22 1 1 1
      23 4 1 1
      24 1 1 1
      25 3 4 1
      26 1 1 1
      27 3 3 1
      28 1 4 1
      29 4 2 1
      30 3 3 1
      31 4 1 1
      32 1 1 0
      33 3 2 1
      34 4 3 1
      35 1 3 1
      36 3 3 1
      37 1 1 1
      38 3 4 1
      39 3 1 1
      40 3 1 1
      41 3 1 1
      42 1 1 1
      43 3 1 1
      44 3 1 1
      45 1 1 1
      46 1 4 1
      47 1 4 1
      48 4 4 1
      49 4 2 1
      50 1 1 1
      51 4 1 1
      52 3 3 1
      53 1 1 1
      54 4 1 1
      55 3 3 1
      56 4 1 1
      57 4 1 1
      58 4 1 1
      59 3 3 1
      60 1 4 1
      61 3 1 1
      62 1 1 1
      63 3 1 1
      64 4 3 1
      65 4 1 1
      66 3 1 1
      67 3 1 1
      68 1 1 1
      69 1 1 1
      70 3 1 1
      71 4 2 1
      72 3 1 1
      73 4 3 1
      74 4 1 1
      75 1 4 1
      76 3 1 1
      77 1 1 1
      78 1 1 1
      79 1 1 1
      80 3 3 1
      81 4 1 1
      82 4 1 1
      83 3 1 1
      end
      label values Ent_size Ent_size
      label def Ent_size 1 "Micro", modify
      label def Ent_size 3 "small", modify
      label def Ent_size 4 "Medium", modify
      label values Int_RATE_01 Int_RATE
      label def Int_RATE 1 "0 -3%", modify
      label def Int_RATE 2 "4 -5%", modify
      label def Int_RATE 3 "6 -7%", modify
      label def Int_RATE 4 "8 -9%", modify
      label values Finance_App_Status Needing_Funding
      label def Needing_Funding 0 "Not Recieved", modify
      label def Needing_Funding 1 "Recieved", modify
      
      // set the scheme
      set scheme s1color
      
      // make distance between size categories equal
      replace Ent_size = Ent_size -1 if Ent_size > 1
      label def Ent_size 2 "small", modify
      label def Ent_size 3 "Medium", modify
      
      // add some variable labels, which will be used to label the axes
      label var Int_RATE_01 "Interest rate"
      label var Ent_size "Size of enterprise"
      label var Finance_App_Status "Application status"
      
      // create the table
      contract Int_RATE_01 Ent_size Finance_App_Status, nomiss zero
      egen tot = total(_freq), by(Ent_size Finance_App_Status)
      gen perc = _freq / tot * 100
      
      // to display the numbers
      format perc %5.0f
      gen y = -10
      
      // the graph
      twby Ent_size Finance_App_Status,                            ///
              compact left xoffset(0.5) legend(off)                ///
              title("Percentage in each interest category"         ///
                    "given Application status and size") :         ///
          twoway bar perc Int_RATE_01,                             ///
              xlab(1/4, val) barwidth(.8) ylab(none) ytitle("") || ///
          scatter y Int_RATE_01 if perc > 0,                       ///
          msymbol(none) mlab(perc) mlabpos(0) mlabcolor(black)
      [Graph: Graph.png]
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------



      • #4
        Dear Nick Cox and Maarten Buis,


        Many thanks for your prompt reply. Your explanations are very helpful.

        In fact, the reviewer sent me this comment: "Distribution of ‘i’ more extreme for micro-businesses. Could you overlay these distributions?" regarding the pie chart below:

        Code:
        graph pie, over( Int_RATE_01 ) plabel(_all percent) by(, legend(on)) by( Ent_size )
        So I was wondering how to overlay the distributions in the pie chart using StataMP 17. I thought the reviewer might want me to overlay two pie charts for a related questionnaire item.
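
        One possible reading of "overlay these distributions", offered only as a guess at what the reviewer means, is to draw the percent distribution of interest rate for each enterprise size on the same axes, so that the three shapes can be compared directly. The sketch below uses official commands only. The counts are tallied by hand from the data example in #1, and freq, tot and perc are names made up for this illustration.

        ```stata
        clear
        input byte(Ent_size Int_RATE_01) int freq
        1 1 19
        1 2  2
        1 3  2
        1 4  6
        3 1 18
        3 2  1
        3 3  7
        3 4  4
        4 1 14
        4 2  4
        4 3  4
        4 4  2
        end
        label define Int_RATE 1 "0-3%" 2 "4-5%" 3 "6-7%" 4 "8-9%"
        label values Int_RATE_01 Int_RATE

        * percent of firms in each interest rate category, within enterprise size
        egen tot = total(freq), by(Ent_size)
        gen perc = 100 * freq / tot

        * one connected line per enterprise size, all on the same axes
        twoway connected perc Int_RATE_01 if Ent_size == 1, sort ||   ///
               connected perc Int_RATE_01 if Ent_size == 3, sort ||   ///
               connected perc Int_RATE_01 if Ent_size == 4, sort      ///
               xlabel(1/4, valuelabel) xtitle("Interest rate")        ///
               ytitle("Percent of firms in size class")               ///
               legend(order(1 "Micro" 2 "Small" 3 "Medium") rows(1))
        ```

        With these counts, micro firms have the largest share in both the lowest category (19 of 29, about 65.5%) and the highest observed category (6 of 29, about 20.7%), which may be what "more extreme" refers to.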

        May I ask what the 100.0 means that appears under each bar for those who have not received finance in Nick Cox's graph in #2? Is it a number or a percentage, and how should I read it?


        If I would like to overlay the distributions using a two-way bar chart of interest rate and enterprise size, is the syntax below correct (using the same dataset)?

        Code:
        tabplot Int_RATE_01 Ent_size , yreverse bfcolor(green*0.2) percent(Ent_ Fin) showval xtitle( Ent_size ) ytitle( Int_RATE_01 ) subtitle(, fcolor(blue*0.1)) scheme(s1color)
        How should I interpret the figure that appears under each bar of the chart?


        Many thanks for being available to help us in improving our analysis through Stata.

        Best regards,
        Rabab
        Last edited by Rabab Al hasni; 28 Sep 2021, 12:49.



        • #5
          Dear Nick Cox and Maarten Buis,

          The syntax below shows the number of firms under each bar. How can I instead get the percentage of enterprises that have or have not received finance, by interest rate?

          Code:
          tabplot Int_RATE_01 Ent_size , by( Finance_App_Status , compact note("")) showval(mlabsize(*.5)) ysize(7)

          I have tried to paste the graph here but it does not work, so I have uploaded it as an attachment to this post.


          Greatly appreciate your help

          Best regards,
          Rabab
          Last edited by Rabab Al hasni; 28 Sep 2021, 13:38.



          • #6
            Various comments on #4 and #5.

            The graphs in #2 and #3 show the probability of a value in each interest rate category, given entity size and finance received, as a %. All the values in the data example for micro and finance not received fall into the lowest interest rate category, so 100% is shown. And so on. What is shown follows in my case from the call to the option percent() and in Maarten's case from his calculating the percent concerned before he calls up the graph. In a paper I would explain that on the graph or through a text caption.

            If you want to see frequencies, omit the percent() call in tabplot or for Maarten's approach calculate the frequencies beforehand.

            If you wish to see different percents, vary the percent() call in tabplot. See the help for examples.
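
            The arithmetic behind those percents can be checked by hand. The sketch below is not tabplot's internal code, just the same conditioning done with official commands, on cell counts tallied by hand from the data example in #1 (freq, tot and perc are illustrative names).

            ```stata
            clear
            input byte(Ent_size Finance_App_Status Int_RATE_01) int freq
            1 0 1  2
            3 0 1  3
            4 0 2  1
            1 1 1 17
            1 1 2  2
            1 1 3  2
            1 1 4  6
            3 1 1 15
            3 1 2  1
            3 1 3  7
            3 1 4  4
            4 1 1 14
            4 1 2  3
            4 1 3  4
            4 1 4  2
            end

            * denominator: number of firms in each size-by-finance combination
            egen tot = total(freq), by(Ent_size Finance_App_Status)
            gen perc = 100 * freq / tot
            format perc %5.1f

            sort Ent_size Finance_App_Status Int_RATE_01
            list, sepby(Ent_size Finance_App_Status) noobs
            ```

            For micro firms that did not receive finance, both firms fall in the 0-3% category, so the percent is 2/2 = 100. For micro firms that did receive finance, 17 of 27 are in that category, about 63%.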

            Sorry, but I don't understand what the reviewer is asking for any more than I did in #2. You will need to contact them. If the review was anonymous, the journal editors may be willing to forward a question to the reviewer. They may be wanting you to add numeric annotation or to add something graphical. There is no point in my trying to guess, not least because the reviewer appears to think that pie charts are a good start, and I strongly disagree with that.

            The graph attachment in #5 doesn't display properly because you need to attach .png, not .gph; this is explained in the FAQ Advice.



            • #7

              Dear Nick Cox,

              Many thanks for your clarification

              I will try to re-attach the graph again using gph

              Thank you very much

              Best regards,
              Rabab



              • #8
                No; use .png not .gph; that is the point. You can save your graph as .png within Stata from a Graph window.
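
                The same export can be scripted rather than done through the Graph window menu. A minimal sketch with official commands only; the toy data, the graph name G1 and the file name example_graph.png are placeholders, not anything from this thread.

                ```stata
                clear
                set obs 4
                generate x = _n
                generate y = x^2
                scatter y x, name(G1, replace)

                * write the named graph to a .png file in the current working directory
                graph export "example_graph.png", name(G1) replace
                ```

                The resulting .png file is what should be attached to a post; a .gph file can be opened only within Stata.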



                • #9
                  Ok, Nick
                  Thank you very much for giving your time to help and sorry for my silly mistakes when I post enquiries through the Stata forum.

                  Best regards,
                  Rabab



                  • #10
                    Dear Nick Cox,

                    For one of the attached graphs, I found out how to make the bars show percentages. Thank you for that.

                    Regarding the other attachment, may I ask whether the syntax below describes the two-way table of interest rate and enterprise size?

                    Code:
                    tabplot Int_RATE_01 Ent_size , by( Finance_App_Status , compact note("")) showval(mlabsize(*.5)) ysize(7)
                    I have tried to make the bars of this graph show percentages that add up to 100% in total, but could not. Could you please help with that?


                    Many thanks,

                    Best regards,
                    Rabab


                    • #11
                      We can see a graph now, which is good progress.

                      I can't see that the code you show bears any exact relation to the graph you show. First off, in your code the option by() would imply a graph with two panels. Second off, the subtitle shown does not appear in your code and (believe me) would not appear unless you asked for it.

                      Backing up, various possibilities include


                      Code:
                      percent
                      each bar shows a percent of the total for all data

                      Code:
                      percent(Ent_size)
                      percents add to 100 within groups of Ent_size

                      similar comments for any other variable

                      Code:
                      percent(Ent_size Fin)
                      percents add to 100 within cross-combinations of those two variables.

                      It seems to me that you are guessing wildly and trying all kinds of syntax and getting confused even on which syntax produces which graph.

                      That's no good for you, and for anyone trying to understand what is going on.

                      When you're developing graph code you can do this: add an option like

                      Code:
                      name(G1, replace)
                      and then that graph will appear in window G1 and you can follow what that syntax does. Similarly you can have other graphs with commands including say

                      Code:
                      name(G2, replace)
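
                      Putting the pieces together, here is a sketch of how named graphs keep the percent variants apart. It assumes tabplot is installed from SSC and rebuilds the two-way data from cell counts tallied by hand from the data example in #1; freq is a made-up variable name.

                      ```stata
                      * one-off install, if needed:  ssc install tabplot
                      clear
                      input byte(Ent_size Int_RATE_01) int freq
                      1 1 19
                      1 2  2
                      1 3  2
                      1 4  6
                      3 1 18
                      3 2  1
                      3 3  7
                      3 4  4
                      4 1 14
                      4 2  4
                      4 3  4
                      4 4  2
                      end
                      * turn the cell counts back into one observation per firm
                      expand freq

                      * G1: each bar is a percent of all firms
                      tabplot Int_RATE_01 Ent_size, percent showval name(G1, replace)

                      * G2: percents add to 100 within each enterprise size
                      tabplot Int_RATE_01 Ent_size, percent(Ent_size) showval name(G2, replace)

                      * both graphs stay in memory under their own names
                      graph dir
                      graph display G1
                      ```

                      Because each command carries its own name(), it stays clear which syntax produced which graph, and rerunning one command replaces only that graph.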



                      • #12
                        Dear Nick Cox,

                        Thank you for your patience.

                        I am so sorry: I pasted the wrong syntax in #10. The syntax that produced the graph attached to #10 is this:

                        Code:
                        tabplot Int_RATE_01 Ent_size , yreverse bfcolor(green*0.2) percent(Ent_ Fin) showval xtitle( Ent_size ) ytitle( Int_RATE_01 ) subtitle(, fcolor(blue*0.1)) scheme(s1color)

                        My question there was whether it describes a two-way or a one-way bar chart.

                        Your explanations have helped me to answer my research question and solve the related issue. Indeed, I value and respect your help very much.


                        Thank you very much indeed for your attention and time.

                        Best regards,
                        Rabab



                        • #13
                          Thanks for the warm closure. Glad it worked.
