  • How can you make a by() bar graph plot side-by-side bars on one plot rather than two subfigures?

    I want to change the attached graph so that instead of appearing as two sub-figures, there is just one plot with three pairs of bars, corresponding to:

    same and vaccinated
    same and not vaccinated
    decrease and vaccinated
    decrease and not vaccinated
    increase and vaccinated
    increase and not vaccinated

    My basic code to produce the graph right now is
    Code:
     graph hbar, by(vaccinated) over(prob_infection_change_sign)
    .

    How could I adjust this to get what I want?

    (Unfortunately, I can't share my actual data here as it is somewhat confidential. I've changed the names of variables to an unrelated topic. )

  • #2
    The question of confidential data is addressed explicitly in the FAQ Advice you were asked to read before posting.

    https://www.statalist.org/forums/help#stata

    We can understand your dataset only to the extent that you explain it clearly.

    The best way to explain it is to show an example. The community-contributed command dataex makes it easy to give simple example datasets in postings. It was written to support Statalist and its use is strongly recommended. Usually a copy of 20 or so observations from your dataset is enough to show your problem. See help dataex for details.

    As from Stata 15.1 (and 14.2 from 19 December 2017), dataex is included with the official Stata distribution. Users of Stata 15 (or 14) must update to benefit from this.

    Users of earlier versions of Stata must install dataex from SSC before they can use it. Type ssc install dataex in your Stata.

    The merits of dataex are that we see your data as you do in your Stata. We see whether variables are numeric or string, whether you have value labels defined and what is a consequence of a particular display format. This is especially important if you have date variables. We can copy and paste easily into our own Stata to work with your data.

    If your dataset is confidential, then provide a fake example instead.
    So, you are still asked to give real(istic) data numerically. Not doing so delayed any reply to this.
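
    A fake example can be built in a few lines. This is only a sketch: the variable names are taken from #1, while the seed, the 20 observations and the random values are arbitrary assumptions and bear no relation to the real data. It assumes Stata 14 or later for runiformint().

    ```stata
    * Fake data with the same structure as described in #1 (values are random)
    clear
    set seed 12345
    set obs 20
    * 0/1 indicator, roughly half and half
    generate byte vaccinated = runiform() < 0.5
    * three categories drawn with equal probability
    generate byte u = runiformint(1, 3)
    generate str8 prob_infection_change_sign = cond(u == 1, "Decrease", cond(u == 2, "Same", "Increase"))
    drop u
    * print the example in a form that can be pasted into a post
    dataex prob_infection_change_sign vaccinated
    ```

    The output of dataex can then be copied into a post between CODE delimiters.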

    The short answer is don't use by() at all. by() enforces separate panels, which is precisely what you don't want. Here is some technique.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(var1 var2 var3)
     0 0 61.82
    -1 0 18.97
     1 0 19.21
     0 1  62.8
    -1 1 30.08
     1 1  7.12
    end
    label values var1 change
    label def change -1 "Decrease", modify
    label def change 0 "Same", modify
    label def change 1 "Increase", modify
    label values var2 var2
    label def var2 0 "Vaccinated", modify
    label def var2 1 "No vaccination", modify
    
    graph hbar (asis) var3, over(var2) over(var1) asyvars scheme(s1color) aspect(1)
    I used a different scheme. The default scheme irritates, or does not appeal to, many users. I notice many complaints about it on social media, and far fewer comments that you should just use a different scheme if you don't like it.

    Decrease, same, increase is a logical order. You're at liberty to make it illogical again.
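
    The order follows the numeric codes of var1 (-1, 0, 1), not the text of the value labels. As a sketch, assuming the example data above are still in memory, the over() suboption descending reverses it:

    ```stata
    * Same graph as above, but categories run Increase, Same, Decrease
    graph hbar (asis) var3, over(var2) over(var1, descending) asyvars scheme(s1color) aspect(1)
    ```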

    Probability in percent terms needs more explanation than you give it.

    Naturally, almost all of the graphical choices here are debatable.

    [Image: noah.png]




    • #3
      Thank you Nick!



      • #4
        Hi again,
        Is there a way to do this without, as above, manually entering the percentage frequencies of prob_infection_change_sign given vaccinated?

        In particular, consider this example of the structure of the data I have. (Missing values are intentional.)

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str8 prob_infection_change_sign float vaccinated
        "Increase" 1
        "Decrease" .
        "Same"     0
        ""         1
        "Decrease" 0
        "Increase" 0
        "Same"     1
        "Same"     1
        "Increase" .
        "Decrease" 0
        ""         1
        "Decrease" 0
        "Increase" 0
        "Same"     1
        "Same"     1
        "Increase" .
        "Decrease" 0
        ""         1
        "Decrease" 0
        "Increase" 0
        "Same"     1
        "Same"     1
        "Increase" .
        "Decrease" 0
        "Same"     1
        "Same"     0
        "Same"     0
        "Increase" 1
        end
        I'd like to use just this data to be able to code for the same kind of graph as shown, very helpfully, by Nick above. I'd like the bars to represent the percentage frequency of the value of 'change' within the category of 'vaccinated'. So the three vaccinated bars should represent percents that sum to 100%, and the other three non-vaccinated bars should sum to 100%.

        Again, the closest I've come is
        Code:
         
         graph hbar, by(vaccinated) over(prob_infection_change_sign)
        , but this produces 2 sub-graphs where I want one. Alternatively,
        Code:
         graph hbar, over(vaccinated) over(prob_infection_change_sign)
        produces the right format of graph, but it has all 6 bars summing to 100%, instead of 2 pairs of 3 each summing to 100%.

        Any advice would be greatly appreciated!



        • #5
          I am trying to wrap my head around "2 pairs of 3" but this may help.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str8 prob_infection_change_sign float vaccinated
          "Increase" 1
          "Decrease" .
          "Same"     0
          ""         1
          "Decrease" 0
          "Increase" 0
          "Same"     1
          "Same"     1
          "Increase" .
          "Decrease" 0
          ""         1
          "Decrease" 0
          "Increase" 0
          "Same"     1
          "Same"     1
          "Increase" .
          "Decrease" 0
          ""         1
          "Decrease" 0
          "Increase" 0
          "Same"     1
          "Same"     1
          "Increase" .
          "Decrease" 0
          "Same"     1
          "Same"     0
          "Same"     0
          "Increase" 1
          end
          
          label def prob 1 "Decrease" 2 "Same" 3 "Increase"
          encode prob, gen(PROB) label(prob)
          label def vacc 0 "Vaccinated" 1 "No vaccination"
          label val vaccinated vacc
          
          set scheme s1color
          * ssc install catplot first
          catplot PROB vaccinated, percent(vaccinated) blabel(bar, format(%2.1f)) ysc(r(0 85))
          [Image: vaccination.png]
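
          For anyone who prefers official commands only, a similar graph can probably be had by computing the percents first. This is a sketch, not a tested replacement for catplot: it assumes the data, the encode step and the value labels from the code above are in memory, and the new variable names n, group_total and pct are made up for illustration.

          ```stata
          * Run after the input, encode and label lines above
          preserve
          * drop observations with either variable missing
          drop if missing(PROB, vaccinated)
          * one observation per combination, with its frequency in n
          contract PROB vaccinated, freq(n)
          * percent within each vaccination group, so each group sums to 100
          bysort vaccinated: egen group_total = total(n)
          generate pct = 100 * n / group_total
          graph hbar (asis) pct, over(vaccinated) over(PROB) asyvars ///
              blabel(bar, format(%4.1f)) ytitle("percent of vaccination group")
          restore
          ```

          Combinations that never occur in the data are simply absent from the contracted dataset, so no bar is drawn for them.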



          • #6
            Dear Nick Cox, apologies for reviving this thread, but how would your code in #5 (with the data used in #5 as well) change if we wanted the graph in #5 to be vertical, instead of horizontal, and therefore have "No vaccination" right underneath the corresponding categories in "Vaccinated"? So the scale "percent" would represent a vertical y-axis on the left hand side of the graph.



            • #7
              You would need minimally the extra option

              Code:
              recast(bar)
              together with whatever other tweaks are needed to make the graph readable.
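
              Applied to the command in #5, and assuming the data and labels from #5 are in memory and catplot is installed, that would be something like:

              ```stata
              * The command from #5 with vertical bars; other options unchanged
              catplot PROB vaccinated, percent(vaccinated) recast(bar) blabel(bar, format(%2.1f)) ysc(r(0 85))
              ```

              Axis label angles and legend placement may need attention once the bars are vertical.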



              • #8
                Thanks, I'll give it a try!
