  • how do I compare different variables over a treated variable in a bar chart?

    Dear all,
    I am stuck as to how to put all these bar counts into one graph.

    graph bar (count) if male==0, over(treated)
    graph bar (count) if male==1, over(treated)
    graph bar (count) if skill==0, over(treated)
    graph bar (count) if skill==1, over(treated)
    graph bar (count) if young==0, over(treated)
    graph bar (count) if young==1, over(treated)

    I want the graph to look like the one below; kindly assist.
    [Image: bar.png]
    Last edited by Sulemana Abdul-Karim; 23 Jul 2023, 08:02.

  • #2
    possibly a start.
    Code:
    sysuse auto, clear
    
    g male = foreign
    g female = 1-male
    g skilled = rep78<=3
    g unskilled = 1-skilled
    g treated = runiform()>0.5
    
    g group = 1*male + 2*skilled + 3*unskilled + 4*female
    graph bar price if ~treated , over(group)
    graph bar price if treated , over(group)



    • #3
      Thanks Ford. I have been working on it since you replied, but I still get something like a joint variable (e.g. skilled and male). I really want stand-alone variables. Thanks. Kindly assist.



      • #4
        On the face of it you want the graphical equivalent of three 2 x 2 tables, although the order of bars in your example

        control male
        control skilled
        control unskilled
        control female

        treated male
        treated skilled
        treated unskilled
        treated female

        makes no obvious sense to me: specifically, why young and old do not appear, why male and female are not next to each other, and, more crucially, why control and treated are not always next to each other. Whatever you want to compare most closely should usually be next to each other in a bar or dot chart.

        There is no data example here, which is always discouraging to people who answer questions.

        I made up a dataset with what I guess is your structure and then, after various experiments, concluded that you need a different data structure to do even a halfway decent job.


        Code:
        clear 
        set obs 100 
        set seed 2803 
        gen male = runiform() > 0.3 
        gen skill = runiform() > 0.4 
        gen young = runiform() > 0.5 
        gen treated = runiform() > 0.6 
        
        label def male 1 male 0 female 
        label def skill 1 skilled 0 unskilled
        label def young 1 young 0 old 
        label def treated 1 treated 0 control 
        
        foreach v of var * {
            label val `v' `v'
        }
        
        * start here 
        preserve 
        
        stack male treated skill treated young treated, into(which treated) clear 
        
        label def _stack 1 male 2 skilled 3 young  
        label val _stack _stack 
        
        label def treated 1 treated 0 control 
        label val treated treated 
        
        gen whichpc = 100 * which 
        graph bar whichpc, over(_stack) over(treated) ytitle(percent) blabel(bar, format(%2.1f)) name(G1, replace)
        
        graph bar whichpc, over(treated) over(_stack) ytitle(percent) blabel(bar, format(%2.1f)) name(G2, replace)
        
        gen WHICH = which + (_stack - 1) * 2 
        label def WHICH 0 female 1 male 2 unskilled 3 skilled 4 old 5 young 
        label val WHICH WHICH 
        
        graph bar (count) , over(WHICH) over(treated) asyvars name(G3, replace)
        
        graph bar (count), over(treated) over(WHICH) by(_stack, note("") row(1)) asyvars nofill name(G4, replace) /// 
        subtitle("", pos(9) fcolor(none) nobexpand)
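
The restructuring step is the heart of the code above. As a minimal sketch of what stack does, here is a toy dataset of three made-up observations (the values are purely illustrative and not from the thread). Each indicator is paired with treated, the pairs are appended vertically, and _stack records which pair each row came from.

```stata
* Toy data: three observations, purely illustrative
clear
input byte(male skill treated)
1 0 1
0 1 0
1 1 1
end

* Pair each indicator with treated; stack appends the pairs vertically
stack male treated skill treated, into(which treated) clear

* _stack is 1 for rows that came from male, 2 for rows that came from skill
list, sepby(_stack)
```

After this, which holds the 0/1 indicator values and treated is repeated alongside each block, so a single over() or by() on _stack can separate the indicators.
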

        Various generic and specific points arise from these examples.

        1. Whenever you have a (0, 1) binary variable, the proportion or percent with value 1 encodes all the information, as the other proportion or percent is the complement. G1 and G2 use this idea. G2 to my mind uses a better order than G1.

        2. Horizontal bars (not shown) are often clearer.

        3. There are many recipes for using contrasting colours, or not.
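
On point 2, a horizontal version of G2 might look like the sketch below. It rebuilds the same made-up data so that it runs on its own; the graph name G2h is just an illustrative choice. graph hbar accepts the same syntax as graph bar, and ytitle() still titles the numerical axis, which is now drawn horizontally.

```stata
* Same made-up data as in the code above
clear
set obs 100
set seed 2803
gen male = runiform() > 0.3
gen skill = runiform() > 0.4
gen young = runiform() > 0.5
gen treated = runiform() > 0.6

* Restructure: one block of rows per indicator, flagged by _stack
stack male treated skill treated young treated, into(which treated) clear

label def _stack 1 male 2 skilled 3 young
label val _stack _stack
label def treated 1 treated 0 control
label val treated treated

* Mean of a 0/1 variable times 100 is the percent with value 1
gen whichpc = 100 * which

* Horizontal bars: category labels no longer compete for width
graph hbar whichpc, over(treated) over(_stack) ytitle(percent) ///
    blabel(bar, format(%2.1f)) name(G2h, replace)
```
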

        Here is G1 first.
        [Image: msy_G1.png]

        Last edited by Nick Cox; 27 Jul 2023, 05:46.



        • #5
          G2 shows the same results as G1 but with control and treated next to each other.


          G3 seems closest of these to what you're asking for. I don't like the use of a legend. As I don't think this is a good idea, I didn't try to improve the details.

          G4 seems preferable to me.

          An over-arching comment is that while graph combine often seems a good idea, it can be even better to restructure so that you can use a by() option. More at https://journals.sagepub.com/doi/pdf...36867X20976341
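
For contrast, the graph combine route on the original (unstacked) layout might look roughly like the sketch below. It assumes the made-up data from #4, and the graph names g_male, g_skill, g_young and combined are illustrative only.

```stata
* Made-up data in the original wide layout
clear
set obs 100
set seed 2803
gen male = runiform() > 0.3
gen skill = runiform() > 0.4
gen young = runiform() > 0.5
gen treated = runiform() > 0.6

* One separate graph per indicator ...
foreach v in male skill young {
    graph bar (count), over(treated) over(`v') asyvars ///
        title(`v') name(g_`v', replace)
}

* ... then stitch the three graphs together in one row
graph combine g_male g_skill g_young, rows(1) name(combined, replace)
```

Each panel here carries its own legend and its own axis, which is the kind of repetition that restructuring and a by() option, as in G4, avoid.
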

          EDIT: The forum software seems to be playing up, or I am dopey today.

          This is G2. G3 and G4 will follow in separate posts.
          [Image: msy_G2.png]

          Last edited by Nick Cox; 27 Jul 2023, 05:38.



          • #6
            G3 is here.
            [Image: msy_G3.png]



            • #7
              Here is G4, finally.
              [Image: msy_G4.png]



              • #8
                Thanks a lot, Nick Cox. Your concerns were all valid; I should have presented the question correctly. However, your code and graphs have answered my question in so many different ways. Thanks.
