  • Graphing, Labels, and Missing Data

    I am trying to create a graph using the code below but am running into an issue. I'm creating a graph composed of multiple graphs, one for each year between 1989 and 1995. The first two graphs (1989 and 1990) show the category "Below 30" in one color (on my screen, blue). The remaining graphs (1991-1995) don't have any observations falling into the "Below 30" category, so the first category present in those years ("30-50") is drawn in the same blue that was used for "Below 30" in the 1989 and 1990 graphs. This makes it difficult to look at trends over time. I'm wondering if there's a way to get all graphs to use the same color for each category and to share the same legend.

    I would greatly appreciate any advice Stata experts have about this.

    Thank you,
    Erika

    /*Code*/
    *Using the cigconsump dataset that can be found here http://www.stata-press.com/data/imeus.html.
    cd C:/ado/plus/imeus/

    use ../imeus/cigconsump, clear
    generate region = "N. Centr" if inlist(state, "OH", "MI", "KS", "IA", "WI")
    replace region = "N. East" if inlist(state, "NH", "CT", "RI", "VT", "MA")
    replace region = "South" if inlist(state, "AL", "GA", "LA", "TX", "FL")
    replace region = "West" if inlist(state, "CA", "WA", "OR", "UT", "CO")
    drop if (region == "")

    generate tax_tier = 1 if taxs < 30
    replace tax_tier = 2 if taxs >= 30 & taxs < 50
    replace tax_tier = 3 if taxs >= 50 & taxs < 60
    replace tax_tier = 4 if taxs >= 60 & taxs < 70
    replace tax_tier = 5 if taxs >= 70

    label define tax_tier_label 1 "Below 30" 2 "30-50" 3 "50-60" 4 "60-70" 5 "Over 70"
    label values tax_tier tax_tier_label

    generate ones = 1
    keep if year > 1988

    save ../imeus/cigconsump_cleaned, replace

    forvalues x = 1989/1995{
    use ../imeus/cigconsump_cleaned, clear
    keep if year == `x'
    graph bar (count) ones, over(tax_tier, label(labsize(vsmall))) over(region, label(labsize(vsmall))) stack asyvars title(`x')
    graph save ../imeus/1_`x'.gph, replace
    collapse (sum) ones, by(region tax_tier)
    bys region : egen total_region = total(ones)
    generate p_total_region = round(ones/total_region, 0.01)*100
    graph bar (asis) p_total_region, over(tax_tier, label(labsize(vsmall))) over(region, label(labsize(vsmall))) stack asyvars title(`x')
    graph save ../imeus/2_`x'.gph, replace
    }

    graph combine ../imeus/1_1989.gph ///
    ../imeus/1_1990.gph ///
    ../imeus/1_1991.gph ///
    ../imeus/1_1992.gph ///
    ../imeus/1_1993.gph ///
    ../imeus/1_1994.gph ///
    ../imeus/1_1995.gph

  • #2
    You can set the color of each bar to whatever you want; see -h barlook_options-


    • #3
      Thank you for the quick response, Rich. When I add the following line of code after the first graph bar command, nothing changes.

      bar(1, color(gs7)) bar(2, color(stone)) bar(3, color(teal)) bar(4, color(ltblue)) bar(5, color(gray))

      I think the issue is that the first bar category is different depending on the year, and I'm not sure how to address that. Apologies if I am missing something in the help file.

      Best,
      Erika


      • #4
        The essence of your problem is that bar charts produced separately don't know anything about others' internals. So, don't do that...

        catplot (SSC) can do what you want without obliging the user to invent the machinery of counting 1s separately, combining graphs, and so forth. Assuming that catplot is installed, this is a self-contained script showing some technique.

        Code:
         
        clear 
        copy http://www.stata-press.com/data/imeus/cigconsump.dta cigconsump.dta, replace 
        u cigconsump 
        
        generate region = "N. Centr" if inlist(state, "OH", "MI", "KS", "IA", "WI")
        replace region = "N. East" if inlist(state, "NH", "CT", "RI", "VT", "MA")
        replace region = "South" if inlist(state, "AL", "GA", "LA", "TX", "FL")
        replace region = "West" if inlist(state, "CA", "WA", "OR", "UT", "CO")
        drop if (region == "")
        
        generate tax_tier = 1 if taxs < 30
        replace tax_tier = 2 if taxs >= 30 & taxs < 50
        replace tax_tier = 3 if taxs >= 50 & taxs < 60
        replace tax_tier = 4 if taxs >= 60 & taxs < 70
        replace tax_tier = 5 if taxs >= 70
        
        label define tax_tier_label 1 "Below 30" 2 "30-50" 3 "50-60" 4 "60-70" 5 "Over 70" 
        label values tax_tier tax_tier_label
        
        catplot tax_tier region if inrange(year, 1989, 1994), by(year) stack asyvars

        At a guess, the dataset you used is just a toy and not central to your real problem, so I've held off suggesting better colours and so forth, but I note that here, as indeed often, horizontal bar charts are friendlier.
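
        For readers following along, a minimal sketch of two related steps. First, catplot is user-written and needs a one-time install from SSC. Second, the same idea (one graph command with by(year) rather than separately saved graphs) can be tried with official graph bar; this assumes the data prepared by the script above are in memory, and that by() keeps the tax_tier categories and their colours aligned across panels, which is worth checking on your own output. The variable name ones is taken from #1.

        ```stata
        * One-time install of the user-written catplot command from SSC
        ssc install catplot

        * Sketch only: a single graph bar call with by(year), so that all
        * panels are drawn by one command and share a single legend
        generate ones = 1
        graph bar (count) ones if inrange(year, 1989, 1995), ///
            over(tax_tier) over(region, label(labsize(vsmall))) ///
            by(year) stack asyvars
        ```

        If specific colours are wanted, bar(#, color()) options like those tried in #3 could be added to this single call; whether bar(1) then maps to "Below 30" in every panel is an assumption to verify on the resulting graph.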


        • #5
          Thank you for the advice; catplot works wonderfully.
