
  • Catplot column percentage

    Hello all,

    I am trying to produce a vertical bar graph using 3 categorical variables:

    1. numeracy (measure of respondents' financial literacy level): 5 categories
    2. gender: 2 categories
    3. country: 4 categories

    I want to display, in a single graph, the percentage of males and the percentage of females that have numeracy==5 in each country (for each country I want two bars, one for male and one for female).
    Using
    Code:
    tab3way numeracy gender country, colpct
    I obtain the following table with the values that I want to plot (e.g. for Austria I want to plot two bars: 28.89% for male and 20.52% for female).

    ---------------------------------------------------------------------------
              | Country identifier and Male or female
              |  --- Austria --  --- Germany --  --- Sweden ---  - Netherland -
     numeracy |    Male  Female    Male  Female    Male  Female    Male  Female
    ----------+----------------------------------------------------------------
            1 |     177     369     277     525     127     329     175     366
              |    5.95    8.86    5.97    9.95    3.24    7.45    4.82    8.54
              |
            2 |      73     209     201     434     128     336     141     368
              |    2.46    5.02    4.33    8.23    3.27    7.61    3.88    8.58
              |
            3 |     509     888     930    1295    1172    1541     797    1384
              |   17.12   21.31   20.05   24.55   29.94   34.90   21.94   32.28
              |
            4 |    1355    1846    1959    2108    1221    1442    1115    1215
              |   45.58   44.30   42.23   39.95   31.20   32.65   30.70   28.34
              |
            5 |     859     855    1272     914    1266     768    1404     954
              |   28.89   20.52   27.42   17.32   32.35   17.39   38.66   22.25
    ---------------------------------------------------------------------------

    By using this
    Code:
    mylabels 0(20)80, myscale(@) local(pctlabel) suffix("%")
    splitvallabels numeracy, length(5) nobreak recode
    catplot gender country numeracy if numeracy==5, percent(country) legend(ring(0) position(10)) asyvars recast(bar) var2opts(label(ang(45))) ytitle("")
    I obtained the row percentages. I want to obtain a similar graph, but with column percentages instead of row percentages.

    Thanks for your help!
    Elisa

  • #2
    Code:
    tab3way
    mylabels
    splitvallabels
    catplot
    are all community-contributed commands, so by FAQ Advice #12 you are asked to explain where they come from.
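
    One way to check where installed commands come from is Stata's official which command, which reports the location of each ado-file and any version lines in its header. This is only a sketch: it assumes the four commands are already installed, and it says nothing about which archive they were downloaded from.

```stata
* Report the location and header lines of each community-contributed command.
* capture noisily shows any error but carries on if a command is not installed.
foreach cmd in tab3way mylabels splitvallabels catplot {
    capture noisily which `cmd'
}
```

    The header lines usually name the author and version, which is a reasonable starting point for stating provenance in a post.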

    Three bigger deals here:

    1. You don't give a data example which would allow your problem to be reproduced. With a combination of mechanical engineering and brain surgery I could make your table into a dataset.

    2. If you want catplot to show the percents for some breakdown other than what it shows, you need to supply that as a weight somehow.

    3. If numeracy is constant in the data supplied to your graph, it need not also be a variable in the command.

    All that said, I played with catplot (SSC) half-heartedly before switching to tabplot (Stata Journal). I am the author of both, so I trust no-one will be offended if I say that tabplot is the better and more useful command. Above all, it avoids the greatly over-sold and over-used stacking of bars.

    Also, it seems to me that only one overall analysis makes much sense: to regard country and gender as descriptive or predictor variables and numeracy as the response or outcome.

    Further, I am not completely sure which way your scale runs. Is 5 good or poor? Either way, the axis here can be reversed if it's the wrong way round.

    PS putting text at 45 degrees is an indicator of design failure....

    Here's my code and first different graph suggestion:

    Code:
    clear
    input Numeracy fAustria0 fAustria1 fGermany0 fGermany1 fSweden0 fSweden1 fNetherlands0 fNetherlands1
    1  177 369 277 525 127 329 175 366
    2  73 209 201 434 128 336 141 368
    3  509 888 930 1295 1172 1541 797 1384
    4  1355 1846 1959 2108 1221 1442 1115 1215
    5  859 855 1272 914 1266 768 1404 954
    end
    
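    * Stack the eight frequency columns: one row per numeracy level, country and gender
    reshape long f, i(Numeracy) j(which) string
    * which holds e.g. "Austria0": the last character is the gender code, the rest is the country
    gen country = substr(which, 1, length(which) - 1)
    gen female = real(substr(which, -1, 1))
    drop which
    label def female 0 Male 1 Female
    label val female female
    label var f "Frequency"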
    
    set scheme s1color
    
    catplot female country if Numeracy == 5 [fw=f], recast(bar) bar(1, fcolor(green*0.2)) name(G1, replace)
    
    tabplot Numeracy female [fw=f], by(country, compact row(1) note("")) xtitle("") ///
    percent(country female) showval name(G2, replace) subtitle(, fcolor(none)) ///
    separate(female) bar1(blcolor(blue) bfcolor(blue*0.2)) ///
                     bar2(blcolor(red) bfcolor(red*0.2))
    [Figure: numeracy.png]


    Still, that's mostly me playing. Here are a few steps towards what I think you want:


    Code:
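    * Denominator: total frequency within each country-gender column
    egen den = total(f), by(country female)
    * Numerator: frequency at each numeracy level within that column
    egen num = total(f), by(country Numeracy female)
    * Column percentage, supplied to tabplot as an importance weight
    gen pc = 100 * num/den
    tabplot country female [iw=pc] if Numeracy==5 , showval(format(%2.1f)) ///
    xtitle("") ytitle("") subtitle(% at level 5) ///
    separate(female) bar1(blcolor(blue) bfcolor(blue*0.2))  bar2(blcolor(red) bfcolor(red*0.2))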
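
    As a sanity check on the column percentages, the following self-contained sketch rebuilds the frequencies from the table in #1 and confirms that the computed values at level 5 agree with the colpct figures quoted there (e.g. 28.89 and 20.52 for Austria). The variable name check and the tolerances are my own choices for illustration; the tolerances allow for rounding in the table and for float precision.

```stata
clear
input Numeracy fAustria0 fAustria1 fGermany0 fGermany1 fSweden0 fSweden1 fNetherlands0 fNetherlands1
1  177 369 277 525 127 329 175 366
2  73 209 201 434 128 336 141 368
3  509 888 930 1295 1172 1541 797 1384
4  1355 1846 1959 2108 1221 1442 1115 1215
5  859 855 1272 914 1266 768 1404 954
end

* Long layout: one row per numeracy level, country and gender
reshape long f, i(Numeracy) j(which) string
gen country = substr(which, 1, strlen(which) - 1)
gen female = real(substr(which, -1, 1))
drop which

* Column percentage: share of each numeracy level within country and gender
egen den = total(f), by(country female)
gen pc = 100 * f / den

* Percentages within each country-gender column should sum to 100
egen check = total(pc), by(country female)
assert abs(check - 100) < 1e-3

* Spot checks against the table (values there are rounded to 2 decimals)
assert abs(pc - 28.89) < 0.005 if country == "Austria" & female == 0 & Numeracy == 5
assert abs(pc - 20.52) < 0.005 if country == "Austria" & female == 1 & Numeracy == 5

* Show the eight values that the graph displays
format pc %5.2f
list country female pc if Numeracy == 5, sepby(country) noobs
```

    If either assert fails, the percentages being plotted are not the column percentages from the table, so the by() variables of the egen call are the first thing to inspect.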

    [Figure: numeracy2.png]

    Last edited by Nick Cox; 19 Dec 2018, 12:05.
