  • Graphing response percentages by race and gender in a bar chart

    Hi All,

    I want to create a graph that shows, within each combination of race and gender, what percentage of subjects responded "Totally Untrue," "Mostly Untrue," or "Somewhat" (i.e., option 1, 2, or 3) to item1. Thus, each bar could in principle reach 100% on its own: for example, if all white females responded 1, 2, or 3 to item1, that bar would reach 100% on the graph.

    Here is the code I used to create a dataset that mimics my real dataset, as well as the code I used to create the graph:

    Code:
    clear
    set seed 123456
    set obs 2000
    
    gen item1 = int(uniform() * 5) + 1
    
    gen male = int(uniform() * 2)
    
    gen race = int(uniform() * 4) + 1
    
    label var item1 "I love science."
    
    label define anslblitem1  1 "Totally Untrue" 2 "Mostly Untrue" 3 "Somewhat" ///
      4 "Mostly True" 5 "Totally True"
    
    label values item1 anslblitem1
    
    label define racelbl 1 "White" 2 "Black" 3 "Hispanic" 4 "Asian"
    label values race racelbl
    
    label define malelbl 1 "Males" 0 "Females"
    label values male malelbl
    
    graph bar if item1 < 4 & item1!=., over(male) over(race) asyvars
    This is the graph I get:
    [Image: Screen Shot 2017-01-18 at 2.54.26 PM.png]

    Which looks exactly how I want it to, but the percentages don't seem to represent what I want them to represent. It seems that the graph is showing what percentage of the larger group of subjects who responded "Totally Untrue," "Mostly Untrue," or "Somewhat" are White females, etc.

    Previously, I had tried to create an indicator variable denoting whether a respondent was below 4 on item1, something like:

    Code:
    gen item1_low = 1 if item1 < 4 & item1!=.
    But I couldn't find a way to create a graph to my liking using this variable.

    I'm pretty sure I'm missing something obvious, but I've been in the weeds on this for a while and thought I could use a fresh (and highly qualified) perspective.

    Thanks in advance!

    -Jake

  • #2
    Are the percents of 1, 2, or 3 to be calculated from the total of 1, 2, 3 or from the total of 1, 2, 3, 4, 5 or that of 1, 2, 3, 4, 5 and missing? (You don't seem to have missings in your example data, but that may not be true in your real data).
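
    To make the three candidate denominators concrete, here is a minimal sketch on simulated data. The variable names low_nonmiss and low_all and the 5% missingness rate are assumptions for illustration only; they are not part of the original dataset.

    ```stata
    clear
    set seed 123456
    set obs 2000
    gen item1 = int(runiform() * 5) + 1
    replace item1 = . if runiform() < 0.05   // assumed missingness, illustration only
    gen male = int(runiform() * 2)
    gen race = int(runiform() * 4) + 1

    * Denominator 1-5 (missing excluded): indicator is missing when item1 is missing
    gen byte low_nonmiss = (item1 < 4) if !missing(item1)

    * Denominator 1-5 plus missing: a missing item1 counts as "not low"
    * (item1 < 4 is false for missing, because missing sorts above every number)
    gen byte low_all = (item1 < 4)

    * Denominator 1-3 only: everyone left answered 1, 2, or 3, so each cell is 100%

    * Cell means are proportions; multiply by 100 for percentages
    tabulate race male, summarize(low_nonmiss) nostandard nofreq
    tabulate race male, summarize(low_all) nostandard nofreq
    ```

    The two tables differ only to the extent that a cell contains missing responses, which is why the choice of denominator matters for the real data.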

    • #3
      Good question. There are missing values in the dataset. I tried to simplify my code to create the test dataset and seem to have cut out the part where I generated missing values for item1.

      However, what I'm looking to show with the White/female bar, for example, is the following: among white females with nonmissing data for item1, what percentage responded 1, 2, or 3 (in other words, did not respond 4 or 5)? Responses 1, 2, and 3 are collapsed into one bucket; sorry that was not clear in my initial post (or at least would not have been the intuitive reading). Does that help clarify?

      (The same applies to all race/gender combinations; I just thought sticking with one example would make it easier.)
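
      One way to get that quantity, sketched here rather than offered as the definitive approach, is to code the indicator as 0/100 instead of 1/missing and plot its mean within each race/gender cell. The name item1_lowpct, the axis title, and the 5% missingness rate are assumptions made for this example.

      ```stata
      * Simulated data as in post #1, plus assumed 5% missingness (illustration only)
      clear
      set seed 123456
      set obs 2000
      gen item1 = int(runiform() * 5) + 1
      replace item1 = . if runiform() < 0.05
      gen male = int(runiform() * 2)
      gen race = int(runiform() * 4) + 1

      * 100 if response is 1, 2, or 3; 0 if 4 or 5; missing if item1 is missing.
      * (An indicator coded 1/missing would have a mean of 1 in every cell.)
      gen item1_lowpct = 100 * (item1 < 4) if !missing(item1)

      * The mean of a 0/100 indicator within a cell is the percentage responding
      * 1-3 among that cell's nonmissing respondents
      graph bar (mean) item1_lowpct, over(male) over(race) asyvars ///
          ytitle("Percent responding 1, 2, or 3")
      ```

      Because the indicator is missing wherever item1 is missing, those respondents drop out of both the numerator and the denominator. The value labels defined in post #1 can be attached to male and race to label the bars.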




      • #4
        Using some other online resources and playing around with things, I believe I have achieved my goal.

        I ended up limiting my focus to response options 1 and 2 (so those respondents with nonmissing data who did not respond 3, 4, or 5 to each item).

        Item response options are as follows: 1 "Never" 2 "Usually Not" 3 "Sometimes" 4 "Usually" 5 "Always"

        Here is the code I used, in case it could help someone else with a similar aim. I like to use foreach statements so I can graph all of the items with the same response options and polarity at once.

        Code:
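        foreach x of varlist item1 item2 item3 {
            preserve
            * keep only respondents with nonmissing data for this item
            drop if missing(`x')
            * flag responses 1 or 2
            gen `x'_low = 1 if `x' < 3
            * number of nonmissing respondents in each gender/race cell
            bysort male race: gen tot = _N
            * one row per response value within each cell
            collapse (mean) tot (sum) `x'_low, by(`x' male race)
            * number of low responses in each cell, then the percentage
            bysort male race: egen `x'_lowtot = total(`x'_low)
            gen `x'_lowpct = (`x'_lowtot / tot) * 100
            graph bar `x'_lowpct if `x' < 3, ///
                over(male) over(race) asyvars ///
                ytitle("Student Response Percentage")
            restore
        }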
        Here is an example of what it produces:

        [Image: Screen Shot 2017-01-19 at 12.27.37 AM.png]

        Best,
        Jake
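
        For anyone adapting this, a shorter variant of the loop is sketched below. It skips preserve and collapse by plotting the mean of a 0/100 indicator, and it uses name() so that each item's graph stays in memory instead of replacing the previous one. It assumes items coded 1 to 5 with male and race defined as elsewhere in this thread; the simulated data, the _lowpct variable suffix, and the graph names are illustrative assumptions.

        ```stata
        * Simulated stand-in data (illustration only)
        clear
        set seed 123456
        set obs 2000
        gen male = int(runiform() * 2)
        gen race = int(runiform() * 4) + 1
        forvalues i = 1/3 {
            gen item`i' = int(runiform() * 5) + 1
        }

        foreach x of varlist item1 item2 item3 {
            * 100 if response is 1 or 2, 0 if 3-5, missing if the item is missing
            gen `x'_lowpct = 100 * (`x' < 3) if !missing(`x')
            * cell mean of the 0/100 indicator = percentage responding 1 or 2
            graph bar (mean) `x'_lowpct, over(male) over(race) asyvars ///
                ytitle("Student Response Percentage") ///
                name(`x'_low, replace)
        }
        ```

        Unlike the collapse version, this leaves the new _lowpct variables in the dataset, so drop them afterwards if they are not wanted.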
