
  • Frequency bar graph for multiple binary variables

    I have a string variable (Reasons_Inadmissible) with multiple values that are comma separated. It lists all of the reasons why a case is inadmissible; multiple reasons can be listed in one cell. For example, one cell will read HRM, another will read HRM, LAW, another will read LAW, etc. Other values include REP, LON, OTE and others. I am trying to produce a frequency bar graph which indicates the number of cases where Reasons_Inadmissible is HRM, the number of cases where Reasons_Inadmissible is LAW, etc. In other words, I want a bar chart to indicate the number of cases that invoked a particular reason. Individual cases can be double-counted in this bar chart. I tried:

    catplot Reasons_Inadmissible

    But it produces a bar chart with a bar for every combination that occurs (i.e. HRM; HRM, LAW; HRM, OTE; LAW; OTE). Instead, I want each bar to represent the number of cases in which a single reason was listed (just HRM, LAW, REP, OTE, each counted individually).

    One option would be to generate a series of binary variables, each equal to 1 when the corresponding value appears in Reasons_Inadmissible. For example:
    gen HRC = 1 if strpos(Reasons_Inadmissible,"HRC")

    gen LAW = 1 if strpos(Reasons_Inadmissible,"LAW")

    However, in that case, how do I generate a bar graph, where the bars are counts of the different binary variables? Alternatively, is there another way you would recommend generating this graph from a comma-separated categorical variable?

    Thanks!

    Erica

  • #2
    catplot is from SSC, as you are asked to explain (FAQ Advice #12).

    This example makes use of tabm from tab_chi (SSC).

    Code:
    clear 
    set obs 1
    generate str whatever = "frog, toad, newt" in 1
    set obs 2
    replace whatever = "frog, toad" in 2
    set obs 3
    replace whatever = "newt" in 3
    set obs 4
    replace whatever = "dragon" in 4
    set obs 5
    replace whatever = "frog, dragon" in 5
    split whatever, parse(,) 
    tabm whatever?, replace transpose 
    rename _values answer
    replace answer = trim(answer) 
    set scheme s1color 
    graph hbar, over(answer, sort(1) descending)
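
    For the indicator-variable route raised in #1, here is a minimal sketch using official commands only. It assumes the variable is named Reasons_Inadmissible as in #1, that HRM, LAW, REP and OTE are the codes of interest (add others to the list as needed), and that no code occurs inside a longer code, since strpos() matches substrings.

    Code:
    * one 0/1 indicator per reason code (list of codes is an assumption)
    foreach r in HRM LAW REP OTE {
        generate byte `r' = strpos(Reasons_Inadmissible, "`r'") > 0
    }
    * summing a 0/1 indicator counts the cases that mention that reason
    graph hbar (sum) HRM LAW REP OTE, ascategory

    Because each case contributes to every indicator it matches, cases listing several reasons are counted once per reason, which is what #1 asks for.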


    • #3
      Thank you so much for your response. I have been reading the documentation for tab_chi and tabm (SSC), but am not sure how to proceed with my example. At the most basic level, I'm not sure where the data comes from given that the first step indicated is "clear".


      • #4
        Sorry, but I don't know precisely what you're asking here. The code in #2 is self-contained and works. You can add any number of list or edit or other commands to see what is happening at each stage.

        I couldn't use your data because you didn't provide an example and it was easier for me to make one up. If you're struggling with what to do, your code should start at the split command with your own variable name. Also, if you give a real or realistic data example, then the code can be adapted to show how it works with your data.
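
        As a sketch of that adaptation, the code in #2 from split onwards might look like this with the variable name from #1. It assumes tab_chi is installed from SSC and that no case lists more than nine reasons, since ? matches a single character. Note that tabm with the replace option overwrites the data in memory, hence preserve.

        Code:
        preserve
        * split the comma-separated string into Reasons_Inadmissible1, Reasons_Inadmissible2, ...
        split Reasons_Inadmissible, parse(,)
        tabm Reasons_Inadmissible?, replace transpose
        rename _values answer
        * remove the spaces left after each comma
        replace answer = trim(answer)
        graph hbar, over(answer, sort(1) descending)
        restore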

        Please do read https://www.statalist.org/forums/help#stata for what we mean by a data example.
        Last edited by Nick Cox; 25 Feb 2019, 12:35.
