  • Dot plot for two categorical variables

    Hello, I have the following dataset:

    clear
    input str1 item year1 year2 year3 year4 year5
    "A" 1 0 0 0 1
    "B" 0 0 0 1 1
    "C" 1 1 1 1 1
    "D" 1 0 0 0 0
    "E" 1 1 1 0 0
    end

    The data show that, for instance, item A occurs in year 1 and year 5 but not in years 2, 3 and 4.

    I would like to show this in a graph where the item variable is on the y-axis and the year1, year2, year3, year4 and year5 variables are on the x-axis. For item A, for instance, there would be a marker for year 1 and year 5 but no marker for years 2, 3 and 4.

    Ideally, the y-axis would be sorted according to the sum of each item's occurrences, so item D would be at the bottom end of the y-axis and item C at the top end.

    I have tried graph dot and twoway dot with multiple versions of reshaped data but I cannot produce the graph I want.

    Any suggestions? Thank you!

  • #2
    Here's one approach:

    Code:
    clear
    input str1 item year1 year2 year3 year4 year5
    "A" 1 0 0 0 1
    "B" 0 0 0 1 1
    "C" 1 1 1 1 1
    "D" 1 0 0 0 0
    "E" 1 1 1 0 0
    end
    
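    * free up the name "year" so that reshape can use it for the j() variable
    rename (year*) (whatever*)
    reshape long whatever, i(item) j(year)
    
    * number of years in which each item occurs
    egen count = total(whatever), by(item)
    
    * rank items by count, breaking ties alphabetically by item
    egen item2 = group(count item)
    
    * labmask is from the Stata Journal and must be installed before you can use it
    labmask item2, values(item)
    
    * leaves the largest value of item2 in r(max)
    su item2, meanonly
    
    set scheme s1color
    * plot a marker only where the item occurs (whatever is 1)
    scatter item2 year if whatever, yla(1/`r(max)', valuelabel ang(h) noticks) ytitle(item)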

    [Figure: scatter_dot.png]


    • #3
      This crossed with (and is essentially the same as) the solution from Nick. Mine requires egenmore from SSC.
      Code:
      clear
      input str1 item year1 year2 year3 year4 year5
      "A" 1 0 0 0 1
      "B" 0 0 0 1 1
      "C" 1 1 1 1 1
      "D" 1 0 0 0 0
      "E" 1 1 1 0 0
      end
      
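      * reshape to long; the 0/1 indicator is first called year, so rename afterwards
      reshape long year, i(item) j(yearx)
      rename year present
      rename yearx year
      
      * number of years in which each item occurs
      bys item: egen total = total(present)
      * axis() is from egenmore: orders by total, then item, and labels with item
      egen y = axis(total item), label(item)
      scatter y year if present==1 , ylab(1/5, value)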
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------



      • #4
        Oddly enough, axis() is an egen function I wrote, but now never use. No matter either way.

        More interesting is the close similarity of the solutions. The key here is to notice that you really need a different data structure (long rather than wide) to make this easy, although more difficult solutions are possible too.
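
        To illustrate what a "more difficult" route might look like, here is a rough sketch that keeps the wide layout and overlays one scatter per year. It is an illustration only, not code posted in this thread: the variable names x1-x5, the local macro plots, and the marker options are assumptions made for the example, and the value label is defined by hand in a loop so that nothing extra needs to be installed.

        ```stata
        clear
        input str1 item year1 year2 year3 year4 year5
        "A" 1 0 0 0 1
        "B" 0 0 0 1 1
        "C" 1 1 1 1 1
        "D" 1 0 0 0 0
        "E" 1 1 1 0 0
        end

        * number of occurrences per item, computed across the wide variables
        egen count = rowtotal(year1-year5)

        * rank items by count, breaking ties alphabetically by item
        egen item2 = group(count item)

        * attach the item letters as value labels (done by hand here, not with labmask)
        forvalues i = 1/`=_N' {
            label define item2 `=item2[`i']' "`=item[`i']'", add
        }
        label values item2 item2

        * hypothetical helper variables: x`j' holds the year number where the
        * item occurs and is missing otherwise, so no marker is drawn there
        forvalues j = 1/5 {
            generate x`j' = `j' if year`j' == 1
        }

        * build one scatter per year, all with the same marker, and overlay them
        local plots
        forvalues j = 1/5 {
            local plots `plots' (scatter item2 x`j', mcolor(navy) msymbol(O))
        }
        twoway `plots', ylabel(1/5, valuelabel angle(h) noticks) ytitle(item) xtitle(year) legend(off)
        ```

        Run it in one go from a do-file, because the local macro plots has to survive from the loop to the twoway call. Five helper variables and five overlaid plots do the work of a single reshape, which is why the long layout is the easier choice.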
