
  • Creating a graphic representation of bivariate frequency distribution (crosstab)

    I hope someone out there can help me. I am trying to find a way to graphically represent a bivariate relationship that is a little more nuanced than a simple scatterplot or line graph.
    What I am trying to accomplish is a graph with 3 components (x,y,z) that essentially would visually represent the information in a crosstab, where x would be variable one, y variable two, and z would be the frequency of cases occurring in any given combination of x and y.

    The only two graphs I have found in Stata that support three elements are the contour heat map or a weighted scatterplot. The problem with either is that the z component of contour (or weight of scatterplot) has to be a separate variable. Most of the variables that I want to graph have a range of 100, which creates 10,000 possible combinations. Hence, I cannot easily create a new variable to solve my problem.

    Does anyone know of a graphing option within Stata, a downloadable ado, or even a third-party graphing tool that can intuitively graph a bivariate frequency distribution, giving me the nuanced bivariate graph I am looking for?

  • #2
    I don't think of it as "nuanced". I would not be surprised if it were called "naive". But tabplot (SSC) is one candidate here, which I have been tinkering with on and off since 1999.

    Code:
    * tabplot is user-written; install it once with: ssc install tabplot
    sysuse auto, clear
    tabplot foreign rep78, showval
    Whether it works well in your view for 100 rows and 100 columns is another matter. You'd need to suppress the showval option.

    More importantly, your statement

    "Most of the variables that I want to graph have a range of 100, which creates 10,000 possible combinations. Hence, I cannot easily create a new variable to solve my problem."

    appears quite wrong to me. Here we go:

    Code:
    bysort x y : gen z = _N
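
    A minimal sketch of how that frequency variable could then feed either of the graphs mentioned in #1. The data are simulated, and the seed, the sample size, and the variable name pickone are illustrative assumptions, not part of the original question:

    ```stata
    clear
    set seed 12345
    set obs 5000
    * hypothetical data: two integer variables taking values 1 to 100
    gen x = floor(100 * runiform()) + 1
    gen y = min(100, max(1, round(x + rnormal(0, 15))))

    * frequency of each observed (x, y) combination
    bysort x y : gen z = _N

    * flag one observation per combination so each cell is drawn once
    egen pickone = tag(x y)

    * weighted scatterplot: marker size reflects frequency
    twoway scatter y x [aweight = z] if pickone, msymbol(Oh)

    * contour plot of frequency over the x-y plane
    twoway contour z y x if pickone
    ```

    Combinations that never occur have no observation in the data, so they are missing rather than zero; contour may interpolate across such gaps.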



    • #3
      Thank you Nick. tabplot is a useful option to have if I collapse my data a bit, and your code for creating a bivariate frequency variable is spot on and much appreciated.
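
      One way the collapsing step might look, as a rough sketch: it assumes x and y are integers from 1 to 100 (simulated here), and the bin width of 10 and the names xbin and ybin are arbitrary choices for illustration.

      ```stata
      * tabplot is user-written; install it once with: ssc install tabplot
      clear
      set seed 12345
      set obs 5000
      * hypothetical data standing in for the real variables
      gen x = floor(100 * runiform()) + 1
      gen y = min(100, max(1, round(x + rnormal(0, 15))))

      * collapse each variable into 10 bins, labelled by the bin's upper limit
      gen xbin = 10 * ceil(x / 10)
      gen ybin = 10 * ceil(y / 10)

      * 10 x 10 table plot; the first variable defines rows, the second columns
      tabplot ybin xbin
      ```

      With 100 cells rather than 10,000, adding the showval option becomes feasible again.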



      • #4
        Matthew,

        In addition to contour, you can also try surface (from SSC). I am presenting at the Stata Conference in late July on an interpolation pre-processor for surface that might also be of interest. I can send you the beta version if you are interested.

        Regards,
        Joe
