  • Graphing trends for only a handful of cases per group?

    Hi all,

    I've got a dataset with a three level hierarchical structure that is giving me some trouble. You can see the general idea here:


    Code:
    webuse productivity
    graph twoway (scatter gsp year, connect(ascending)) if region<=2, by(region)
    In this case, there are several lines in each panel, each corresponding to a state that is nested within a region of the USA. How do I make Stata plot only the first two states in each region instead of all of them?


    ----

    For reference, the real case I'm working with is schools with students, each of whom has multiple observations. Some schools have a hundred-plus students, so the graphs get messy. I'm trying to look at a handful of students' trends for each school, side by side, instead of all the students per school.

    Thank you for any ideas!




  • #2
    Code:
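    * within each region, count distinct states in sorted order and flag the first two
    bys region (state): gen tag = sum(state != state[_n-1]) <= 2
    * plot only the flagged states, one panel per region
    twoway (scatter gsp year if tag, connect(ascending)), by(region)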



    • #3
      Thank you so much, I was able to solve my problem using that tagging command. I had tried something similar but this is much better than my initial attempts!



      • #4
        This problem arises in many forms. The serious question is how to select the students you show. Sometimes a fair answer is to select students at or near the minimum, median and quartiles, and maximum on some criterion of interest. But the devil is in the details.
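
        A minimal sketch of that idea on the productivity data from #1, assuming the criterion of interest is each state's mean gsp over time (the criterion and the variable names mean_gsp, first, rank1, rank, nstates and pick are illustrative choices, not part of the dataset). It shows the states ranked lowest, middle and highest within each region; quartiles could be added to the inlist() in the same way.

        Code:
        webuse productivity, clear
        * assumed criterion: mean gsp per state
        bysort state: egen mean_gsp = mean(gsp)
        * flag one observation per state so each state is ranked once
        egen byte first = tag(state)
        * rank states within region on the criterion, then copy the rank to all years
        bysort region: egen rank1 = rank(mean_gsp) if first, unique
        bysort state: egen rank = max(rank1)
        * number of states in each region
        bysort region: egen nstates = total(first)
        * keep the lowest, the middle and the highest ranked state
        gen byte pick = inlist(rank, 1, ceil(nstates/2), nstates)
        * restore year order within state so connect(ascending) breaks lines between states
        sort state year
        twoway (scatter gsp year if pick, connect(ascending)), by(region)

        This relies on state codes being unique across regions. With student identifiers that repeat across schools, the by-groups would need both the school and the student variable.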



        • #5
          I understand and agree with that point, Nick. Some day I'll probably dig into this more by randomizing the order of the students first, so that the first few students picked for each school aren't a biased selection. But there is also value in being more selective and focusing on interesting cases instead of random ones.
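
          For the record, a rough sketch of the randomized version on the productivity data, reusing the counting trick from #2. The seed value and the variable names first, u and pick are arbitrary choices for illustration.

          Code:
          webuse productivity, clear
          * fix the seed so the random selection can be reproduced
          set seed 12345
          * draw one random number per state and copy it to all of that state's years
          egen byte first = tag(state)
          gen double u = runiform() if first
          bysort state (u): replace u = u[1]
          * order states randomly within region and flag the first two, as in #2
          bysort region (u state): gen byte pick = sum(state != state[_n-1]) <= 2
          * restore year order within state before connecting points
          sort state year
          twoway (scatter gsp year if pick, connect(ascending)), by(region)

          Changing the seed gives a different pair of states per region, which is an easy way to flip through several random selections.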

