
  • Select cases for plot

    I am fitting a mixed model to longitudinal data with 13 time points. There are over 90,000 observations, so the line graph is nearly impossible to interpret. Is it possible to select a set number of cases so that it is easier to read?

  • #2
    If I understand this correctly, you want a random subset of panels (cases = observations in my experience) to reduce a spaghetti plot to something with more evident structure.

    Here is one technique.

    Code:
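    webuse grunfeld, clear

    * with no arguments, xtset reports the current panel settings
    xtset

    * spaghetti plot of all panels
    xtline invest, overlay ysc(log)

    * make the random selection reproducible
    set seed 31459

    * flag each panel with probability 0.5, using its 1935 observation only
    gen select = cond(year == 1935, runiform() < 0.5, 0)

    * copy the flag to every year of the same panel (1 sorts last within panel)
    bysort company (select) : replace select = select[_N]

    * plot only the selected panels
    xtline invest if select, overlay ysc(log)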
    There are 10 companies: I asked for a 50% sample for one year and then extended the selection to all years for the same panels. My code produced a plot with just 5 companies. With different random numbers a different number would not have been surprising.

    The code ignores details specific to the dataset, such as getting better axis labels -- except that ysc(log) is used as natural for that data: it may be useless or even inapplicable for your data; I mention it only because it might be helpful.

    Your numbers will be different and you are likely to need a much smaller probability of selection.

    A refinement on the idea: say 40 panels shown in a 2 x 2 array of graphs, with say 10 panels in each, might work well enough to give flavour.
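
    The refinement could be sketched roughly as follows. This is only an illustration under stated assumptions: the Grunfeld data have just 10 companies, so the sketch keeps 8 panels in 4 groups of 2, and the variable names u, rank and group are made up here. It also selects an exact number of panels rather than relying on a selection probability. For 40 panels in 4 groups of 10, the analogous conditions would be rank <= 40 and ceil(rank/10).

    ```stata
    webuse grunfeld, clear
    set seed 31459

    * one random number per panel: draw it in 1935, then copy it to all years
    * (missing values sort last, so u[1] is the drawn value)
    gen double u = runiform() if year == 1935
    bysort company (u) : replace u = u[1]

    * rank the panels by their random number: 1, 2, ... in random order
    egen rank = group(u)

    * keep the first 8 panels and deal them into 4 groups of 2 (assumed sizes)
    gen group = ceil(rank/2) if rank <= 8

    * connect(L) starts a new line whenever year stops increasing,
    * so each panel is drawn as its own line after sorting
    sort company year
    twoway line invest year if group < ., connect(L) by(group, note("")) yscale(log)
    ```

    As with the earlier plot, yscale(log) suits these data but may not suit yours.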
