
  • Graphing Medians Panel Data

    Hello,
    I'd like to graph the median values of a variable. The data I have looks like this, and I would like to graph median GPA over time by race and by school. There are no unique identifiers for individuals. The survey design is a rotating panel.
    weight year age school race gpa
    100 1910 5 elementary white 3
    200 1910 10 elementary black 2
    300 1910 12 middle other 4
    400 1911 14 middle black 3
    100 1911 18 high black 4
    300 1911 32 no college other 2
    The way I've done it so far is like this, but I am not sure if it was okay to do so. tsline is for time-series data, and this is panel data.
    Code:
    gen id = _n
    
    xtset id year
    
    preserve 
    collapse (median) wealth [weight=weight], by(year race school)
    graph twoway tsline wealth, by(race school)
    I could use xtline (with the option "overlay") instead, but I still don't know if it was appropriate to use and create an "id". What should I do? Thank you!

  • #2
    Where it says "wealth" it should say "gpa" instead (not sure how to edit the post).
    Code:
    collapse (median) gpa [weight=weight], by(year race school)
    graph twoway tsline gpa, by(race school)



    • #3
      So, what's the question precisely? I guess you're getting at least 12 lines in principle (at least 3 races and 4 distinct values of school), which could be a mess, but what do you seek from us?
      Last edited by Nick Cox; 23 Oct 2021, 07:52.



      • #4
        Hi Nick,
        Thank you for your response.

        Here are the questions:
        1. Econometrically speaking, is it okay to assign IDs like that even though there might actually be repeated individuals? (The panel itself is a rotating panel, meaning that the individuals do get surveyed again.)
        2. If (1) is yes and this is okay, can I use the "tsline" command even though it's panel data rather than time series data?
        3. If (1) is yes, how do I get all 12 lines onto one graph, perhaps color-coding by race?

        4. If (1) is no, would the following code work?

        Code:
        bysort year race school: egen median_gpa = median(gpa)
        graph twoway line median_gpa year [weight=weight], by(race school)
        The graphs for (1) and (4) do look different, so I'm worried I'm doing something wrong.

        5. The best question is thus: What is the best way to graph median GPA by race and education over years?


        Thank you!
        Edit: and 6.: am I weighting correctly?
        Last edited by John Singer; 23 Oct 2021, 08:17.



        • #5
          Does econometrically speaking differ here from statistically speaking?

          A median over observations could well differ from a median over individuals. Your choices!

          egen does not support weights, so it is useless here for the medians if you have weights. Putting weights in the line command won’t correct that neglect.

          I think all you’ve told us about weighting is that you have a variable that is a weight, so I don’t see how we can comment on the correctness of what you are doing.


          tsline won’t match your data structure, I think.

          Several lines on one graph is a standard application of line, but the separate command will help.
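
          A minimal sketch of that approach, assuming the variable names from post #1 (year, race, school, gpa, weight), that the weights can be treated as analytic weights (the thread does not say what kind they are), and that there are fewer than 10 race categories. It uses collapse for the weighted medians, because collapse accepts weights, and separate to give each race its own variable and hence its own line:

          ```stata
          * Sketch only. Assumptions: variables named as in post #1;
          * weight treated as an analytic weight, which the thread does not confirm.
          preserve

          * One weighted median of gpa per year-race-school cell
          collapse (median) gpa [aweight=weight], by(year race school)

          * One new variable per race (gpa1, gpa2, ...), so that each race
          * is drawn as its own line with its own colour
          separate gpa, by(race) veryshortlabel

          * gpa? matches gpa1 to gpa9 only (assumes fewer than 10 races);
          * one panel per school, lines sorted on year
          twoway line gpa? year, sort by(school)

          restore
          ```

          Dropping by(school) would put every line into a single panel, but the school groups would then be joined together, so one panel per school is likely the more readable choice. No id variable or xtset is needed for this.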



          • #6
            Thank you! I decided on the "graph twoway line" option over tsline since it doesn't require me to declare the data to be a panel. I don't have any more information on the weights (the dataset is new to me), so I've just been browsing around to learn how to deal with them (for example, I'm looking at the svyset command).
