
  • Missing values and sample selection

    I have data on 300,000 graduates and am running an ordered probit regression to see how the degree class achieved relates to various social background characteristics. There are many missing values in my dataset, and I would like to test whether the probability of a value being missing can be predicted by my main variables, e.g. are people more likely to have a missing value if they attend a less selective university? I am unsure how to look for such sample selection in Stata.

    Thank you.

  • #2
    Create new 0/1 variables indicating missing values. For example:

    Code:
    gen sex_missing = missing(sex)
    gen age_missing = missing(age)
    etc. Then you can look for associations between those new variables and whatever variable identifies the selectivity of their university, using some appropriate model (perhaps a cross tab, or a regression of some kind--you don't say how you operationalized "selective university").
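
    As a minimal sketch of the cross-tab option, assuming selectivity is stored as a single 0/1 variable: the toy data below are simulated purely for illustration, and the names selective and sex and the missingness rates are hypothetical, not taken from the actual dataset.

    ```stata
    * Toy data only: variable names, sample size and rates are hypothetical
    clear
    set seed 12345
    set obs 1000
    gen byte selective = runiform() < 0.3
    gen sex = runiform() < 0.5
    * Assumed pattern: sex is missing more often at non-selective universities
    replace sex = . if runiform() < cond(selective, 0.05, 0.20)

    * 0/1 indicator for a missing value, as suggested above
    gen byte sex_missing = missing(sex)

    * Column percentages show the share missing within each selectivity group;
    * chi2 requests a Pearson chi-squared test of association
    tabulate sex_missing selective, column chi2
    ```

    With real data, only the gen ... = missing(...) line and the tabulate line would be needed.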


    • #3
      Hi Clyde,

      Thank you for your response.

      Within my ordered probit model I have various dummy variables representing certain characteristics, e.g. social class, ethnicity, etc., and I have also created three dummy variables for university selectiveness: "oxbridge", "selective" and "other". Would it be appropriate to regress, say, missing(social class) on all the relevant variables in my model?

      Thank you,

      Abigail


      • #4
        Yes.
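
        A minimal sketch of what such a regression might look like, using a probit on a 0/1 missingness indicator. The data are simulated for illustration only; oxbridge and selective follow the dummies described in #3, while social_class, female, nonwhite and all the rates are hypothetical stand-ins.

        ```stata
        * Toy data only: variable names, sample size and rates are hypothetical
        clear
        set seed 2024
        set obs 5000
        gen byte oxbridge  = runiform() < 0.10
        gen byte selective = !oxbridge & (runiform() < 0.30)
        gen byte female    = runiform() < 0.5
        gen byte nonwhite  = runiform() < 0.2
        gen social_class   = ceil(5 * runiform())
        * Assumed pattern: social class is missing more often at "other" universities
        replace social_class = . if runiform() < cond(oxbridge | selective, 0.05, 0.15)

        * 0/1 indicator for a missing social class
        gen byte class_missing = missing(social_class)

        * "other" is the omitted base category for university selectiveness
        probit class_missing oxbridge selective female nonwhite

        * Joint Wald test that the two selectiveness dummies have zero coefficients
        test oxbridge selective
        ```

        One caveat: Stata drops any observation with a missing value in a variable used in the estimation command, so predictors that themselves have many missing values will shrink the estimation sample.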
