  • imputing missing values based on categories of other variables


    level  typeofA  typeofB  typeofC  var
    A      x        .        .        .
    A      y        .        .        55
    B      .        p        .        10.5
    B      .        q        .        .
    B      .        r        .        20
    C      .        .        s        43.1
    C      .        .        t        .
    I have a dataset that looks something like this. There are levels, and for each level, there are different types. To impute missing values for “var”, I want to use the mean (or median) of “var” in the following way.

    If there is a missing value in var:
    if the observation belongs to level A
    • If typeofA = x, then impute with mean(var) of x
    • if typeofA = y, then impute with mean(var) of y
    • and so on
    and similarly for levels B and C.

    There are a lot of types for each level, so I don't know if I can hardcode this. Maybe I have to use a loop, but I am really lost and don’t know where to begin.
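    To make the rule above concrete, here is a minimal sketch of it in plain Python, not a definitive implementation. The column names follow the table, but the data values are hypothetical: several rows per type were added so that a group mean exists. The helpers make_row, type_of and impute_by_type are illustrative names, not from any library. No per-type hardcoding is needed, because the grouping key is built from the level and the matching typeof column.

```python
from statistics import mean, median

def make_row(level, type_value, var):
    """Build one row; only the typeof column matching the level is filled."""
    row = {"level": level, "typeofA": None, "typeofB": None, "typeofC": None, "var": var}
    row["typeof" + level] = type_value
    return row

# Hypothetical data (None marks a missing value), NOT the table from the post.
rows = [
    make_row("A", "x", 10.0),
    make_row("A", "x", None),
    make_row("A", "x", 20.0),
    make_row("A", "y", 55.0),
    make_row("B", "p", 10.5),
    make_row("B", "p", None),
    make_row("C", "s", None),   # no observed value for this type
]

def type_of(row):
    """Return the type stored in the column that corresponds to the row's level."""
    return row["typeof" + row["level"]]

def impute_by_type(rows, stat=mean):
    """Fill missing var with stat() of the observed var in the same (level, type) group.

    Groups without any observed value are left missing.
    """
    observed = {}
    for row in rows:
        if row["var"] is not None:
            observed.setdefault((row["level"], type_of(row)), []).append(row["var"])

    imputed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        key = (row["level"], type_of(row))
        if new_row["var"] is None and key in observed:
            new_row["var"] = stat(observed[key])
        imputed.append(new_row)
    return imputed

imputed_mean = impute_by_type(rows)            # group mean
imputed_median = impute_by_type(rows, median)  # group median instead
```

    A type with no observed value at all (type s in the hypothetical data) stays missing, since there is nothing to average.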

  • #2
    You need to either provide a more elaborate example or explain better what you want to achieve.

    If there is a missing value in var:
    if the observation belongs to level A
    • If typeofA = x, then impute with mean(var) of x
    In your table, there is only one observation with "x", and it has a missing value on "var". What do you mean by "impute with mean(var) of x"?
