  • Finding the average age of death in a database in which the same person appears multiple times

    Hello, I have what is most likely a basic question. I have a dataset of about 400 000 observations that contains the following information:
    ID age_of_death
    1 50
    1 50
    1 50
    2 .
    2 .
    3 60
    A missing value means that the person is alive, so their age should not enter my calculation at all. I want to calculate the average age at death, but I don't know how to use only one value for each ID and ignore the missing values. I assumed I needed to use the tag command, but I can't figure out exactly how.
    For clarity, if the data above were my data, the value I need would be: average_age_at_death = (50 + 60)/2

  • #2
    tag() is an egen function, not a command. That aside, it can be part of a solution; the other part is that summarize ignores missing values anyway.

    Code:
    clear 
    input ID    age_of_death
    1    50
    1    50
    1    50
    2    .
    2    .
    3    60
    end 
    
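    * Tag exactly one observation per ID (tag is 1 on that row, 0 elsewhere)
    egen tag = tag(ID)
    * summarize skips missing values, so people without an age of death drop out
    su age_of_death if tag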
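
    If age_of_death were not filled in on every row of a deceased person (an assumption; in the example above it is constant within ID), the tagged row could happen to be one with a missing value, and that person would silently drop out of the mean. A minimal sketch that tags only among non-missing rows, reusing the example data above, might look like this. The variable name tag_died is made up for illustration.

    Code:
    * Tag one observation per ID, choosing only among rows with a recorded age of death
    egen tag_died = tag(ID) if !missing(age_of_death)

    * Exactly one row per deceased person now enters the calculation
    su age_of_death if tag_died

    * summarize leaves the mean behind in r(mean)
    display r(mean)

    With the six example rows, only the values 50 and 60 are used, so the mean shown should be 55.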

    • #3
      Have all persons in your dataset died, or were some (hopefully many) of your respondents still alive when data collection stopped, so that their age at death is unknown?
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------
