Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Replace Missing Averages with Values from Another Cell

    I'm trying to get the average from three variables in my dataset.
    Code:
    input byte(Country Country_OLD) int year byte COW_Code_A int COW_Code_N byte WordBank_Code_A int UN_Code_N byte(Region PTS_A PTS_H PTS_S NA_Status_A NA_Status_H NA_Status_S) float(COWCODEN ccode2) long(PTSA PTSS) float pts_average
    . . 2000 .  2 . 840 . 1 . .  0 99 88 13  2 1 6         .
    . . 2001 .  2 . 840 . 2 . .  0 99 88 13  2 2 6         .
    . . 2002 .  2 . 840 . 2 . .  0 99 88 13  2 2 6         .
    . . 2003 .  2 . 840 . 2 . .  0 99 88 13  2 2 6         .
    . . 2004 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2005 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2006 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2007 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2008 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2009 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2010 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2011 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2012 .  2 . 840 . 3 . .  0 88 88 13  2 3 6         .
    . . 2013 .  2 . 840 . . 2 . 88  0 88 13  2 6 6         .
    . . 2014 .  2 . 840 . 2 2 .  0  0 88 13  2 2 6         .
    . . 2015 .  2 . 840 . 2 3 .  0  0 88 13  2 2 6         .
    . . 2016 .  2 . 840 . 2 3 .  0  0 88 13  2 2 6         .
    . . 2000 . 20 . 124 . 1 . 1  0 88  0 14 20 1 1         .
    . . 2001 . 20 . 124 . 1 . 1  0 88  0 14 20 1 1         .
    . . 2002 . 20 . 124 . 1 . 2  0 88  0 14 20 1 2         .
    . . 2003 . 20 . 124 . 1 . 1  0 88  0 14 20 1 1         .
    . . 2004 . 20 . 124 . 1 . 1  0 88  0 14 20 1 1         .
    . . 2005 . 20 . 124 . 1 . 1  0 88  0 14 20 1 1         .
    The average is coded as
    Code:
    gen pts_average=(PTS_A+PTS_S+PTS_H)/3
    Naturally, when one of the variables, either PTS_A, PTS_H, PTS_S is missing, the value for pts_average is also missing. What I'm trying to do is replace pst_average with the value of PTS_A (or PTS_S, PTS_H) if one is available, but keep it missing if all (PTS_A, PTS_S, and PTS_H) are missing. For example, the average for the first show should be 1. The average for row 15 (year 2014, COW_CODE_N=2) should be two, etc.


    I found some posts about replacing missing values with values from another cell, but I don't believe it will work given that the new variable is an average, but can be replaced with the value of another cell(s) if the average is missing.

    Many thanks in advance.

  • #2
    Hi Monica
    why dont you try something like this
    Code:
    egen pts_avg=rowtotal(PTS_A PTS_S PTS_H)
    I think it does what you want
    Best
    Fernando

    EDIT: Thank you Clyde. I have no idea why i wrote rowtotal rather than rowmean.
    Last edited by FernandoRios; 16 Feb 2019, 14:53.

    Comment


    • #3
      Code:
      egen pts_average = rowmean(PTS_A PTS_S PTS_H)
      Added: Crossed with #2 which provides the sum, rather than the mean, of the as many of the three variables as are non-missing.

      Comment


      • #4
        You can also see a similar challenge to yours on Statalist here

        Comment


        • #5
          Originally posted by Clyde Schechter View Post
          Code:
          egen pts_average = rowmean(PTS_A PTS_S PTS_H)
          Added: Crossed with #2 which provides the sum, rather than the mean, of the as many of the three variables as are non-missing.
          Ah, yes! rowmean. Thanks so much, Clyde, and to Fernando as well.

          Comment

          Working...
          X