Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dropping obsersations for a variable using specific IF conditions

    Dear STATA forum members,

    I am stuck with the following problem. I am working with country year panel data, where I have household survey data for each country. Each country can have the following values for the variable inc_con: i,I,c,C. The first two are income measures, the latter are consumption measures. Small letter means its microdata, capital letter means it is grouped data. To be consistent for each country, I would like to keep either consumption OR income data for that country, meaning that for each country I woule like to have either the values "i" and"I" OR "c" and C" for inc_con. Below I have posted an example of my data, where I refer to Armenia and Belarus as an illustrative example. Armenia has only 1 observation for "I", while Belarus has 2 observations for "I". The rest of the observations all have values of "c" for inc_con. For these countries therefore I would like to delete the observations for which inc_con="I"

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str38 country double(year mean) str1 inc_con
    "Angola" 2000 136.7 "c"
    "Angola" 2008 120.74 "c"
    "Albania" 1996 187.95 "c"
    "Albania" 2002 192.1 "c"
    "Albania" 2005 217.16 "c"
    "Albania" 2008 237.52 "c"
    "Albania" 2012 225.28 "c"
    "Argentina--Urban" 1986 863.01 "I"
    "Argentina--Urban" 1987 788.87 "I"
    "Argentina--Urban" 1991 559.91 "i"
    "Argentina--Urban" 1992 564.61 "i"
    "Argentina--Urban" 1993 566.02 "i"
    "Argentina--Urban" 1994 558.64 "i"
    "Argentina--Urban" 1995 530.86 "i"
    "Argentina--Urban" 1996 520.07 "i"
    "Argentina--Urban" 1997 544.81 "i"
    "Argentina--Urban" 1998 529.15 "i"
    "Argentina--Urban" 1999 504.49 "i"
    "Argentina--Urban" 2000 493.44 "i"
    "Argentina--Urban" 2001 454.26 "i"
    "Argentina--Urban" 2002 319.21 "i"
    "Argentina--Urban" 2003 438.71 "i"
    "Argentina--Urban" 2004 473.56 "i"
    "Argentina--Urban" 2005 523.22 "i"
    "Argentina--Urban" 2006 585.2 "i"
    "Argentina--Urban" 2007 601.48 "i"
    "Argentina--Urban" 2008 615.38 "i"
    "Argentina--Urban" 2009 640.43 "i"
    "Argentina--Urban" 2010 659.68 "i"
    "Argentina--Urban" 2011 707.31 "i"
    "Argentina--Urban" 2012 704.11 "i"
    "Argentina--Urban" 2013 705.86 "i"
    "Argentina--Urban" 2014 648.49 "i"
    "Argentina--Urban" 2016 707.36 "i"
    "Armenia" 1996 162.45 "I"
    "Armenia" 1999 122.43 "c"
    "Armenia" 2001 116.33 "c"
    "Armenia" 2002 122.03 "c"
    "Armenia" 2003 125.42 "c"
    "Armenia" 2004 155.5 "c"
    "Armenia" 2005 171.34 "c"
    "Armenia" 2006 165.38 "c"
    "Armenia" 2007 184.21 "c"
    "Armenia" 2008 188.28 "c"
    "Armenia" 2009 167.35 "c"
    "Armenia" 2010 170.84 "c"
    "Armenia" 2011 174.4 "c"
    "Armenia" 2012 180.67 "c"
    "Armenia" 2013 187.46 "c"
    "Armenia" 2014 192.27 "c"
    "Armenia" 2015 206.11 "c"
    "Armenia" 2016 216.71 "c"
    "Australia" 1981 1041.64 "i"
    "Australia" 1985 1077.66 "i"
    "Australia" 1989 1107.21 "i"
    "Australia" 1995 1019.97 "i"
    "Australia" 2001 1175.25 "i"
    "Australia" 2003 1211.29 "i"
    "Australia" 2008 1667.43 "i"
    "Australia" 2010 1661.81 "i"
    "Austria" 2003 1490.31 "i"
    "Austria" 2004 1572.74 "i"
    "Austria" 2005 1503.04 "i"
    "Austria" 2006 1538.62 "i"
    "Austria" 2007 1602.48 "i"
    "Austria" 2008 1629.29 "i"
    "Austria" 2009 1687.21 "i"
    "Austria" 2010 1704.26 "i"
    "Austria" 2011 1672.62 "i"
    "Austria" 2012 1634.39 "i"
    "Austria" 2013 1714.46 "i"
    "Austria" 2014 1688.03 "i"
    "Austria" 2015 1681.71 "i"
    "Azerbaijan" 1995 171.36 "c"
    "Azerbaijan" 2001 209.21 "c"
    "Azerbaijan" 2002 243.12 "c"
    "Azerbaijan" 2003 249.76 "c"
    "Azerbaijan" 2004 253.09 "c"
    "Azerbaijan" 2005 251.19 "c"
    "Burundi" 1992 42.8 "C"
    "Burundi" 1998 39.29 "c"
    "Burundi" 2006 47.74 "c"
    "Burundi" 2013 54.41 "c"
    "Belgium" 2003 1312.17 "i"
    "Belgium" 2004 1415.91 "i"
    "Belgium" 2005 1414.29 "i"
    "Belgium" 2006 1390.35 "i"
    "Belgium" 2007 1430.76 "i"
    "Belgium" 2008 1449.96 "i"
    "Belgium" 2009 1464.98 "i"
    "Belgium" 2010 1454.17 "i"
    "Belgium" 2011 1418.73 "i"
    "Belgium" 2012 1459.04 "i"
    "Belgium" 2013 1456.01 "i"
    "Belgium" 2014 1473.47 "i"
    "Belgium" 2015 1490.91 "i"
    "Benin" 2003 78.63 "c"
    "Benin" 2011 82.42 "c"
    "Benin" 2015 82.56 "c"
    "Burkina Faso" 1994 40.81 "c"
    "Burkina Faso" 1998 49.31 "c"
    "Burkina Faso" 2003 71.8 "c"
    "Burkina Faso" 2009 72.57 "c"
    "Burkina Faso" 2014 83.84 "c"
    "Bangladesh" 1983 80.8 "C"
    "Bangladesh" 1985 89.08 "C"
    "Bangladesh" 1988 78.44 "C"
    "Bangladesh" 1991 71.46 "c"
    "Bangladesh" 1995 86.76 "c"
    "Bangladesh" 2000 89.59 "c"
    "Bangladesh" 2005 100.01 "c"
    "Bangladesh" 2010 107.04 "c"
    "Bangladesh" 2016 117.31 "c"
    "Bulgaria" 1989 713.48 "C"
    "Bulgaria" 1992 606.6 "I"
    "Bulgaria" 1994 411.87 "C"
    "Bulgaria" 1995 381.41 "c"
    "Bulgaria" 1997 214.39 "C"
    "Bulgaria" 2001 302.72 "c"
    "Bulgaria" 2003 284.06 "c"
    "Bulgaria" 2006 315.88 "i"
    "Bulgaria" 2007 377.64 "c"
    "Bulgaria" 2007 447.09 "i"
    "Bulgaria" 2008 491.59 "i"
    "Bulgaria" 2009 512.98 "i"
    "Bulgaria" 2010 491.23 "i"
    "Bulgaria" 2011 455.16 "i"
    "Bulgaria" 2012 478.5 "i"
    "Bulgaria" 2013 534.31 "i"
    "Bulgaria" 2014 569.5 "i"
    "Bosnia and Herzegovina" 2001 373.15 "c"
    "Bosnia and Herzegovina" 2004 564.56 "c"
    "Bosnia and Herzegovina" 2007 615.12 "c"
    "Bosnia and Herzegovina" 2011 585.77 "c"
    "Bosnia and Herzegovina" 2015 567.38 "c"
    "Belarus" 1993 210.58 "I"
    "Belarus" 1995 314.39 "I"
    "Belarus" 1998 134.84 "c"
    "Belarus" 1999 143.06 "c"
    "Belarus" 2000 158.8 "c"
    "Belarus" 2001 197.13 "c"
    "Belarus" 2002 216.55 "c"
    "Belarus" 2003 223.43 "c"
    "Belarus" 2004 250.92 "c"
    "Belarus" 2005 308.52 "c"
    "Belarus" 2006 356.6 "c"
    "Belarus" 2007 409.84 "c"
    "Belarus" 2008 425.87 "c"
    "Belarus" 2009 430.11 "c"
    "Belarus" 2010 485.23 "c"
    "Belarus" 2011 497.07 "c"
    "Belarus" 2012 556.65 "c"
    "Belarus" 2013 620.56 "c"
    "Belarus" 2014 646.27 "c"
    "Belarus" 2015 611.61 "c"
    "Belarus" 2016 589.15 "c"
    "Belize" 1993 340.83 "i"
    "Belize" 1994 331.34 "i"
    "Belize" 1995 420.67 "c"
    "Belize" 1996 266.38 "i"
    "Belize" 1997 275.29 "i"
    "Belize" 1998 250.78 "i"
    "Belize" 1999 251.41 "i"
    "Bolivia" 1990 232.86 "I"
    "Bolivia" 1997 285.01 "i"
    "Bolivia" 1999 249.54 "i"
    "Bolivia" 2000 233.01 "i"
    "Bolivia" 2001 241.68 "i"
    "Bolivia" 2002 241.54 "i"
    "Bolivia" 2004 277.9 "i"
    "Bolivia" 2005 295.6 "i"
    "Bolivia" 2006 314.91 "i"
    "Bolivia" 2007 313.64 "i"
    "Bolivia" 2008 328.94 "i"
    "Bolivia" 2009 347.77 "i"
    "Bolivia" 2011 382.06 "i"
    "Bolivia" 2012 391.02 "i"
    "Bolivia" 2013 431.02 "i"
    "Bolivia" 2014 433.42 "i"
    "Bolivia" 2015 417.83 "i"
    "Bolivia" 2016 402.68 "i"
    "Brazil" 1981 246.71 "i"
    "Brazil" 1982 251.18 "i"
    "Brazil" 1983 212.25 "i"
    "Brazil" 1984 210.96 "i"
    "Brazil" 1985 223.04 "i"
    "Brazil" 1986 398.31 "i"
    "Brazil" 1987 312.75 "i"
    "Brazil" 1988 308.17 "i"
    "Brazil" 1989 358.33 "i"
    "Brazil" 1990 280.73 "i"
    "Brazil" 1992 223.53 "i"
    "Brazil" 1993 281.19 "i"
    "Brazil" 1995 376.94 "i"
    "Brazil" 1996 378.98 "i"
    "Brazil" 1997 374.62 "i"
    "Brazil" 1998 386.01 "i"
    "Brazil" 1999 362.01 "i"
    "Brazil" 2001 405.51 "i"
    "Brazil" 2002 411.77 "i"
    end
    [/CODE]

    I would like to write a command which would delete for example observations for i and I for each country if the number of observations for the sum of these two are lower than the sum of observations for "c" and "C". If the opposite holds, than the observations for c and C should naturally be deleted. I was thinking about using a loop for this maybe, but unfortunately I am not that good yet with using loops. If anyone has an idea how I could write this command, I would be very grateful.

    I thank you for your time.

    Best,

    Ryan Marapin

  • #2
    I am not completely clear what is wanted here except that I am confident that no loops are needed. You can count instances of "c" and "C" and separately of "i" and "l" by

    Code:
    egen count1 = total(lower(inc_con) == "c") , by(country)
    egen count2 = total(lower(inc_con) == "i") , by(country)

    and then the tape is in sight, except that you need a rule if the counts are the same.

    PS After two years on Statalist you should find your way to https://www.statalist.org/forums/help#spelling today!

    Comment


    • #3
      Dear Mr. Cox,

      Thank you for your help, the command works perfectly. I will take a look now at the link that you just posted, sorry if I was writing sloppy posts the past few weeks!

      Thank you once again and have a nice day.

      Best,

      Ryan Marapin

      Comment

      Working...
      X