
  • Dealing with 1/0 values in STATA

    Hello all,

    I have a peculiar problem. I am using a woman-level dataset that records each woman's birth history by year. The dataset contains a unique id (caseid), the year of each birth (b2), and the sex of the child (b4, where 1 stands for male and 2 stands for female). I am trying to calculate the sex ratio at birth for each woman in each year. I define the sex ratio at birth as total female births/total male births. In years in which a woman gave birth to only 1 female and 0 males, the sex ratio is calculated as 1/0, which STATA treats as a missing value. I end up with too many missing values, which is not working in my favor. Please see the data below.

    One option is to drop the sex ratio at birth and use the proportion of females among total births instead. However, I do not want to do this, for several reasons. One is that internationally the sex ratio at birth, not the proportion of females among total births, is the variable used to measure gender discrimination.
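
    To make the two measures concrete, here is a minimal sketch on a few made-up counts (not the data below). The variable name prop_female is hypothetical, chosen only for illustration.

```stata
* Toy counts only: four made-up woman-year combinations
clear
input byte(total_females total_males)
1 0
0 1
2 1
1 1
end

* Ratio of female to male births: division by zero yields missing (.)
gen sexratio = total_females / total_males

* Proportion of females among all births (hypothetical name): defined
* whenever there is at least one birth
gen prop_female = total_females / (total_females + total_males)

list, noobs
count if missing(sexratio)
count if missing(prop_female)
```

    The ratio is missing exactly where there are no male births, while the proportion stays defined for every row with at least one birth.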

    Can anyone suggest a way around this? I will be very grateful.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str15 caseid int b2 byte b4 float(total_females total_males sexratio)
    "  0100100110 02" 2015 2 1 0 .
    "  0100100110 02" 2018 1 2 1 2
    "  0100100110 02" 2018 2 2 1 2
    "  0100100110 02" 2018 2 2 1 2
    "  0100100114 02" 2011 2 1 0 .
    "  0100100114 02" 2013 1 0 1 0
    "  0100100117 05" 1992 2 1 0 .
    "  0100100117 05" 1994 1 0 1 0
    "  0100100117 05" 1996 1 0 1 0
    "  0100100117 05" 2001 2 1 0 .
    "  0100100122 02" 2006 2 1 0 .
    "  0100100122 02" 2008 1 0 1 0
    "  0100100122 02" 2010 1 0 1 0
    "  0100100127 02" 2017 1 0 1 0
    "  0100100136 02" 2009 1 0 1 0
    "  0100100136 02" 2011 2 1 0 .
    "  0100100136 02" 2013 2 1 0 .
    "  0100100145 04" 2012 1 0 1 0
    "  0100100145 04" 2013 2 1 0 .
    "  0100100145 04" 2017 1 0 1 0
    "  0100100146 02" 2012 1 0 1 0
    "  0100100146 02" 2014 2 1 0 .
    "  0100100146 02" 2016 1 0 1 0
    "  0100100150 02" 2011 2 1 0 .
    "  0100100150 02" 2013 1 0 1 0
    "  0100100150 02" 2014 2 1 0 .
    "  0100100156 02" 2018 2 1 0 .
    "  0100100159 02" 2013 1 0 1 0
    "  0100100164 02" 2003 2 1 0 .
    "  0100100164 02" 2005 2 1 0 .
    "  0100100164 02" 2010 2 1 0 .
    "  0100100164 02" 2016 2 1 0 .
    "  0100100185 02" 2018 2 1 0 .
    "  0100100187 02" 2005 1 0 1 0
    "  0100100187 02" 2008 1 0 1 0
    "  0100100187 02" 2010 2 1 0 .
    "  0100100195 02" 2019 2 1 1 1
    "  0100100195 02" 2019 1 1 1 1
    "  0100100201 02" 1992 1 0 1 0
    "  0100100201 02" 1993 2 1 0 .
    "  0100100201 02" 1994 2 1 0 .
    "  0100100201 02" 1996 1 0 1 0
    "  0100100201 04" 2017 1 0 1 0
    "  0100100214 02" 2013 2 1 0 .
    "  0100100214 02" 2015 1 0 1 0
    "  0100100214 02" 2018 2 1 0 .
    "  0100100224 04" 2006 2 1 0 .
    "  0100100224 04" 2008 1 0 1 0
    "  0100100224 04" 2010 1 0 1 0
    "  0100100224 04" 2012 2 1 0 .
    "  0100100234 04" 2012 1 0 1 0
    "  0100100234 04" 2015 2 1 0 .
    "  0100100244 04" 2012 2 1 0 .
    "  0100100244 04" 2016 1 0 1 0
    "  0100100251 04" 2007 1 0 1 0
    "  0100100251 04" 2009 1 0 1 0
    "  0100100251 04" 2011 2 1 0 .
    "  0100100251 04" 2013 2 1 0 .
    "  0100100254 02" 2013 2 1 0 .
    "  0100100254 02" 2017 1 0 1 0
    "  0100100265 02" 2008 2 1 0 .
    "  0100100269 04" 2001 1 0 1 0
    "  0100100269 04" 2003 2 1 0 .
    "  0100100269 04" 2012 1 0 1 0
    "  0100100270 02" 2009 2 1 0 .
    "  0100100270 02" 2015 2 1 0 .
    "  0100100270 02" 2018 1 0 1 0
    "  0100100271 02" 2011 1 0 1 0
    "  0100100271 02" 2014 2 1 0 .
    "  0100100276 04" 2016 1 0 1 0
    "  0100100279 02" 1991 1 0 1 0
    "  0100100279 04" 2017 2 1 0 .
    "  0100100279 04" 2019 1 0 1 0
    "  0100100284 02" 1996 2 1 0 .
    "  0100100284 02" 1999 1 0 1 0
    "  0100100284 02" 2002 1 0 1 0
    "  0100100288 02" 1995 1 0 1 0
    "  0100100288 02" 1998 2 1 0 .
    "  0100100288 02" 2000 2 1 0 .
    "  0100100291 02" 1997 2 1 0 .
    "  0100100291 02" 1999 2 1 0 .
    "  0100100291 02" 2001 1 0 1 0
    "  0100100291 02" 2004 1 0 1 0
    "  0100100305 02" 2006 2 1 0 .
    "  0100100305 02" 2009 2 1 0 .
    "  0100100307 04" 2018 1 0 1 0
    "  0100100312 02" 2012 1 0 1 0
    "  0100100312 02" 2018 2 1 0 .
    "  0100100316 04" 2016 1 0 1 0
    "  0100100319 02" 2007 1 0 1 0
    "  0100100319 02" 2009 2 1 0 .
    "  0100100319 02" 2011 1 0 1 0
    "  0100100325 02" 2008 2 1 0 .
    "  0100100325 02" 2010 1 0 1 0
    "  0100100332 02" 2019 1 0 1 0
    "  0100100335 02" 2004 2 1 0 .
    "  0100100335 02" 2006 1 0 1 0
    "  0100100335 02" 2008 1 0 1 0
    "  0100100336 02" 1997 1 0 1 0
    "  0100100336 02" 1999 2 1 0 .
    end
    label values b4 B4
    label def B4 1 "male", modify
    label def B4 2 "female", modify
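
    For reference, the counts and ratio in the example above could have been built along the following lines. This is only a sketch on the first six rows of the example; the names n_female, n_male and ratio are hypothetical and stand in for total_females, total_males and sexratio.

```stata
* First six births from the example above: caseid, year (b2), sex (b4)
clear
input str15 caseid int b2 byte b4
"  0100100110 02" 2015 2
"  0100100110 02" 2018 1
"  0100100110 02" 2018 2
"  0100100110 02" 2018 2
"  0100100114 02" 2011 2
"  0100100114 02" 2013 1
end

* Count female and male births within each woman-year
bysort caseid b2: egen n_female = total(b4 == 2)
bysort caseid b2: egen n_male   = total(b4 == 1)

* Female/male ratio: missing whenever n_male is 0
gen ratio = n_female / n_male

* Show how often each value occurs, including missing
tabulate ratio, missing
```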


  • #2
    There is no way round this that I can see. It is risky to pontificate on the topic without specific expertise in medical statistics, but the point seems mostly statistical common sense.

    Sex ratio is as you say widely used as a measure and that makes perfect sense whenever there are some females and some males.

    But at the level of individual mothers in individual years, it's worse than you say: your decision to stick with sex ratio commits you to handling other possibilities such as 2/0, 3/0 and so forth.

    But why do you want to do this? Again, sex ratio makes sense for social science or epidemiological purposes for communities at various scales from say villages to nations and for snapshots (population at 1 July 2023) or time intervals (births in 2022). It doesn't make much sense at (individual, year) level. Setting aside multiple births, and different births in the same year, the two most common situations are, I presume,

    1 birth with 1 son

    1 birth with 1 daughter

    -- so your choice of sex ratio as a measure commits you to a measure with two levels that are both common: zero and indeterminate. You need something else, not a work-around.
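
    One reading of the point about scale is that the ratio becomes usable once births are pooled across mothers, for example by year. The following is a minimal sketch under that assumption, on made-up rows rather than the poster's data; the variables female and male are hypothetical helper names.

```stata
* Made-up births: year (b2) and sex (b4: 1 = male, 2 = female)
clear
input int b2 byte b4
2013 1
2013 2
2013 2
2013 1
2013 1
2018 1
2018 2
2018 2
2018 2
end

* Indicator variables for each sex (hypothetical names)
gen byte female = b4 == 2
gen byte male   = b4 == 1

* Pool births across all mothers within each year
collapse (sum) female male, by(b2)

* Female/male ratio per year: defined whenever a year has at least one male birth
gen sexratio = female / male
list, noobs
```

    With enough births in each year a zero denominator becomes unlikely, which is why the measure behaves better at this scale than for a single mother in a single year.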

    Please note https://www.statalist.org/forums/help#spelling



    • #3
      Thank you for your very prompt and insightful response, Nick! Your note on the spelling error is noted!
