  • Counting households for which there is info on both 1 male and 1 female

    Hi,
    I have a dataset that should include information on 2 interviewees per household: a female and a male partner. However, I noticed that in many households information was collected on only one of the partners, or on several members of the same gender (e.g. 2 females, or 3 males).

    I would like to count the number of households for which we have information on both 1 male and 1 female.

    An example of the data looks like this (with plenty more observations):
          Gender
    ID    1. male    2. female    Total
    1     0          1            1
    2     2          0            2
    4     0          1            1
    5     0          1            1
    6     2          0            2
    7     0          3            3
    11    0          1            1
    12    0          2            2
    15    0          2            2
    17    0          1            1
    21    1          1            2
    In this small extract of data, only Household ID 21 has info on 1 male and 1 female. How can I derive an exact count of the Households that are like ID 21 (for which there is info on 1 male and 1 female)?

    Thank you very much.


  • #2
    Please provide an extract of your dataset with the dataex command (see FAQ 12.2 https://www.statalist.org/forums/help)
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int sectionaa_new_householda1_househ byte sectionaa_new_householda8_gender
        851 1
       1101 1
        541 1
      10361 1
        667 1
        624 1
        261 1
        811 1
       1017 1
        821 1
      10654 1
        296 1
       1101 1
        751 1
        386 1
        474 1
        917 1
        743 1
        213 1
       1017 1
        292 1
      10693 1
        131 1
       1061 1
        295 1
        296 1
        472 1
        764 1
         72 1
        337 1
        913 1
        386 1
          2 1
        312 1
        624 1
      10351 1
        971 1
         36 1
        281 1
        322 1
        855 1
        951 1
        337 1
        161 1
          2 1
          6 1
        963 1
        967 1
        747 1
        457 1
      10141 1
      10681 1
        163 1
        951 1
        366 1
        723 1
        603 1
        661 1
        476 1
        941 1
        672 1
       1127 1
        724 1
      10654 1
      10613 1
         51 1
      10547 1
      10161 1
        617 1
        291 1
        781 1
        161 1
        192 1
        917 1
        473 1
        263 1
        516 1
      10613 1
        312 1
       1147 1
         43 1
        384 1
       1207 1
         31 1
      10151 1
        874 1
      10391 1
       1201 1
         81 1
        747 1
        501 1
        184 1
        554 1
      10361 1
        805 1
        452 1
      10547 1
        141 1
         71 1
        443 1
      end
      label values sectionaa_new_householda8_gender gender
      label def gender 1 "1. male", modify



      • #4
        So there is no variation on gender here, but I'm assuming there is variation later in the dataset. I added a little variation by putting a few women in the dataset (this is just for illustration with your sample data, you would not do this in your own):

        Code:
        sort sectionaa_new_householda1_househ
        replace sectionaa_new_householda8_gender=2 in 96
        replace sectionaa_new_householda8_gender=2 in 2
        replace sectionaa_new_householda8_gender=2 in 82
        You can then count the number of women and men in each household:
        Code:
        bysort sectionaa_new_householda1_househ: egen count_men=total( sectionaa_new_householda8_gender==1)
        bysort sectionaa_new_householda1_househ: egen count_women=total( sectionaa_new_householda8_gender==2)
        If all you want to know is the number of households with exactly one man and one woman:

        Code:
        count if count_men==1 & count_women==1
        Then just divide that number by 2 (since you know there will be exactly 2 people in each household).

        This solution assumes that you do not have any missing values on your gender variable such that you might have 1 male, 1 female, 1 unknown.
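
        A minimal sketch of how the missing-value caveat could be checked, reusing count_men and count_women from above. The names n_missing and hh_tag are made up for illustration. Tagging one observation per household also means the count needs no division by 2:

        ```stata
        * Sketch only: n_missing and hh_tag are hypothetical variable names
        * number of observations per household with missing gender
        bysort sectionaa_new_householda1_househ: egen n_missing = total(missing(sectionaa_new_householda8_gender))
        * flag exactly one observation per household
        egen hh_tag = tag(sectionaa_new_householda1_househ)
        * households with exactly 1 male, 1 female and no unknown gender
        count if count_men==1 & count_women==1 & n_missing==0 & hh_tag==1
        ```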


        • #5
          Thank you very much. It worked!
          Last edited by Allie Sun; 27 Jul 2018, 01:55.



          • #6
            Would you be able to tell me how to generate a variable that just tells us the number of households for which we have info on the couples? That is, a variable that summarizes the info we just derived, including the division by two.

            I need this to then be able to graph the number of households with couples interviewed per cluster. If I do so right now using the following command:

            graph bar (count) sectionaa_new_householda1_househ if count_men==1 & count_women==1, by(sectionaa_new_householda3_cluste)

            I clearly get double the number.

            I need a variable that already includes the count divided by 2.

            I tried egen and count, but it keeps giving me error messages.

            Thank you very much - your help is truly appreciated, as I am new to this.



            • #7
              The values for count_men & count_women are constant within households. We can easily create a new variable that is 1 if the number of men is 1 and the number of women is 1--this will also be constant within households:

              Code:
              gen only_mf=0
              replace only_mf=1 if count_men==1 & count_women==1
              Now we still have the problem that we have multiple observations per household. So if we want characteristics of the household, we only want to observe/graph a single observation of that household. The command -egen- has a function -tag( )- that selects one observation per specified group:

              Code:
              egen tag=tag(sectionaa_new_householda1_househ)
              I don't have your variable sectionaa_new_householda3_cluste, but I can create a made up group variable:
              Code:
              gen group=0
              replace group=1 in 34/66
              replace group=2 in 67/100
              And graph using the following command:
              Code:
              graph bar (count) sectionaa_new_householda1_househ if only_mf==1 & tag==1, over(group) allcategories
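
              If a number rather than a graph is wanted, something like the following might work, assuming only_mf, tag and the made-up group variable from above exist. n_couples is a hypothetical name:

              ```stata
              * number of households with exactly one man and one woman
              count if only_mf==1 & tag==1
              display r(N)
              * same tally per group, stored as a variable (constant within group)
              egen n_couples = total(only_mf==1 & tag==1), by(group)
              * one-way table as a cross-check
              tabulate group if only_mf==1 & tag==1
              ```

              Because tag==1 picks a single observation per household, none of these figures needs to be divided by 2.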




              • #8
                Very thankful - it worked well.

                Your help is truly appreciated.



                • #9
                  May I ask your support again:

                  I would like to draw a bar graph with the total count on the y axis and the households on the x axis, split by gender. That is, I would like to show each household number and, within it, how many females and males we have.

                  I have used this command:

                  graph bar (count), over(sectionaa_new_householda8_gender) over(sectionaa_new_householda1_househ)

                  However, with a total of 871 observations and 527 households, I can't see much from the graph.

                  I fear that also in this case, the household numbers are repeated more than once - while I would like just one household number and for each to be able to see how many males and females we have info on.

                  I suppose that the tag function would help me but I am not sure how to apply it in this case; or I might just need to work on the graph so that 527 households fit clearly in it.

                  Thanks again for all your support.
                  Last edited by Allie Sun; 30 Jul 2018, 03:27.



                  • #10
                    527 households identified explicitly on a graph??? I can't imagine a design that would work unless your graph size is a few metres.

                    The data example in #3 -- as Carole tactfully pointed out -- is not helpful here. Modifying the data example in #1 I can suggest one kind of graph that should work: it shows the joint distribution of number of females and number of males.

                    Note that

                    1. It's not clear why you have more observations than households if your data are like #1.

                    2. You need to install tabplot before you can use it.

                    Code:
                    search tabplot, sj
                    to get a clickable download link. At the time of writing you need files from gr0066_1.

                    SJ-17-3 gr0066_1 . . . . . . . . . . . . . . . . Software update for tabplot
                    (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
                    Q3/17 SJ 17(3):779
                    added options for reversing axis scales; improved handling of
                    axis labels containing quotation marks

                    SJ-16-2 gr0066 . . . . . . Speaking Stata: Multiple bar charts in table form
                    (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
                    Q2/16 SJ 16(2):491--510
                    provides multiple bar charts in table form representing
                    contingency tables for one, two, or three categorical variables


                    Code:
                    
                    clear 
                    input ID    male    female    Total
                    1    0    1    1
                    2    2    0    2
                    4    0    1    1
                    5    0    1    1
                    6    2    0    2
                    7    0    3    3
                    11    0    1    1
                    12    0    2    2
                    15    0    2    2
                    17    0    1    1
                    21    1    1    2
                    end 
                    
                    tabplot male female, showval yasis xasis bfcolor(none)
                    [Figure: malefemale.png]
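
                    To get from person-level data like #3 to household counts that tabplot can use, a sketch along these lines might work. It assumes gender is coded 1 = male and 2 = female as in #4; n_male, n_female and hh_first are made-up names:

                    ```stata
                    * Sketch only: n_male, n_female and hh_first are hypothetical names
                    * count males and females within each household
                    bysort sectionaa_new_householda1_househ: egen n_male = total(sectionaa_new_householda8_gender == 1)
                    bysort sectionaa_new_householda1_househ: egen n_female = total(sectionaa_new_householda8_gender == 2)
                    * keep one observation per household for plotting
                    egen hh_first = tag(sectionaa_new_householda1_househ)
                    tabplot n_male n_female if hh_first, showval yasis xasis bfcolor(none)
                    ```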
