  • Calculating Proportions

    I have village-level (v001) and household-level (v002) data. I want to calculate the proportion of households in each village that have bank accounts.

    I have 1,700 villages, and the number of households per village varies from 1 to 10; in total I have 7,540 households.
    Bank account is a dummy variable that equals 1 when the household has a bank account and 0 otherwise.

    Is there a way of doing this in Stata?
    My other option is to do this manually, but I think that would be taxing because I have many values.



    Thank you.

  • #2
    Assuming that the bank account dummy takes the value 1 if an individual has a bank account:

    Code:
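    * Running sum of the dummy within each village, divided by the village's observation count
    bys v001: gen proportion= sum(bank_account)/_N
    * The last observation holds the full-village total, so copy it to every observation
    bys v001: replace proportion= proportion[_N]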
    If the bank account dummy has missing values and you want to exclude them from the calculation:

    Code:
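    * 1 if the dummy is observed, 0 if it is missing
    gen nmissing= cond(missing( bank_account ), 0, 1)
    * sum() treats missing values as zero, so only observed values enter the numerator
    bys v001: gen proportion= sum(bank_account)/sum(nmissing)
    bys v001: replace proportion= proportion[_N]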
    Last edited by Andrew Musau; 06 Aug 2016, 06:23.
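
    As a possible shortcut, the village-level proportion of a 0/1 dummy is simply its mean, and egen's mean() function skips missing values. The sketch below runs on a small made-up dataset; the values and the name proportion_alt are illustrative assumptions, not part of the original data.

    ```stata
    * Hypothetical toy data: two villages, one missing answer in village 1
    clear
    input float(v001 bank_account)
    1 1
    1 0
    1 .
    2 1
    2 1
    end
    * Mean of the dummy within each village; missing values are ignored
    egen proportion_alt = mean(bank_account), by(v001)
    list, sepby(v001)
    ```

    This should agree with the second version above, which also leaves missing values out of the denominator.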



    • #3
      Thank you for the response.
      The code has been so helpful.

      But I have a quick question:
      Since I have duplicate households in the same village in some instances, will my proportions still be valid if I don't drop the duplicates so that each household appears only once per village?

      Code:
      village number    household number
      3    7
      3    7
      3    39
      3    39
      3    39
      3    47
      3    56
      3    80
      3    80
      3    80
      4    25
      4    25
      4    68
      8    52
      8    115
      9    52
      10    75
      10    75
      10    75
      P.S. Some households appear more than once within a village because I am using survey data, and the question used to construct the outcome variable was put to different individuals in the same household.

      If I use the command below, my sample size falls considerably, from 7,540 to about 147, which is why I am a bit wary of dropping the duplicate households.
      Code:
      duplicates drop household number
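
      One possible reason for such a large drop, judging from the listing above where household 52 appears in both villages 8 and 9, is that household numbers are reused across villages, so the household number alone does not identify a household. A minimal sketch on made-up data (the variable names tag_hh_only and tag_village_hh and the values are illustrative assumptions) that compares the two ways of counting distinct households:

      ```stata
      * Hypothetical toy data: household 52 exists in two different villages
      clear
      input float(village_number household_number)
      8 52
      8 115
      9 52
      9 52
      end
      * One tagged observation per distinct household number
      egen byte tag_hh_only = tag(household_number)
      * One tagged observation per distinct village-household pair
      egen byte tag_village_hh = tag(village_number household_number)
      * Household number alone finds 2 households; the pair finds 3
      count if tag_hh_only
      count if tag_village_hh
      ```

      If the two counts differ in the real data, households need to be identified by village and household number together.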



      • #4
        Since you want the proportion of households, you need to establish a rule for what to do with a household in which not all members have bank accounts. For example, you can say that if at least one household member has a bank account, then the household has a bank account, and the dummy for that household takes the value 1. With a technique for counting each household just once, you need not delete any observations.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float(village_number household_number bank_account)
         3   7 1
         3   7 0
         3  39 1
         3  39 1
         3  39 .
         3  47 0
         3  56 0
         3  80 0
         3  80 1
         3  80 .
         4  25 1
         4  25 1
         4  68 0
         8  52 0
         8 115 0
         9  52 .
        10  75 1
        10  75 1
        10  75 1
        end
        Code:
        * 1/0 dummy: max() replaces all observations of a particular household with 1 if at least one member has bank_account==1
        bys village_number household_number: egen bank_account2= max(bank_account)
        *Need to sort because missing values are last in the order
        sort village_number household_number bank_account2
        *Generate a variable for the first observation in a particular household
        by village_number household_number: gen first= _n==1
        replace bank_account2=0 if first==0
        *Get proportion as with code #2
        bys village_number :gen proportion= sum(bank_account2)/sum(first)
        bys village_number: replace proportion= proportion[_N]
        list, sepby(village_number)
        Last edited by Andrew Musau; 06 Aug 2016, 12:37.
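
        The same "count each household once" idea can also be sketched with egen's tag() and mean() functions instead of sorting and running sums. The variable names hh_account, hh_tag, hh_once and proportion2 are hypothetical, and the data are copied from the example in #4.

        ```stata
        * Data copied from the -dataex- example in #4
        clear
        input float(village_number household_number bank_account)
         3   7 1
         3   7 0
         3  39 1
         3  39 1
         3  39 .
         3  47 0
         3  56 0
         3  80 0
         3  80 1
         3  80 .
         4  25 1
         4  25 1
         4  68 0
         8  52 0
         8 115 0
         9  52 .
        10  75 1
        10  75 1
        10  75 1
        end
        * Household-level dummy: 1 if any member reports an account (max() ignores missing values)
        bys village_number household_number: egen hh_account = max(bank_account)
        * Tag exactly one observation per village-household pair
        egen byte hh_tag = tag(village_number household_number)
        * Keep the household value on the tagged observation only; all others become missing
        gen hh_once = hh_account if hh_tag
        * mean() skips missing values, so each household enters the village mean once
        bys village_number: egen proportion2 = mean(hh_once)
        list, sepby(village_number)
        ```

        One difference worth noting: this sketch gives a missing proportion for village 9, whose only household has no recorded answer, whereas the code in #4 keeps that household in the denominator and returns 0. Which treatment is appropriate depends on how you want to handle households with no valid response.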

