  • Leave one out (random household) mean from community average

    To calculate the leave-one-out community mean of basic sanitation access (the cluster mean over all other households, leaving out the household under consideration), I used this code:
    Code:
    by cluster, sort: egen numerator = total(basic_sanitation)
    by cluster: egen denominator = count(basic_sanitation)
    gen comm_basic_san = (numerator - basic_sanitation)/(denominator - 1)
    However, I now need to calculate the community-level mean of basic_sanitation leaving out a random household instead, and also leaving out 2-3 random households. I have 28 households in a community, each of which either has or does not have basic sanitation access. Help with code would be appreciated. Thank you!

  • #2
    You don't provide example data, so this code is untested. But I believe what you want is:
    Code:
    local n_to_omit 2
    
    by cluster, sort: gen double shuffle = runiform()
    by cluster (shuffle), sort: gen byte omit = _n <= `n_to_omit'
    
    by cluster, sort: egen numerator = total(cond(!omit, basic_sanitation, .) )
    by cluster: egen denominator = count(cond(!omit, basic_sanitation, .))
    I have interpreted your request as asking for the same number of households to be omitted in each community. The code above omits 2 from each, and it is clear how to change that to 3 or any other desired number. If you want the number of households omitted to vary randomly from one community to the next, it is a bit more complicated.

    In that case, if you need additional advice, please use the -dataex- command and show example data when you post back. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it.

    -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
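
For the case #2 calls "a bit more complicated" (a number of omitted households that varies randomly between communities), one possible approach is sketched below. It is untested against the real data and rests on assumptions that are not in the thread: the toy data, the seed, the range of 1 to 3 omitted households, and the names n_omit and comm_basic_san_rand are all hypothetical.

```stata
* Toy data, purely illustrative: 3 hypothetical clusters of 20 households
clear
set seed 12345                          // assumed seed, only for reproducibility
set obs 60
gen int cluster = ceil(_n/20)
gen byte basic_sanitation = runiform() < 0.6

* Draw one omission count (1, 2 or 3) per cluster, then copy it to every household
by cluster, sort: gen byte n_omit = runiformint(1, 3) if _n == 1
by cluster: replace n_omit = n_omit[1]

* Shuffle households within cluster and flag the first n_omit of them
gen double shuffle = runiform()
by cluster (shuffle), sort: gen byte omit = _n <= n_omit

* Cluster mean over the households that were not flagged
by cluster: egen numerator = total(cond(!omit, basic_sanitation, .))
by cluster: egen denominator = count(cond(!omit, basic_sanitation, .))
gen comm_basic_san_rand = numerator/denominator
```

The only change from the fixed-count version is that the local macro is replaced by a cluster-level variable. Households with a missing basic_sanitation value can still be drawn for omission in this sketch; if that matters, restrict the shuffle to non-missing observations first.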





    • #3
      Thank you very much @Clyde for your usual kind and detailed response.
      As I posted above, to generate the community-level sanitation mean (leaving out the particular household under consideration), I used the following code:

      Code:
      by cluster, sort: egen numerator = total(basic_sanitation)
      by cluster: egen denominator = count(basic_sanitation)
      gen comm_basic_san = (numerator - basic_sanitation)/(denominator - 1)

      Now I am asked to use an alternative community variable, a leave-one-out cluster mean that leaves out a different (random) household, or 2-3 random households, to see the variation in results and observe how leaving out your own household's sanitation affects the community average. (Do I need to leave out the particular household under consideration as well when leaving out a random household?) Help with code for both options would be appreciated. I have 28 households in a community, each of which either has or does not have basic sanitation access.

      Below is my data, with the variables cluster (cluster ID), hh_number (household number), basic_sanitation (household sanitation access, basic/non-basic), cluster_basic_san (leave-one-out cluster mean), and community_avg_san (full community-level sanitation mean).

      Code:
      input int cluster byte hh_number float(basic_sanitation cluster_basic_san community_avg_san)
      1 1 1 .9583333 .96
      1 2 1 .9583333 .96
      1 3 1 .9583333 .96
      1 4 1 .9583333 .96
      1 5 1 .9583333 .96
      1 6 1 .9583333 .96
      1 7 1 .9583333 .96
      1 8 1 .9583333 .96
      1 9 1 .9583333 .96
      1 10 1 .9583333 .96
      1 11 1 .9583333 .96
      1 12 1 .9583333 .96
      1 14 1 .9583333 .96
      1 15 1 .9583333 .96
      1 16 1 .9583333 .96
      1 19 1 .9583333 .96
      1 20 1 .9583333 .96
      1 21 1 .9583333 .96
      1 22 0 1 .96
      1 23 1 .9583333 .96
      1 24 1 .9583333 .96
      1 25 1 .9583333 .96
      1 26 1 .9583333 .96
      1 27 1 .9583333 .96
      1 28 1 .9583333 .96
      2 1 1 1 1
      2 2 1 1 1
      2 3 1 1 1
      2 4 1 1 1
      2 5 1 1 1
      2 6 1 1 1
      2 8 1 1 1
      2 9 1 1 1
      2 10 1 1 1
      2 11 1 1 1
      2 12 1 1 1
      2 13 1 1 1
      2 14 1 1 1
      2 15 1 1 1
      2 16 1 1 1
      2 17 1 1 1
      2 18 1 1 1
      2 19 1 1 1
      2 20 1 1 1
      2 21 1 1 1
      2 22 1 1 1
      2 23 1 1 1
      2 24 1 1 1
      2 25 1 1 1
      2 26 1 1 1
      2 27 1 1 1
      2 28 1 1 1
      3 1 0 .8 .7692308
      3 2 1 .76 .7692308
      3 3 1 .76 .7692308
      3 4 0 .8 .7692308
      3 5 1 .76 .7692308
      3 6 1 .76 .7692308
      3 7 1 .76 .7692308
      3 8 1 .76 .7692308
      3 9 1 .76 .7692308
      3 10 0 .8 .7692308
      3 11 1 .76 .7692308
      3 12 1 .76 .7692308
      3 13 1 .76 .7692308
      3 16 1 .76 .7692308
      3 17 1 .76 .7692308
      3 18 0 .8 .7692308
      3 19 0 .8 .7692308
      3 20 1 .76 .7692308
      3 21 1 .76 .7692308
      3 22 1 .76 .7692308
      3 23 1 .76 .7692308
      3 24 0 .8 .7692308
      3 25 1 .76 .7692308
      3 26 1 .76 .7692308
      3 27 1 .76 .7692308
      3 28 1 .76 .7692308
      4 1 1 .5 .52
      4 2 1 .5 .52
      4 3 0 .5416667 .52
      4 4 1 .5 .52
      4 5 0 .5416667 .52
      4 6 1 .5 .52
      4 7 0 .5416667 .52
      4 8 1 .5 .52
      4 9 0 .5416667 .52
      4 10 0 .5416667 .52
      4 11 1 .5 .52
      4 12 0 .5416667 .52
      4 13 0 .5416667 .52
      4 14 1 .5 .52
      4 15 0 .5416667 .52
      4 17 1 .5 .52
      4 18 1 .5 .52
      4 20 0 .5416667 .52
      4 21 1 .5 .52
      4 22 0 .5416667 .52
      4 24 0 .5416667 .52
      4 25 1 .5 .52
      4 26 1 .5 .52
      4 27 0 .5416667 .52
      4 28 1 .5 .52
      5 1 1 .6190476 .6363636
      5 2 1 .6190476 .6363636
      5 4 1 .6190476 .6363636
      5 5 1 .6190476 .6363636
      5 6 1 .6190476 .6363636
      5 7 0 .6666667 .6363636
      5 9 1 .6190476 .6363636
      5 10 1 .6190476 .6363636
      5 12 1 .6190476 .6363636
      5 13 0 .6666667 .6363636
      5 14 1 .6190476 .6363636
      5 15 0 .6666667 .6363636
      5 17 0 .6666667 .6363636
      5 18 1 .6190476 .6363636
      5 19 0 .6666667 .6363636
      5 20 0 .6666667 .6363636
      5 21 1 .6190476 .6363636
      5 23 0 .6666667 .6363636
      5 24 1 .6190476 .6363636
      5 25 0 .6666667 .6363636
      5 27 1 .6190476 .6363636
      5 28 1 .6190476 .6363636
      6 1 0 .13043478 .125
      6 2 0 .13043478 .125
      6 4 0 .13043478 .125
      6 5 0 .13043478 .125
      6 6 0 .13043478 .125
      6 7 1 .08695652 .125
      6 8 0 .13043478 .125
      6 9 0 .13043478 .125
      6 10 1 .08695652 .125
      6 11 0 .13043478 .125
      6 12 1 .08695652 .125
      6 13 0 .13043478 .125
      6 14 0 .13043478 .125
      6 15 0 .13043478 .125
      6 16 0 .13043478 .125
      6 17 0 .13043478 .125
      6 18 0 .13043478 .125
      6 19 0 .13043478 .125
      6 20 0 .13043478 .125
      6 21 0 .13043478 .125
      6 23 0 .13043478 .125
      6 24 0 .13043478 .125
      6 25 0 .13043478 .125
      6 28 0 .13043478 .125
      end



      • #4
        I'm sorry, but I don't understand what is being asked for here. Can you show how you would go about this by hand, illustrating the approach with a small cluster?



        • #5
          Thank you Clyde for your reply. I will try to explain further using one cluster. Below is the leave-one-out cluster mean for cluster #1 from my previous calculations. This cluster has a total of 25 households. The variable cluster_basic_san shows the leave-one-out mean of cluster-level sanitation.

          cluster hh_number basic_sanitation cluster_basic_san denominator numerator
          1 1 1 .9583333 25 24
          1 2 1 .9583333 25 24
          1 3 1 .9583333 25 24
          1 4 1 .9583333 25 24
          1 5 1 .9583333 25 24
          1 6 1 .9583333 25 24
          1 7 1 .9583333 25 24
          1 8 1 .9583333 25 24
          1 9 1 .9583333 25 24
          1 10 1 .9583333 25 24
          1 11 1 .9583333 25 24
          1 12 1 .9583333 25 24
          1 14 1 .9583333 25 24
          1 15 1 .9583333 25 24
          1 16 1 .9583333 25 24
          1 19 1 .9583333 25 24
          1 20 1 .9583333 25 24
          1 21 1 .9583333 25 24
          1 22 0 1 25 24
          1 23 1 .9583333 25 24
          1 24 1 .9583333 25 24
          1 25 1 .9583333 25 24
          1 26 1 .9583333 25 24
          1 27 1 .9583333 25 24
          1 28 1 .9583333 25 24
          Now, instead of leaving out the particular household under consideration, I want an alternative community variable: a leave-one-out cluster mean that leaves out a different (random) household or households. (Do I need to leave out the particular household under consideration as well when leaving out a random household?) Help with code for both options would be appreciated.
          Last edited by Chanda Moon; 17 Apr 2024, 20:33.
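
As a hand calculation for cluster 1, which is what #4 asks for, the three candidate definitions can be written out using the totals in the table above (24 households with basic sanitation out of 25). This is only one reading of the request; the symbols $x_i$ (the current household's value) and $x_r$ (the randomly drawn household's value) are introduced here for illustration.

```latex
\begin{align*}
\text{own household left out:}\quad & \frac{24 - x_i}{25 - 1} \\
\text{random household } r \text{ left out:}\quad & \frac{24 - x_r}{25 - 1} \\
\text{both left out } (i \neq r):\quad & \frac{24 - x_r - x_i}{25 - 2}
\end{align*}
```

If the random draw happens to be household 22 ($x_r = 0$), the second definition gives $24/24 = 1$ for every household, and the third gives $23/23 = 1$ for every other household. If the draw is any other household ($x_r = 1$), the second definition gives $23/24 \approx 0.9583$ for every household, while the third gives $22/23 \approx 0.9565$ for households with basic sanitation and $23/23 = 1$ for household 22.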



          • #6
            The data appears to have almost no variability. Is this always true?



            • #7
              Isn't this just the same thing as in #2, with the value of n_to_omit set to 1 instead of 2?

              If you want to leave out both the current observation and a random one:
              Code:
              local n_to_omit 1
              
              by cluster, sort: gen double shuffle = runiform()
              by cluster (shuffle), sort: gen byte omit = _n <= `n_to_omit'
              
              //    REMOVE THE RANDOM HOUSEHOLD
              by cluster, sort: egen numerator = total(cond(!omit, basic_sanitation, .) )
              by cluster: egen denominator = count(cond(!omit, basic_sanitation, .))
              
              //    REMOVE THE CURRENT HOUSEHOLD, BUT ONLY IF IT IS NOT ALREADY OMITTED
              replace numerator = numerator - basic_sanitation if !omit
              replace denominator = denominator -1 if !omit
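
The code in #7 stops at the adjusted numerator and denominator. A minimal sketch of the remaining step, the division that yields the mean, is shown below on made-up data so the result can be checked by eye. The toy data, the seed, and the name comm_basic_san_rand are assumptions, not part of the thread.

```stata
* Toy data, purely illustrative: 2 hypothetical clusters of 5 households each
clear
set seed 2024                           // assumed seed, only for reproducibility
set obs 10
gen byte cluster = 1 + (_n > 5)
gen byte basic_sanitation = mod(_n, 2)

local n_to_omit 1

* Flag one random household per cluster, as in #7
gen double shuffle = runiform()
by cluster (shuffle), sort: gen byte omit = _n <= `n_to_omit'

* Remove the random household from the cluster totals
by cluster: egen numerator = total(cond(!omit, basic_sanitation, .))
by cluster: egen denominator = count(cond(!omit, basic_sanitation, .))

* Remove the current household as well, unless it is the one already omitted
replace numerator = numerator - basic_sanitation if !omit
replace denominator = denominator - 1 if !omit

* Final step: the leave-out mean itself
gen comm_basic_san_rand = numerator/denominator
list cluster basic_sanitation omit numerator denominator comm_basic_san_rand, sepby(cluster)
```

In each toy cluster the randomly omitted household ends up with a denominator of 4 and every other household with a denominator of 3, which is a quick way to confirm that both removals were applied. Without a set seed line, the omitted household changes from run to run.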



              • #8
                Thank you very much @Clyde for your kind response. Yes, I think this is almost the same thing as in #2, with a little modification. Thank you George for looking into the matter. You are right: in this cluster there is almost no variation, but there is variation in other clusters. In #3 I presented only some of the 550 clusters.
