  • Generate variable with percentage of another variable

    Hello everyone,

    I need to create a variable that measures neighborhood effects.

    Let me explain: I have a dataset of students from different classes and schools, and a variable that measures retention. I want to create a variable that gives the retention rate of each student's class, calculated without that student's own value.

    Any idea on how to do this would be very much appreciated.

    Thank you
    MD

  • #2
    ?

    egen retrate_class = mean(retrate), by(class)



    • #3
      Please read the FAQ to learn how to ask answerable questions and how to post data examples. At a guess, the following FAQ may help: https://www.stata.com/support/faqs/d...ng-properties/



      • #4
        Originally posted by George Ford
        ?

        egen retrate_class = mean(retrate), by(class)

        Hi,

        I had tried that before, but it gives every student in a class the same value, whichever student is considered.



        • #5
          https://www.stata.com/support/faqs/d...ng-properties/



          • #6

            I think this is quite easy to do, but without a useful data example, hardly anyone will be willing to help. Read #3 again!



            • #7
              I've finally found the solution. Thanks to all.

              Here is the code I used, for anyone looking for the same thing:


              bysort classes: egen tot_students = count(classes)
              bysort classes: egen total_retention = total(retention)
              gen pc_retent = (total_retention - retention) / (tot_students - 1)
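
Written as a formula, the three lines above compute a leave-one-out mean. The notation here is introduced only for illustration and is not the poster's: r_i is the retention value of student i, c(i) is that student's class, and n_c is the number of students in the class. It assumes retention is never missing.

```latex
\[
  \bar{r}_{-i}
  = \frac{\left(\sum_{j \in c(i)} r_j\right) - r_i}{n_{c(i)} - 1}
\]
```

The numerator corresponds to total_retention - retention and the denominator to tot_students - 1, so a class with a single student yields a missing value.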



              • #8
                The essence of the code in #7 was already given in the FAQ linked in #5. As said, this is a Frequently Asked Question.

                Note that you are assuming that retention is never missing. If it is ever missing, then your variable (which, despite its name, looks like a proportion to me, not a percentage) will be miscalculated for those individuals. Even if retention is missing, it's still possible to calculate the retention proportion for the other students.

                So, more general code would be

                Code:
                gen pr_retent = (total_retention - cond(missing(retention), 0, retention)) / (tot_students - !missing(retention))
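
One caveat when combining this line with the variables from #7: tot_students there is count(classes), which also counts students whose retention is missing, so the denominator would be too large in classes with missing values. A minimal sketch that counts only non-missing retention values could look like the following. The toy data and the name n_nonmiss are assumptions made up for illustration, not taken from the thread.

```stata
* Toy data: hypothetical values, for illustration only
clear
input byte classes byte retention
1 1
1 0
1 .
2 1
2 1
2 0
end

* Per class: number of non-missing retention values and their sum
bysort classes: egen n_nonmiss = count(retention)
bysort classes: egen total_retention = total(retention)

* Leave-one-out proportion: remove the student's own value from the
* numerator and the denominator only when that value is not missing
gen pr_retent = (total_retention - cond(missing(retention), 0, retention)) / (n_nonmiss - !missing(retention))

list, sepby(classes)
```

With this version, a student whose own retention is missing still gets the proportion for the rest of the class, and a student who is the only non-missing case in a class gets a missing result (0/0).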

