  • How to generate the number of schools a student is eligible for within his own state and year?


    I have the following dataset:

    school id score state year

    university_a 1 150 Cal 2000
    university_a 2 152 Cal 2000
    university_b 3 151 Ny 2000

    Each school enrolls individuals in each state and year. I want to find the number of schools that each individual can enroll in, based on his score and the cutoff scores of all schools that participated in the enrollment process in his state and year.

    I can solve the problem by collapsing the dataset to get the cutoff for each school-state-year combination and merging the cutoffs back into the original dataset. But this would create over 1000 variables, and the dataset becomes too large.

    Is there a solution that does not involve creating a variable for each cutoff?

  • #2
    Cross-posted at https://stackoverflow.com/questions/...istic-in-stata

    Please note our policy on cross-posting, which is that you are asked to tell us about it. This is explicit in the FAQ Advice, which all members are asked to read before posting.

    A bigger deal for you is that I think you're presuming that people are familiar with the procedures you're talking about and know what cutoff scores mean, and so forth. This is an international forum, so that may be wrong. I am from Britain and have no idea. More positively, many people here are based in the US, so you could be in luck. But if you don't get a positive response, I think you'll need to try harder to explain your problem.



    • #3
      Building on Nick's advice, and looking back at your previous posts, you should review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post, noting especially sections 9-12 on how to best pose your question.

      The more you help others understand your problem, the more likely others are to be able to help you solve your problem. If you can't tell us what you've done, we can't really tell you how you could improve what you've done.

      In particular, you show a very limited selection of sample data
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str12 school byte id int score str3 state int year
      "university_a" 1 150 "Cal" 2000
      "university_a" 2 152 "Cal" 2000
      "university_b" 3 151 "Ny"  2000
      end
      where id presumably identifies a student, based on your description. You then leave it to our imagination what defines the "cutoffs for each school-state-year combination". You leave unspecified whether state identifies the location of the student or of the school: is "university_b" in NY, or is student id 3?

      Please walk through an explanation of what the results would be for your three observations, and if those are not adequate to explore all the possibilities, add some more examples, or more variables, or both, to your data.

      Or, rather than collapse, reshape wide, and merge, which seems to me to be how you are generating a variable for each cutoff, you should consider collapse, joinby, and collapse again.
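
      As a rough illustration only, the collapse, joinby, collapse sequence might look something like the sketch below. Since the definition of a cutoff has not been given, the sketch assumes that a school's cutoff is the lowest score among the students it enrolled in a given state and year, that a student is eligible when his score is at least that cutoff, that state refers to the student, and that id uniquely identifies a student. The names cutoff, cutoff_school, eligible, and n_eligible are made up for the example.

      ```stata
      * Sketch only. Assumptions not confirmed in the thread:
      *   - cutoff = lowest score among a school's enrollees in a state-year
      *   - a student is eligible when score >= cutoff
      *   - state is the student's state; id uniquely identifies a student
      clear
      input str12 school byte id int score str3 state int year
      "university_a" 1 150 "Cal" 2000
      "university_a" 2 152 "Cal" 2000
      "university_b" 3 151 "Ny"  2000
      end
      tempfile students cutoffs
      save `students'

      * 1. collapse: one cutoff per school-state-year
      collapse (min) cutoff=score, by(school state year)
      rename school cutoff_school
      save `cutoffs'

      * 2. joinby: pair each student with every school in the same state-year
      use `students', clear
      joinby state year using `cutoffs'

      * 3. collapse again: count the schools whose cutoff the student meets
      *    (the !missing() guard matters because missing sorts above any number)
      generate byte eligible = score >= cutoff & !missing(score)
      collapse (sum) n_eligible=eligible, by(id state year)
      list, clean
      ```

      With the three observations shown, every student ends up with n_eligible equal to 1, because each state-year contains a single school. The result has one observation per student, so merging it back on id, state, and year adds a single variable rather than one per cutoff. The intermediate dataset after joinby is long (students times schools within each state-year), which may itself be large in real data.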
