Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • "complex matching" on rows and collumms

    I'm trying to do some sort of radius analysis.
    I have data on schools and number of kids in it and the distances between the schools. an example table follows the explaination of what i'm trying to accomplish:

    I want "kids_in_nearby_schools" variable to sum up the number of kids in each school which is "close" to the school specified in school_code. "close"'s defintion can vary, for now let's say it 25KM. so for example, as school 1 only has school 2 nearby, kids_in_nearby_schools should get 11 23 31 - the number of kids in each year in school 2. school 2 meanwhile has 2 schools nearby and so kids_in_nearby_schools should get 51 61 57 - the sum of the number of kids in schools 1 and 3, which are both less than 25KM away from school 2.


    My data is formatted as follows:
    school_code year kids dist_to_1 dist_to_2 dist_to_3 kids_in_nearby_schools
    1 1 34 0 20 40
    1 2 42 0 20 40
    1 3 21 0 20 40
    2 1 11 20 0 20
    2 2 23 20 0 20
    2 3 31 20 0 20
    3 1 17 40 20 0
    3 2 19 40 20 0
    3 3 36 40 20 0


    The solution to this must be very very simple yet I can't seem to be able to get it right. iv'e tried various loops and such but to no avail. any help would be most welcome. I use stata 11.2.

  • #2
    Code:
    forval school = 1/3 {
      forval year = 1/3 {
        sum kids if school_code != `school' & year == `year' & dist_to_`school' < 25, meanonly
        replace kids_in_nearby_schools = `r(sum)' if school_code == `school' & year == `year'
      }
    }
    For future queries, please consider using the dataex package from SSC to share your data.

    Comment


    • #3
      The code below was generated with dataex.
      Code:
      clear
      input float(school_code year kids dist_to_1 dist_to_2 dist_to_3 kids_in_nearby_schools)
      1 1 34 0 20 40 . 
      1 2 42 0 20 40 . 
      1 3 21 0 20 40 . 
      2 1 11 20 0 20 . 
      2 2 23 20 0 20 . 
      2 3 31 20 0 20 . 
      3 1 17 40 20 0 . 
      3 2 19 40 20 0 . 
      3 3 36 40 20 0 . 
      end
      Let's run the code in post #2.
      Code:
      forval school = 1/3 {
        forval year = 1/3 {
          sum kids if school_code != `school' & year == `year' & dist_to_`school' < 25, meanonly
          replace kids_in_nearby_schools = `r(sum)' if school_code == `school' & year == `year'
        }
      }
      Here is the result.
      Code:
      . clist, noobs
      
      school_~e       year       kids  dist_to_1  dist_to_2  dist_to_3  kids_in~s
              1          1         34          0         20         40         11
              1          2         42          0         20         40         23
              1          3         21          0         20         40         31
              2          1         11         20          0         20         51
              2          2         23         20          0         20         61
              2          3         31         20          0         20         57
              3          1         17         40         20          0         11
              3          2         19         40         20          0         23
              3          3         36         40         20          0         31
      This was tested in Stata 11.2.

      Comment


      • #4
        once again statalist proves to be invaluable. I had a very simmilar idea with nested loops but not this. it worked and it works really well too. I can't thank you enough for this

        Comment


        • #5
          See the following post for another approach to solving this problem.

          Comment

          Working...
          X