"complex matching" on rows and collumms

Ariel Karlinsky

Join Date: Jun 2015
Posts: 491

"complex matching" on rows and collumms

01 Sep 2015, 05:32

I'm trying to do some sort of radius analysis.
I have data on schools and number of kids in it and the distances between the schools. an example table follows the explaination of what i'm trying to accomplish:

I want "kids_in_nearby_schools" variable to sum up the number of kids in each school which is "close" to the school specified in school_code. "close"'s defintion can vary, for now let's say it 25KM. so for example, as school 1 only has school 2 nearby, kids_in_nearby_schools should get 11 23 31 - the number of kids in each year in school 2. school 2 meanwhile has 2 schools nearby and so kids_in_nearby_schools should get 51 61 57 - the sum of the number of kids in schools 1 and 3, which are both less than 25KM away from school 2.

My data is formatted as follows:

school_code	year	kids	dist_to_1	dist_to_2	dist_to_3	kids_in_nearby_schools
1	1	34	0	20	40
1	2	42	0	20	40
1	3	21	0	20	40
2	1	11	20	0	20
2	2	23	20	0	20
2	3	31	20	0	20
3	1	17	40	20	0
3	2	19	40	20	0
3	3	36	40	20	0

The solution to this must be very very simple yet I can't seem to be able to get it right. iv'e tried various loops and such but to no avail. any help would be most welcome. I use stata 11.2.

Tags: None

Friedrich Huebler

Join Date: Apr 2014
Posts: 1053

01 Sep 2015, 08:24

Code:

forval school = 1/3 {
  forval year = 1/3 {
    sum kids if school_code != `school' & year == `year' & dist_to_`school' < 25, meanonly
    replace kids_in_nearby_schools = `r(sum)' if school_code == `school' & year == `year'
  }
}

For future queries, please consider using the dataex package from SSC to share your data.

Comment

Friedrich Huebler

Join Date: Apr 2014
Posts: 1053

01 Sep 2015, 16:11

The code below was generated with dataex.

Code:

clear
input float(school_code year kids dist_to_1 dist_to_2 dist_to_3 kids_in_nearby_schools)
1 1 34 0 20 40 . 
1 2 42 0 20 40 . 
1 3 21 0 20 40 . 
2 1 11 20 0 20 . 
2 2 23 20 0 20 . 
2 3 31 20 0 20 . 
3 1 17 40 20 0 . 
3 2 19 40 20 0 . 
3 3 36 40 20 0 . 
end

Let's run the code in post #2.

Code:

forval school = 1/3 {
  forval year = 1/3 {
    sum kids if school_code != `school' & year == `year' & dist_to_`school' < 25, meanonly
    replace kids_in_nearby_schools = `r(sum)' if school_code == `school' & year == `year'
  }
}

Here is the result.

Code:

. clist, noobs

school_~e       year       kids  dist_to_1  dist_to_2  dist_to_3  kids_in~s
        1          1         34          0         20         40         11
        1          2         42          0         20         40         23
        1          3         21          0         20         40         31
        2          1         11         20          0         20         51
        2          2         23         20          0         20         61
        2          3         31         20          0         20         57
        3          1         17         40         20          0         11
        3          2         19         40         20          0         23
        3          3         36         40         20          0         31

This was tested in Stata 11.2.

Comment

Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#4

02 Sep 2015, 01:58

once again statalist proves to be invaluable. I had a very simmilar idea with nested loops but not this. it worked and it works really well too. I can't thank you enough for this
Comment
Robert Picard

Join Date: Mar 2014

Posts: 1536
#5

20 Feb 2017, 11:06

See the following post for another approach to solving this problem.
Comment

Announcement

"complex matching" on rows and collumms

Comment

Comment

Comment

Comment