  • Filling in values for neighboring districts

    Hi all,

    I am trying to create a variable that reports the average level of an indicator (X) across a district's neighboring districts. I have my master data set, which includes an id (uniqueid) for each district and the corresponding X value. Despite the variable name, the ids are not unique. I have a second data set that has the same id (also not unique), another classifier for the district (objectid), and a list of neighboring districts based on this second classifier. Examples of the two data sets are below (the first is the master and the second is the neighbors' information).
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float uniqueid double corr17
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1001                 0
    1003                 .
    1003                 .
    1003                 .
    1003                 .
    1003                 .
    1003                 .
    1005                 .
    1005                 .
    1005                 .
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    2001 .9759855586256593
    end


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int objectid str5 uniqueid int nbr_objectid
     1 "1002"    8
     1 "1002"    9
     1 "1002" 1945
     1 "1002"    6
     1 "1002"  509
     1 "1002" 1939
     1 "1002"    2
     1 "1002"    7
     2 "1011"    8
     2 "1011"    7
     2 "1011"    1
     2 "1011"    5
     3 "1007"    6
     3 "1007" 1947
     3 "1007"   11
     3 "1007"    4
     3 "1007" 1929
     3 "1007" 1896
     3 "1007"    8
     4 "1008"   10
     4 "1008"    3
     4 "1008"    8
     4 "1008" 1942
     4 "1008"    5
     4 "1008" 1947
     5 "1005"   10
     5 "1005"    4
     5 "1005"    7
     5 "1005"    8
     5 "1005"    2
     6 "1009"    3
     6 "1009"    8
     6 "1009" 1939
     6 "1009" 1929
     6 "1009"    1
     7 "1001"   10
     7 "1001"    1
     7 "1001"  558
     7 "1001"    2
     7 "1001"  588
     7 "1001"    9
     7 "1001"    5
     7 "1001"  531
     7 "1001"  524
     8 "1006"    1
     8 "1006"    2
     8 "1006"    3
     8 "1006"    5
     8 "1006"    4
     8 "1006"    6
     9 "1010"  558
     9 "1010"    1
     9 "1010"    7
     9 "1010"  509
    10 "1003" 1921
    10 "1003"    7
    10 "1003"    5
    10 "1003" 1899
    10 "1003" 1942
    10 "1003"    4
    10 "1003"  524
    11 "1004" 1896
    11 "1004" 1929
    11 "1004"    3
    12 "7023"   70
    12 "7023"   72
    12 "7023"   27
    12 "7023"   42
    12 "7023"  110
    12 "7023"   80
    12 "7023"   78
    12 "7023"   64
    13 "7087"   82
    13 "7087"  118
    14 "7031"   31
    14 "7031"   85
    14 "7031"   68
    14 "7031"   88
    14 "7031"   55
    14 "7031"  125
    14 "7031"  121
    14 "7031"   62
    15 "7097"   21
    15 "7097"   40
    15 "7097"  780
    15 "7097"   86
    15 "7097"   98
    16 "7004"   55
    16 "7004"   58
    16 "7004"  122
    16 "7004"   87
    17 "7090"  107
    17 "7090"  117
    17 "7090"  113
    17 "7090"   28
    18 "7050" 1676
    18 "7050"   31
    19 "7099"   97
    19 "7099"  115
    19 "7099"  119
    end
    The end goal is to create a variable that reports, for each uniqueid, the average value of X across its neighbors. Is there a way to replace each neighboring objectid with that district's uniqueid (so that I can eliminate the objectid identifier), and then fill in each neighbor's X value from the first file using its uniqueid?

    Thank you.

  • #2
    I worked on a solution for a while, but I was stymied by some difficulties I had with your example data and description.

    1) You don't include a variable named X, but it's apparently central to your example. Can you explain?
    2) Your neighbor data set includes almost no observations whose uniqueid values match what you call your "master" data. Is that what you intend? (Note also, as a small point, that you make uniqueid numeric in one data set and string in the other. This can be solved, of course.)
    3) Did you intentionally include multiple *identical* entries in your "master" data?
    4) What you said might imply that each district has the same X value, but perhaps that was not what you intended. Can you clarify?
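
    Pending answers to these questions, here is a rough sketch of one way the calculation might go. It is not a tested solution for your data. It assumes that X is corr17, that each objectid maps to exactly one uniqueid, and that X is constant within uniqueid, so the master data can be reduced to one row per district. The toy data are made up purely for illustration, and the tempfile names (xvals, nbrs, xwalk, nbrx, nbrmeans) and the new variable names (nbr_uniqueid, nbr_corr17, nbr_mean_corr17) are my own inventions.
    Code:
    * Toy master data (made up, not the poster's data); X is assumed to be corr17
    clear
    input long uniqueid double corr17
    1001 0
    1002 .5
    1003 1
    end
    tempfile xvals
    save `xvals'

    * Toy neighbor data (made up); uniqueid is string here, as in the original example
    clear
    input int objectid str5 uniqueid int nbr_objectid
    1 "1001" 2
    1 "1001" 3
    2 "1002" 1
    3 "1003" 1
    3 "1003" 2
    end
    destring uniqueid, replace      // make the id numeric so it matches the master data
    tempfile nbrs
    save `nbrs'

    * Crosswalk from objectid to uniqueid, renamed so it can be matched on nbr_objectid
    keep objectid uniqueid
    duplicates drop
    rename (objectid uniqueid) (nbr_objectid nbr_uniqueid)
    isid nbr_objectid               // stops with an error if an objectid maps to several uniqueids
    tempfile xwalk
    save `xwalk'

    * One X value per district, keyed by the neighbor's uniqueid
    use `xvals', clear
    duplicates drop
    rename (uniqueid corr17) (nbr_uniqueid nbr_corr17)
    isid nbr_uniqueid               // stops with an error if X varies within uniqueid
    tempfile nbrx
    save `nbrx'

    * Attach each neighbor's uniqueid, then its X value, then average by district
    use `nbrs', clear
    merge m:1 nbr_objectid using `xwalk', keep(master match) nogenerate
    merge m:1 nbr_uniqueid using `nbrx', keep(master match) nogenerate
    collapse (mean) nbr_mean_corr17 = nbr_corr17, by(uniqueid)
    tempfile nbrmeans
    save `nbrmeans'

    * Bring the neighbor average back to the master data (m:1 allows repeated ids there)
    use `xvals', clear
    merge m:1 uniqueid using `nbrmeans', keep(master match) nogenerate
    list

    Neighbors with no match in the crosswalk or in the master data end up with a missing nbr_corr17, and collapse leaves missing values out of the mean, so a district's average is based only on the neighbors that could be matched. Whether that is acceptable depends on your answer to question 2.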
