Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combining observations under specific conditions

    Dear stata-users,

    I'm trying to combine observations under some conditions. Specifically, I have different regions (ror, from 101 to 1604) that I would like to merge with neighboring regions. Therefore, I have assigned the merger regions of each individual region via the variables N101-N1604, in which a dummy variable indicates the neighboring regions. I only want to merge regions that have a population with less than 1600000 inhabitants, and I'd like to merge them with a neighboring region such that the merged region (consisting of the 2 original regions) comprises of more than 1600000 inhabitants, but preferabally as little as possible more than 1600000. If that is not possible with two regions, I'd like to merge them with another region. I am aware that I have to write a loop for this problem and I was experimenting with the collapse-order, but unfortunately I am clueless on how to implement these very specific conditions for merging the regions. If someone has an idea or some hints on how to restructure my data, that would be much appreciated.



    Code:
    * Example generated by -dataex-. To    install: ssc install dataex
    clear
    input int ror double pop float(N101    N102 N103 N104 N105 N201 N301    N302)
    101  724185 1 1 1 1 . . . .
    102  446741 1 1 . . 1 . . .
    103  415827 1 . 1 1 . . . .
    104 1007587 1 . 1 1 1 1 . .
    105  264374 1 1 . 1 1 . . .
    201 1787408 . . . 1 . 1 . .
    301 1133730 . . . . . . 1 .
    302  538523 . . . . . . . 1
    303  401367 . . . . . . . 1
    304  455150 . . . . . . . .
    305  536093 . . . . . . 1 .
    306  611429 . . . 1 1 1 . 1
    307 1421319 . . . . . . 1 1
    308  425336 . . . . . . 1 .
    309  323978 . . . . . . 1 .
    310  578607 . . . . . . . 1
    311  658348 . . . . . . . 1
    312  638509 . . . . . . . .
    313  318235 . . . . . . 1 .
    401  557464 . . . . . . . 1
    501 1260442 . . . . . . . .
    502  566757 . . . . . . . .
    503 1609654 . . . . . . . .
    504 1451762 . . . . . . . .
    505  915022 . . . . . . . .
    506 1161613 . . . . . . . .
    507 2227068 . . . . . . . .
    508 2946555 . . . . . . . .
    509  995318 . . . . . . . .
    510 2246907 . . . . . . . .
    511 1618911 . . . . . . . .
    512  448342 . . . . . . . .
    513  417165 . . . . . . . .
    601 1040091 . . . . . . . .
    602  872414 . . . . . . . .
    603  341298 . . . . . . . .
    604 2844273 . . . . . . . .
    605 1078096 . . . . . . . .
    701 1250872 . . . . . . . .
    702  865340 . . . . . . . .
    703  883304 . . . . . . . .
    704  532715 . . . . . . . .
    705  520572 . . . . . . . .
    801  622269 . . . . . . . .
    802  508759 . . . . . . . .
    803  890931 . . . . . . . .
    804  674857 . . . . . . . .
    805 1025230 . . . . . . . .
    806  692545 . . . . . . . .
    807  589905 . . . . . . . .
    808  443177 . . . . . . . .
    809  483754 . . . . . . . .
    810 2735425 . . . . . . . .
    811 1065924 . . . . . . . .
    812 1146842 . . . . . . . .
    901  480488 . . . . . . . .
    902  888810 . . . . . . . .
    903  371127 . . . . . . . .
    904  476722 . . . . . . . .
    905  656044 . . . . . . . .
    906 1324120 . . . . . . . .
    907  480025 . . . . . . . .
    908  437110 . . . . . . . .
    909  434192 . . . . . . . .
    910 2848906 . . . . . . . .
    911  463196 . . . . . . . .
    912  596162 . . . . . . . .
    913  443507 . . . . . . . .
    914  500740 . . . . . . . .
    915  710564 . . . . . . . .
    916  816506 . . . . . . . .
    917  414566 . . . . . . . .
    918  500729 . . . . . . . .
    1001  995597 . . . . . . . .
    1101 3520031 . . . . . . . .
    1201  772018 . . . . . . . .
    1202  598973 . . . . . . . .
    1203  431203 . . . . . . . .
    1204  384207 . . . . . . . .
    1205  298425 . . . . . . . .
    1301  262517 . . . . . . . .
    1302  419484 . . . . . . . .
    1303  463178 . . . . . . . .
    1304  467183 . . 1 1 . . . .
    1401 1036481 . . . . . . . .
    1402  566273 . . . . . . . .
    1403 1465612 . . . . . . . .
    1404 1016485 . . . . . . . .
    1501  201426 . . . . . . 1 .
    1502  376183 . . . . . . . .
    1503  748911 . . . . . . . .
    1504  918950 . . . . . . 1 .
    1601  673427 . . . . . . . .
    1602  369063 . . . . . . 1 .
    1603  677409 . . . . . . . .
    1604  450815 . . . . . . . .
    end


  • #2
    It is not clear to me what the end result you are looking for but here is a guess using just first five observations:

    Code:
    tempfile tmp1 tmp2
    
    keep if ror < 200
    drop N201-N302
    save `tmp2'
    
    preserve
    keep ror pop
    rename ror _j
    rename pop pop2
    save `tmp1' , replace
    restore
    
    reshape long N, i(ror)
    merge m:1 _j using `tmp1'
    replace N = . if pop == pop2
    sort ror _j
    by ror:gen neighbor_pop = sum(pop2) if N ==1
    replace nei = . if nei < 1600000
    keep if ne !=.
    keep ror neig
    
    merge 1:1  ror using `tmp2'
    drop _m
    sort ror
    l

    Comment


    • #3
      Or if with group

      gen POP=1 if pop>1600000
      egen ge = group(POP N101-N302), miss label
      tab ge


      group(POP N101 |
      N102 N103 N104 |
      N105 N201 N301 |
      N302) | Freq. Percent Cum.
      ------------------+-----------------------------------
      1 . . . 1 . 1 . . | 1 1.04 1.04
      1 . . . . . . . . | 9 9.38 10.42
      . 1 1 1 1 . . . . | 1 1.04 11.46
      . 1 1 . 1 1 . . . | 1 1.04 12.50
      . 1 1 . . 1 . . . | 1 1.04 13.54
      . 1 . 1 1 1 1 . . | 1 1.04 14.58
      . 1 . 1 1 . . . . | 1 1.04 15.62
      . . . 1 1 . . . . | 1 1.04 16.67
      . . . . 1 1 1 . 1 | 1 1.04 17.71
      . . . . . . . 1 1 | 1 1.04 18.75
      . . . . . . . 1 . | 8 8.33 27.08
      . . . . . . . . 1 | 5 5.21 32.29
      . . . . . . . . . | 65 67.71 100.00
      ------------------+-----------------------------------
      Total | 96 100.00

      Comment


      • #4
        Thank you both! That is very helpful.
        I will try your ideas and eventually get back with further questions.

        Comment

        Working...
        X