Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Picking up a substring under multiple condition

    Hi all,

    my data look as follows:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(docdb_family_id cited_docdb_family_id) int distance long id_mas str71 ctry_cited_docdb float id_us str71 ctry_docdb
     569328 27096884 9999 17874051 "['US']"                                                                    789757 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 26261826 9999 16042396 "['DK' 'GB' 'NL' 'ES' 'BR' 'NZ' 'SE' 'NO' 'FR' 'IT' 'CA' 'DE' 'JP' 'AU']"   701604 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 27079105 9999 17816032 "['US' 'CA' 'GB' 'FR' 'JP' 'DE']"                                           786694 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328  6031690 9999   956513 "['DE' 'ES' 'GR' 'EP' 'BR' 'JP' 'US' 'DK']"                                  30095 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328  6043910 9999   968713 "['DE' 'AT' 'EP' 'JP']"                                                      30793 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 26405611 9999 16254249 "['US']"                                                                    711842 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 22591458 9999  8534858 "['US']"                                                                    385586 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 10455451 9999  2959310 "['DK' 'DE' 'CA' 'US' 'JP' 'EP' 'IE' 'GB']"                                 142607 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 24627241 9999 13029704 "['GB' 'JP' 'DE' 'CA' 'FR' 'US']"                                           561184 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 13428248 9999  4054801 "['US' 'DE' 'JP']"                                                          196683 "['AT' 'DE' 'WO' 'US' 'EP']"          
     569328 24750476 9999 13316532 "['NL' 'AU' 'ES' 'IT' 'DE' 'US' 'FR' 'CA' 'JP' 'GB']"                       573194 "['AT' 'DE' 'WO' 'US' 'EP']"          
     574660  5956195 9999   890245 "['SE' 'FR' 'NL' 'DE' 'IT' 'CH' 'JP' 'GB' 'US']"                             26518 "['JP' 'AU' 'AT' 'DE' 'WO' 'US' 'EP']"
     574660 25115774 9999 14190395 "['US']"                                                                    610085 "['JP' 'AU' 'AT' 'DE' 'WO' 'US' 'EP']"
    1187498 11183886 9999  3334542 `""['IS' 'DK' 'CA' 'NL' 'SE' 'AR' 'FI' 'GB' 'IT' 'NO' 'BE' 'DE' 'US' 'FR'"' 162155 "['DE' 'RU' 'US' 'WO']"               
    1226468  6279285    2  1193368 "['DE' 'EP' 'JP' 'SU' 'CN' 'US']"                                            44874 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1226468 22329817    2  8018341 "['US' 'CA' 'JP' 'AU']"                                                     365289 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1226468 10325811    2  2900472 "['GB' 'JP' 'US' 'CA' 'FR' 'DE' 'AU' 'NL' 'ZA' 'SE']"                       140683 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1226468 14970234    2  4589652 "['US' 'JP']"                                                               220774 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1226468  6981930    2  1582120 "['AT' 'GB' 'SE' 'DE' 'ES' 'US']"                                            67994 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1226468 23907898    2 11420161 "['US']"                                                                    497081 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1226468 25283772    2 14708888 "['US']"                                                                    633949 "['DE' 'US' 'ES' 'EP' 'WO']"          
    1236571 27009992 9999 17593493 "['US']"                                                                    775076 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571 22646501 9999  8662442 "['US']"                                                                    390621 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571  5944733 9999   881063 "['DE' 'AR' 'JP' 'SE' 'NL' 'US' 'FR' 'IT' 'CA' 'GB']"                        26063 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571 25492752 9999 15505603 "['US']"                                                                    673175 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571 24259740 9999 12200405 "['AU' 'DE' 'CA' 'ES' 'JP' 'US' 'EP' 'AT']"                                 527594 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571 24605606 9999 12979630 "['US']"                                                                    559128 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571 11052747 9999  3290685 "['US']"                                                                    159465 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1236571 24839432 9999 13519604 "['DE' 'US' 'JP' 'FR' 'GB']"                                                581604 "['WO' 'DE' 'CA' 'US' 'EP']"          
    1239098 24397163 9999 12507358 "['US' 'GB' 'DE']"                                                          539939 "['WO' 'CA' 'DE' 'EP' 'US']"          
    1239098 27509152 9999 18674324 "['FR' 'IT' 'JP' 'SE' 'US' 'DE' 'NO']"                                      828605 "['WO' 'CA' 'DE' 'EP' 'US']"          
    1239098 24916732 9999 13693266 "['US']"                                                                    589110 "['WO' 'CA' 'DE' 'EP' 'US']"          
    1239098 25296619 9999 14749928 "['US']"                                                                    635837 "['WO' 'CA' 'DE' 'EP' 'US']"          
    1239098 23845079 9999 11287876 "['US' 'GB' 'DE']"                                                          491976 "['WO' 'CA' 'DE' 'EP' 'US']"          
    1239098 27054560 9999 17734538 "['US']"                                                                    782450 "['WO' 'CA' 'DE' 'EP' 'US']"          
    1239277 22418792 9999  8166145 "['CA' 'EP' 'US' 'WO']"                                                     371172 "['WO' 'DE' 'US' 'EP']"               
    1239277 22779403 9999  8971524 "['US']"                                                                    402759 "['WO' 'DE' 'US' 'EP']"               
    1239277 12578497 9999  3780887 "['JP' 'DE' 'US']"                                                          184413 "['WO' 'DE' 'US' 'EP']"               
    1239277 21994155 9999  7456852 "['US']"                                                                    342759 "['WO' 'DE' 'US' 'EP']"               
    1239277 24130698 9999 11911180 "['US']"                                                                    516202 "['WO' 'DE' 'US' 'EP']"               
    1239277 22694825 9999  8774378 "['US']"                                                                    395088 "['WO' 'DE' 'US' 'EP']"               
    1239277 24790872 9999 13409362 "['US']"                                                                    577039 "['WO' 'DE' 'US' 'EP']"               
    1239277 23081767 9999  9657651 "['US']"                                                                    429637 "['WO' 'DE' 'US' 'EP']"               
    1239277 24267003 9999 12216780 "['WO']"                                                                    528262 "['WO' 'DE' 'US' 'EP']"               
    1239277 23379411 9999 10306837 "['US' 'AU' 'FR' 'IT' 'DE' 'GB']"                                           454456 "['WO' 'DE' 'US' 'EP']"               
    1239277 23309829 9999 10154490 "['CA' 'EP' 'JP' 'US' 'AU']"                                                448715 "['WO' 'DE' 'US' 'EP']"               
    1239277 25343500 9999 14906554 "['US']"                                                                    643063 "['WO' 'DE' 'US' 'EP']"               
    1239277 27006578 9999 17583058 "['US']"                                                                    774526 "['WO' 'DE' 'US' 'EP']"               
    1239277 24841797 9999 13525191 "['US']"                                                                    581824 "['WO' 'DE' 'US' 'EP']"               
    1239277 24236071 9999 12146240 "['WO']"                                                                    525500 "['WO' 'DE' 'US' 'EP']"               
    1239483 24749080 9999 13313449 "['BE' 'US']"                                                               573051 "['DE' 'EP' 'US' 'WO']"               
    1239483 21722194 9999  6945853 "['GB' 'BE' 'DE' 'US']"                                                     321274 "['DE' 'EP' 'US' 'WO']"               
    1239483 26884023 9999 17236371 "['US']"                                                                    757573 "['DE' 'EP' 'US' 'WO']"               
    1239483 26713455 9999 16780090 "['US']"                                                                    735973 "['DE' 'EP' 'US' 'WO']"               
    1239483 27123571 9999 17966713 "['JP' 'EP' 'US' 'AU' 'WO']"                                                794497 "['DE' 'EP' 'US' 'WO']"               
    1239622 23074051 9999  9640166 "['US']"                                                                    428986 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 24951262 9999 13771221 "['US']"                                                                    592456 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 25351735 9999 14937805 "['US']"                                                                    644652 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 24669345 9999 13127997 "['US']"                                                                    565210 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622  3770167 9999   418506 "['WO' 'US' 'NZ' 'SE' 'CA' 'FR' 'GB' 'FI']"                                   4869 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 25109266 9999 14171612 "['US']"                                                                    609301 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 21805004 9999  7100269 "['US']"                                                                    327953 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 26158421 9999 15975645 "['DE' 'WO' 'EP' 'FI' 'US' 'AT']"                                           698263 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622  9249538 9999  2450144 "['FR' 'AT' 'EP' 'US' 'DE' 'CA']"                                           116844 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239622 23664858 9999 10908824 "['US']"                                                                    477473 "['DE' 'EP' 'WO' 'FI' 'US']"          
    1239624 27108903 9999 17913210 "['US']"                                                                    791814 "['EP' 'US' 'DE' 'WO']"               
    1239624 24904226 9999 13665639 "['US']"                                                                    587863 "['EP' 'US' 'DE' 'WO']"               
    1239749  6270984 9999  1184851 "['AU' 'ZA' 'DE' 'ES' 'EP' 'JP' 'US']"                                       44295 "['EP' 'JP' 'US' 'DE' 'WO']"          
    1239749  6245732 9999  1163512 "['US' 'EP' 'DE' 'ZA']"                                                      42887 "['EP' 'JP' 'US' 'DE' 'WO']"          
    1239749  6313094 9999  1232872 "['ZA' 'ES' 'DE' 'EP' 'US']"                                                 47361 "['EP' 'JP' 'US' 'DE' 'WO']"          
    1239749  6357314 9999  1287259 "['DE' 'CA' 'EP' 'US' 'JP']"                                                 50820 "['EP' 'JP' 'US' 'DE' 'WO']"          
    1239749  6323632 9999  1244658 "['US' 'ZA' 'EP' 'DE' 'CA' 'JP']"                                            48117 "['EP' 'JP' 'US' 'DE' 'WO']"          
    1334477  6032897 9999   957759 "['BR' 'US' 'DE' 'JP']"                                                      30159 "['DE']"                              
    1340405 24636228 9999 13050492 "['US']"                                                                    562044 "['US' 'EP' 'WO']"                    
    1340405 22536768 9999  8413302 "['US']"                                                                    380876 "['US' 'EP' 'WO']"                    
    1340405 23111794 9999  9724685 "['BE' 'US']"                                                               432299 "['US' 'EP' 'WO']"                    
    1340405 24148953 9999 11951976 "['US' 'CA']"                                                               517795 "['US' 'EP' 'WO']"                    
    1340405 24483064 9999 12699020 "['US' 'GB']"                                                               547732 "['US' 'EP' 'WO']"                    
    1340418 23758517 9999 11106972 "['US']"                                                                    485143 "['EP' 'WO' 'US' 'GB']"               
    1340418 22468583 9999  8265030 "['DE' 'AU' 'US' 'FR']"                                                     375134 "['EP' 'WO' 'US' 'GB']"               
    1340418  9164095 9999  2396334 "['ES' 'BE' 'FR']"                                                          112895 "['EP' 'WO' 'US' 'GB']"               
    1340462 27088150 9999 17845450 "['US']"                                                                    788287 "['WO' 'US' 'EP']"                    
    1340462 26223504 9999 15999583 "['US' 'EP' 'DE']"                                                          699541 "['WO' 'US' 'EP']"                    
    1340462 25476838 9999 15439464 "['US']"                                                                    669851 "['WO' 'US' 'EP']"                    
    1340462 26685774 9999 16699630 "['US']"                                                                    732450 "['WO' 'US' 'EP']"                    
    1340462 24282964 9999 12252401 "['US']"                                                                    529694 "['WO' 'US' 'EP']"                    
    1340471  3770297 9999   418792 "['US' 'NZ' 'WO' 'EP' 'DK' 'FI' 'CA']"                                        4881 "['US' 'WO' 'EP']"                    
    1340471 23100861 9999  9699748 "['US']"                                                                    431309 "['US' 'WO' 'EP']"                    
    1340471 25448672 9999 15320525 "['AT' 'EP' 'CA' 'US' 'DE']"                                                663848 "['US' 'WO' 'EP']"                    
    1340471 24060758 9999 11757343 "['CA' 'US']"                                                               510086 "['US' 'WO' 'EP']"                    
    1340485 25401922 9999 15126186 "['US']"                                                                    654061 "['EP' 'JP' 'AU' 'US' 'CA' 'WO']"     
    1340485 26858065 9999 17168369 "['US']"                                                                    754346 "['EP' 'JP' 'AU' 'US' 'CA' 'WO']"     
    1340485 21808322 9999  7106943 "['US']"                                                                    328216 "['EP' 'JP' 'AU' 'US' 'CA' 'WO']"     
    1340488 24505095 9999 12747509 "['EP' 'JP' 'CA' 'DE' 'US']"                                                549698 "['WO' 'CA' 'EP' 'US']"               
    1340488 23070753 9999  9632527 "['ES' 'KR' 'JP' 'EP' 'PT' 'AU' 'US' 'DE' 'GR' 'AT' 'CA' 'MX']"             428689 "['WO' 'CA' 'EP' 'US']"               
    1340488 24210016 9999 12088517 "['US' 'JP']"                                                               523192 "['WO' 'CA' 'EP' 'US']"               
    1340488 14043576 9999  4262230 "['JP' 'AT' 'CA' 'DE' 'AU' 'EP' 'YU' 'US']"                                 206064 "['WO' 'CA' 'EP' 'US']"               
    1340488 26335550 9999 16130561 "['US' 'GB']"                                                               705865 "['WO' 'CA' 'EP' 'US']"               
    1340488 22013480 9999  7488780 "['US']"                                                                    344045 "['WO' 'CA' 'EP' 'US']"               
    1340488 23360636 9999 10266586 "['CA' 'US']"                                                               452987 "['WO' 'CA' 'EP' 'US']"               
    end
    I would like to create a dummy taking a value of 1 whenever in ctry_cited_docdb and in ctry_docdb appear the word "CN". How can I do that?

  • #2
    Code:
    gen wanted = strpos(ctry_docdb, "CN") > 0

    Comment


    • #3
      Nick Cox doesn't this create a dummy where "CN" appears only in ctry_docdb? I would like instead a dummy where "CN" appears in both ctry_cited_docdb and ctry_docdb

      Comment


      • #4
        In that case:
        Code:
        gen byte wanted = strpos(ctry_cited_docdb, "CN") > 0 & strpos(ctry_docdb, "CN") > 0

        Comment


        • #5
          Sorry, I was just showing basic technique and presuming that extensions are immediately obvious.

          Hemanshu Kumar's helpful code raises one nuance and one key point.

          The nuance is that

          Code:
          strpos(ctry_cited_docdb, "CN") & strpos(ctry_docdb, "CN")
          -- omitting >0 -- is exactly equivalent as non-zero in Stata counts as true.

          The key point is for people reading this to be sure what you mean. People often use AND mentally and verbally when they want OR in a software solution. That is, the question looks like

          observations in which (CN is in one variable AND CN is in the other variable)

          which is not at all the same as

          (observations in which CN is in one variable) AND (observations in which CN is in the other variable).

          It is the difference between the intersection of two sets and their union. If anyone really wanted the second, they need | as an operator, not &.
          Last edited by Nick Cox; 05 Nov 2022, 07:24.

          Comment


          • #6
            Thanks a lot for the very useful replies

            Comment

            Working...
            X