  • #16
    Thanks. In the master dataset, we just need the firm names, so run:

    Code:
    contract firm_name
    sort firm_name
    dataex
    Do the same for the using dataset. For the sample, we can restrict it to firms whose names start with a letter.

    Code:
    contract firm_name alliance_count
    sort firm_name
    dataex if regexm(firm_name, "^[a-zA-Z]")
    Then post the results from the above.
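
    In case it helps to see what the two commands above do before running them on the full data, here is a minimal sketch on toy data. The firm names are borrowed from this thread, but the counts are made up purely for illustration.

    ```stata
    * Toy data: names appear in this thread, counts are invented for illustration
    clear
    input str20 firm_name float alliance_count
    "AAR Corp"   6
    "AAR Corp"   6
    "ADT Inc"    1
    "8x8 Inc."   2
    end

    * contract keeps one row per distinct (firm_name, alliance_count) pair
    * and records how many original rows were collapsed in _freq
    contract firm_name alliance_count
    sort firm_name
    list, noobs

    * regexm() returns 1 when the name starts with a letter,
    * so "8x8 Inc." is left out of this listing
    list if regexm(firm_name, "^[a-zA-Z]"), noobs
    ```

    The duplicated "AAR Corp" row collapses to a single row with _freq equal to 2, which is why the using dataset carries a _freq variable after contract.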



    • #17
      Thanks Andrew


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str46 firm_name
      ""                                
      "1-800-FLOWERS.COM"               
      "1st Source Corporation"          
      "2U, Inc."                        
      "3D Systems Corp."                
      "3M Corporation"                  
      "6D Global Technologies Inc"      
      "8x8 Inc."                        
      "A. O. Smith"                     
      "A.Schulman, Inc"                 
      "A10 Networks Inc"                
      "AAON Inc."                       
      "AAR Corporation"                 
      "ABM Industries"                  
      "ACI Worldwide Inc"               
      "ADT Inc."                        
      "ADTRAN Inc."                     
      "AECOM Technology Corporation"    
      "AEP Industries Inc."             
      "AES Corp"                        
      "AFLAC Inc."                      
      "AG Mortgage Investment Trust Inc"
      "AGCO Corporation"                
      "AGL Resources Inc."              
      "AGNC Investment Corp."           
      "AK Steel Holding Corporation"    
      "ALLETE Inc"                      
      "AMAG Pharmaceuticals, Inc."      
      "AMC Entertainment Holdings Inc"  
      "AMC Networks"                    
      "AMERCO"                          
      "AMN Healthcare Services Inc."    
      "AMR Corporation"                 
      "ANI Pharmaceuticals Inc"         
      "ANSYS Inc."                      
      "APA Corporation"                 
      "ARC Document Solutions Inc"      
      "ARIAD Pharmaceuticals, Inc."     
      "ARMOUR Residential REIT, Inc."   
      "ARRIS Group, Inc."               
      "AT&T"                            
      "AT&T Inc."                       
      "AV Homes Inc"                    
      "AVG Technologies NV"             
      "AVX Corporation"                 
      "AXA Equitable Holdings"          
      "AXIS Capital"                    
      "AZZ Incorporated"                
      "Aaron's Inc."                    
      "Abaxis Inc."                     
      "AbbVie"                          
      "Abbott Laboratories"             
      "Abeona Therapeutics Inc"         
      "Abercrombie & Fitch"             
      "Abiomed, Inc."                   
      "Abraxas Petroleum Corp."         
      "Acacia Research Corporation"     
      "Acadia Healthcare Co."           
      "Acadia Realty Trust"             
      "Accelerate Diagnostics Inc"      
      "Acceleron Pharma Inc."           
      "Accenture plc"                   
      "Access National Corp"            
      "Accuray Incorporated"            
      "Accuride Corporation"            
      "Aceto Corp."                     
      "Achillion Pharmaceuticals Inc."  
      "Acorda Therapeutics, Inc."       
      "Activision Blizzard, Inc."       
      "Actuant Corporation"             
      "Acuity Brands, Inc."             
      "Acxiom Corporation"              
      "Adamas Pharmaceuticals Inc"      
      "Adams Resources & Energy Inc."   
      "Addus Homecare Corporation"      
      "Adeptus Health Inc"              
      "Adient"                          
      "Adobe Systems Inc."              
      "Aduro BioTech Inc"               
      "Advance Auto Parts Inc."         
      "Advanced Drainage Systems Inc"   
      "Advanced Micro Devices, Inc."    
      "Advaxis Inc."                    
      "Advent Software, Inc."           
      "Adverum Biotechnologies, Inc."   
      "Advisory Board Company"          
      "Aegerion Pharmaceuticals, Inc."  
      "Aegion Corp"                     
      "Aerie Pharmaceuticals Inc"       
      "AeroVironment, Inc."             
      "Aerohive Networks Inc"           
      "Aerojet Rocketdyne Holdings Inc."
      "Aetna Inc."                      
      "Affiliated Managers Group Inc."  
      "Affimed NV"                      
      "Affymetrix Inc."                 
      "Agenus Inc."                     
      "Agile Therapeutics Inc"          
      "Agilent Technologies Inc."       
      "Agilysys Inc."                   
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 100 out of 1685 observations
      Use the count() option to list more
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str30 firm_name float alliance_count byte _freq
      "A La Mode Inc"                  3 1
      "A PC Geek"                      1 1
      "A Say Inc"                      1 1
      "A World Beneath"                1 1
      "A&A Dukaan Finl Svcs Pvt Ltd"   1 1
      "A&E Engineering Inc"            1 1
      "A&E Television Networks LLC"    4 1
      "A&L Canada Laboratories Inc"    2 1
      "A+B Expedio"                    1 1
      "A-Insinoorit Oy"                1 1
      "A.U.L. Corp"                    1 1
      "A1 Investments & Resources Ltd" 1 1
      "A10 Networks Inc"               1 1
      "A123 Systems Inc"               1 1
      "A15"                            1 1
      "A2iA Corp"                      1 1
      "A360 Media LLC"                 1 1
      "A3logics India Ltd"             1 1
      "A4 Bar Cattle Co Ltd"           1 1
      "A4 Media"                       1 1
      "A8 Digital Music Holdings Ltd"  1 1
      "AA PLC"                         3 1
      "AAA Taranis Visual Ltd"         1 1
      "AAI Unmanned Aircraft Systems"  1 1
      "AAR"                            1 1
      "AAR Corp"                       6 1
      "AB Bank Ltd"                    1 1
      "AB Group"                       1 1
      "ABB India Ltd"                  1 1
      "ABB Ltd"                        3 1
      "ABBYY Solutions Ltd"            1 1
      "ABBYY USA Software House Inc"   5 1
      "ABC Bearings Ltd"               1 1
      "ABC Consultants Pvt Ltd"        1 1
      "ABC Data SA"                    1 1
      "ABC Financial Services Inc"     3 1
      "ABC News"                       1 1
      "ABC Services Inc"               1 1
      "ABC Technologies Inc"           1 1
      "ABM Industries Inc"             3 1
      "ABN AMRO Bank NV"               1 1
      "AC Industrial Tech Hldg Inc"    1 1
      "AC Lordi Consulting LLC"        1 1
      "ACAE"                           1 1
      "ACCEPT EXPRESS PLC"             1 1
      "ACCESS China Inc"               1 1
      "ACD Systems International Inc"  1 1
      "ACE Assoc Compiler Experts BV"  1 1
      "ACE Group of Insurance & Reins" 1 1
      "ACG Group Pvt Ltd"              1 1
      "ACH Alert LLC"                  2 1
      "ACH Direct Inc"                 1 1
      "ACI Worldwide Inc"              3 1
      "ACL Services Ltd"               1 1
      "ACNE AB"                        1 1
      "ACNielsen Inc"                  2 1
      "ACOM Solutions Inc"             1 1
      "ACUSIM Software"                1 1
      "AD Capital US Inc"              1 1
      "AD/MAX"                         1 1
      "ADAC Verlag GmbH"               1 1
      "ADAGRI"                         1 1
      "ADAM Software BV"               1 1
      "ADAM Software Inc"              1 1
      "ADCORE Inc"                     1 1
      "ADFITECH Inc"                   1 1
      "ADG Global Supply Ltd"          1 1
      "ADM Technologies Pvt Ltd"       2 1
      "ADNOC Distribution"             2 1
      "ADP"                            1 1
      "ADPG Media Group Co Ltd"        1 1
      "ADT Inc"                        1 1
      "ADTRAN Inc"                     2 1
      "ADVA AG Optical Networking"     1 1
      "ADVIZOR Solutions Inc"          1 1
      "ADVO Inc"                       1 1
      "ADY"                            1 1
      "ADmore Inc"                     1 1
      "AEC Industrial Solutions Ltd"   1 1
      "AEG"                            4 1
      "AEG Live LLC"                   2 1
      "AEK Energie AG"                 1 1
      "AEON Fantasy Co Ltd"            1 1
      "AEP Networks"                   1 1
      "AEV Technologies Inc"           2 1
      "AEye Inc"                       1 1
      "AFAI Sthrn Shipyard Ltd"        1 1
      "AFGRI Ltd-Agricultural Retail"  1 1
      "AFLG Invests Pte Eq"            1 1
      "AG Global"                      1 1
      "AGA Resources Inc"              1 1
      "AGCO Corp"                      3 1
      "AGEIA Technologies Inc"         2 1
      "AGN Networks Inc"               1 1
      "AGS LLC"                        1 1
      "AGS Transact Technologies Ltd"  1 1
      "AGT Group GmbH"                 1 1
      "AGT International Inc"          1 1
      "AGTech Holdings Ltd"            1 1
      "AGTech Media Holdings Limited"  1 1
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 100 out of 18582 observations
      Use the count() option to list more





      • #18
        You can apply the same code as in #7, with one caveat: be very conservative when choosing the cutoff and pick a high value; otherwise you end up with incorrect matches. I will illustrate this in a second post, as this post will already be very long once all the data are appended.
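
        Before settling on a cutoff, one way to get a feel for it is to look at the scores of a few candidate pairs directly. The sketch below is only an illustration: the pairs are hand-picked from names in this thread, the cutoffs 0.5, 0.7 and 0.9 are arbitrary, and it assumes the community-contributed matchit command is installed (ssc install matchit).

        ```stata
        * Assumes matchit is installed: ssc install matchit
        clear
        input str30 firm_name str30 firmname2
        "ABM Industries"   "ABM Industries Inc"
        "AAR Corporation"  "AAR Corp"
        "AGCO Corporation" "AGCO Corp"
        "AMC Networks"     "AGN Networks Inc"
        end

        * Similarity score for each row's pair of names
        matchit firm_name firmname2, g(score)

        * Highest scores first, to see which pairs sit near a candidate cutoff
        gsort -score
        list firm_name firmname2 score, noobs

        * How many pairs survive at each illustrative cutoff
        foreach c in 0.5 0.7 0.9 {
            count if score > `c'
        }
        ```

        Listing the pairs just below and just above the cutoff shows which true matches are lost and which false ones slip through, so the threshold can be chosen on that basis.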

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str30 firm_name float alliance_count byte _freq
        "A La Mode Inc"                  3 1
        "A PC Geek"                      1 1
        "A Say Inc"                      1 1
        "A World Beneath"                1 1
        "A&A Dukaan Finl Svcs Pvt Ltd"   1 1
        "A&E Engineering Inc"            1 1
        "A&E Television Networks LLC"    4 1
        "A&L Canada Laboratories Inc"    2 1
        "A+B Expedio"                    1 1
        "A-Insinoorit Oy"                1 1
        "A.U.L. Corp"                    1 1
        "A1 Investments & Resources Ltd" 1 1
        "A10 Networks Inc"               1 1
        "A123 Systems Inc"               1 1
        "A15"                            1 1
        "A2iA Corp"                      1 1
        "A360 Media LLC"                 1 1
        "A3logics India Ltd"             1 1
        "A4 Bar Cattle Co Ltd"           1 1
        "A4 Media"                       1 1
        "A8 Digital Music Holdings Ltd"  1 1
        "AA PLC"                         3 1
        "AAA Taranis Visual Ltd"         1 1
        "AAI Unmanned Aircraft Systems"  1 1
        "AAR"                            1 1
        "AAR Corp"                       6 1
        "AB Bank Ltd"                    1 1
        "AB Group"                       1 1
        "ABB India Ltd"                  1 1
        "ABB Ltd"                        3 1
        "ABBYY Solutions Ltd"            1 1
        "ABBYY USA Software House Inc"   5 1
        "ABC Bearings Ltd"               1 1
        "ABC Consultants Pvt Ltd"        1 1
        "ABC Data SA"                    1 1
        "ABC Financial Services Inc"     3 1
        "ABC News"                       1 1
        "ABC Services Inc"               1 1
        "ABC Technologies Inc"           1 1
        "ABM Industries Inc"             3 1
        "ABN AMRO Bank NV"               1 1
        "AC Industrial Tech Hldg Inc"    1 1
        "AC Lordi Consulting LLC"        1 1
        "ACAE"                           1 1
        "ACCEPT EXPRESS PLC"             1 1
        "ACCESS China Inc"               1 1
        "ACD Systems International Inc"  1 1
        "ACE Assoc Compiler Experts BV"  1 1
        "ACE Group of Insurance & Reins" 1 1
        "ACG Group Pvt Ltd"              1 1
        "ACH Alert LLC"                  2 1
        "ACH Direct Inc"                 1 1
        "ACI Worldwide Inc"              3 1
        "ACL Services Ltd"               1 1
        "ACNE AB"                        1 1
        "ACNielsen Inc"                  2 1
        "ACOM Solutions Inc"             1 1
        "ACUSIM Software"                1 1
        "AD Capital US Inc"              1 1
        "AD/MAX"                         1 1
        "ADAC Verlag GmbH"               1 1
        "ADAGRI"                         1 1
        "ADAM Software BV"               1 1
        "ADAM Software Inc"              1 1
        "ADCORE Inc"                     1 1
        "ADFITECH Inc"                   1 1
        "ADG Global Supply Ltd"          1 1
        "ADM Technologies Pvt Ltd"       2 1
        "ADNOC Distribution"             2 1
        "ADP"                            1 1
        "ADPG Media Group Co Ltd"        1 1
        "ADT Inc"                        1 1
        "ADTRAN Inc"                     2 1
        "ADVA AG Optical Networking"     1 1
        "ADVIZOR Solutions Inc"          1 1
        "ADVO Inc"                       1 1
        "ADY"                            1 1
        "ADmore Inc"                     1 1
        "AEC Industrial Solutions Ltd"   1 1
        "AEG"                            4 1
        "AEG Live LLC"                   2 1
        "AEK Energie AG"                 1 1
        "AEON Fantasy Co Ltd"            1 1
        "AEP Networks"                   1 1
        "AEV Technologies Inc"           2 1
        "AEye Inc"                       1 1
        "AFAI Sthrn Shipyard Ltd"        1 1
        "AFGRI Ltd-Agricultural Retail"  1 1
        "AFLG Invests Pte Eq"            1 1
        "AG Global"                      1 1
        "AGA Resources Inc"              1 1
        "AGCO Corp"                      3 1
        "AGEIA Technologies Inc"         2 1
        "AGN Networks Inc"               1 1
        "AGS LLC"                        1 1
        "AGS Transact Technologies Ltd"  1 1
        "AGT Group GmbH"                 1 1
        "AGT International Inc"          1 1
        "AGTech Holdings Ltd"            1 1
        "AGTech Media Holdings Limited"  1 1
        end
        
        rename firm_name firmname2
        tempfile usingfile2
        save `usingfile2'
        
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str46 firm_name
        ""                                
        "1-800-FLOWERS.COM"              
        "1st Source Corporation"          
        "2U, Inc."                        
        "3D Systems Corp."                
        "3M Corporation"                  
        "6D Global Technologies Inc"      
        "8x8 Inc."                        
        "A. O. Smith"                    
        "A.Schulman, Inc"                
        "A10 Networks Inc"                
        "AAON Inc."                      
        "AAR Corporation"                
        "ABM Industries"                  
        "ACI Worldwide Inc"              
        "ADT Inc."                        
        "ADTRAN Inc."                    
        "AECOM Technology Corporation"    
        "AEP Industries Inc."            
        "AES Corp"                        
        "AFLAC Inc."                      
        "AG Mortgage Investment Trust Inc"
        "AGCO Corporation"                
        "AGL Resources Inc."              
        "AGNC Investment Corp."          
        "AK Steel Holding Corporation"    
        "ALLETE Inc"                      
        "AMAG Pharmaceuticals, Inc."      
        "AMC Entertainment Holdings Inc"  
        "AMC Networks"                    
        "AMERCO"                          
        "AMN Healthcare Services Inc."    
        "AMR Corporation"                
        "ANI Pharmaceuticals Inc"        
        "ANSYS Inc."                      
        "APA Corporation"                
        "ARC Document Solutions Inc"      
        "ARIAD Pharmaceuticals, Inc."    
        "ARMOUR Residential REIT, Inc."  
        "ARRIS Group, Inc."              
        "AT&T"                            
        "AT&T Inc."                      
        "AV Homes Inc"                    
        "AVG Technologies NV"            
        "AVX Corporation"                
        "AXA Equitable Holdings"          
        "AXIS Capital"                    
        "AZZ Incorporated"                
        "Aaron's Inc."                    
        "Abaxis Inc."                    
        "AbbVie"                          
        "Abbott Laboratories"            
        "Abeona Therapeutics Inc"        
        "Abercrombie & Fitch"            
        "Abiomed, Inc."                  
        "Abraxas Petroleum Corp."        
        "Acacia Research Corporation"    
        "Acadia Healthcare Co."          
        "Acadia Realty Trust"            
        "Accelerate Diagnostics Inc"      
        "Acceleron Pharma Inc."          
        "Accenture plc"                  
        "Access National Corp"            
        "Accuray Incorporated"            
        "Accuride Corporation"            
        "Aceto Corp."                    
        "Achillion Pharmaceuticals Inc."  
        "Acorda Therapeutics, Inc."      
        "Activision Blizzard, Inc."      
        "Actuant Corporation"            
        "Acuity Brands, Inc."            
        "Acxiom Corporation"              
        "Adamas Pharmaceuticals Inc"      
        "Adams Resources & Energy Inc."  
        "Addus Homecare Corporation"      
        "Adeptus Health Inc"              
        "Adient"                          
        "Adobe Systems Inc."              
        "Aduro BioTech Inc"              
        "Advance Auto Parts Inc."        
        "Advanced Drainage Systems Inc"  
        "Advanced Micro Devices, Inc."    
        "Advaxis Inc."                    
        "Advent Software, Inc."          
        "Adverum Biotechnologies, Inc."  
        "Advisory Board Company"          
        "Aegerion Pharmaceuticals, Inc."  
        "Aegion Corp"                    
        "Aerie Pharmaceuticals Inc"      
        "AeroVironment, Inc."            
        "Aerohive Networks Inc"          
        "Aerojet Rocketdyne Holdings Inc."
        "Aetna Inc."                      
        "Affiliated Managers Group Inc."  
        "Affimed NV"                      
        "Affymetrix Inc."                
        "Agenus Inc."                    
        "Agile Therapeutics Inc"          
        "Agilent Technologies Inc."      
        "Agilysys Inc."                  
        end
        tempfile master
        save `master'
        cross using `usingfile2'
        matchit firm_name firmname2, g(score)
        keep if score>0.9
        tempfile matches
        rename (firm_name firmname2) (firmname2 firm_name)
        bys firm_name (score): keep if _n==_N
        save `matches'
        
        *GO BACK AND MERGE WITH USING DATASET
        use `usingfile2', clear
        rename firmname2 firm_name
        merge 1:m firm_name using `matches', nogen
        gen oldnames= firm_name if !missing(firmname2)
        
        *REPLACE FIRM_NAMES WITH MATCHES
        replace firm_name= firmname2 if !missing(firmname2)
        drop firmname2 score
        ds oldnames, not
        bys `r(varlist)': keep if _n==1
        save `usingfile2', replace
        
        *NOW MERGE WITH MASTER FILE
        use `master', clear
        merge m:1 firm_name using `usingfile2', keep(master match) nogen
        sort firm_name
        l, sepby(firm_name)
        Results:

        Code:
        . l, sepby(firm_name)
        
             +--------------------------------------------------------------------------+
             |                        firm_name   allian~t   _freq             oldnames |
             |--------------------------------------------------------------------------|
          1. |                                           .       .                      |
             |--------------------------------------------------------------------------|
          2. |                1-800-FLOWERS.COM          .       .                      |
             |--------------------------------------------------------------------------|
          3. |           1st Source Corporation          .       .                      |
             |--------------------------------------------------------------------------|
          4. |                         2U, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
          5. |                 3D Systems Corp.          .       .                      |
             |--------------------------------------------------------------------------|
          6. |                   3M Corporation          .       .                      |
             |--------------------------------------------------------------------------|
          7. |       6D Global Technologies Inc          .       .                      |
             |--------------------------------------------------------------------------|
          8. |                         8x8 Inc.          .       .                      |
             |--------------------------------------------------------------------------|
          9. |                      A. O. Smith          .       .                      |
             |--------------------------------------------------------------------------|
         10. |                  A.Schulman, Inc          .       .                      |
             |--------------------------------------------------------------------------|
         11. |                 A10 Networks Inc          1       1     A10 Networks Inc |
             |--------------------------------------------------------------------------|
         12. |                        AAON Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         13. |                  AAR Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         14. |                   ABM Industries          3       1   ABM Industries Inc |
             |--------------------------------------------------------------------------|
         15. |                ACI Worldwide Inc          3       1    ACI Worldwide Inc |
             |--------------------------------------------------------------------------|
         16. |                         ADT Inc.          1       1              ADT Inc |
             |--------------------------------------------------------------------------|
         17. |                      ADTRAN Inc.          2       1           ADTRAN Inc |
             |--------------------------------------------------------------------------|
         18. |     AECOM Technology Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         19. |              AEP Industries Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         20. |                         AES Corp          .       .                      |
             |--------------------------------------------------------------------------|
         21. |                       AFLAC Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         22. | AG Mortgage Investment Trust Inc          .       .                      |
             |--------------------------------------------------------------------------|
         23. |                 AGCO Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         24. |               AGL Resources Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         25. |            AGNC Investment Corp.          .       .                      |
             |--------------------------------------------------------------------------|
         26. |     AK Steel Holding Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         27. |                       ALLETE Inc          .       .                      |
             |--------------------------------------------------------------------------|
         28. |       AMAG Pharmaceuticals, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         29. |   AMC Entertainment Holdings Inc          .       .                      |
             |--------------------------------------------------------------------------|
         30. |                     AMC Networks          .       .                      |
             |--------------------------------------------------------------------------|
         31. |                           AMERCO          .       .                      |
             |--------------------------------------------------------------------------|
         32. |     AMN Healthcare Services Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         33. |                  AMR Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         34. |          ANI Pharmaceuticals Inc          .       .                      |
             |--------------------------------------------------------------------------|
         35. |                       ANSYS Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         36. |                  APA Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         37. |       ARC Document Solutions Inc          .       .                      |
             |--------------------------------------------------------------------------|
         38. |      ARIAD Pharmaceuticals, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         39. |    ARMOUR Residential REIT, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         40. |                ARRIS Group, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         41. |                             AT&T          .       .                      |
             |--------------------------------------------------------------------------|
         42. |                        AT&T Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         43. |                     AV Homes Inc          .       .                      |
             |--------------------------------------------------------------------------|
         44. |              AVG Technologies NV          .       .                      |
             |--------------------------------------------------------------------------|
         45. |                  AVX Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         46. |           AXA Equitable Holdings          .       .                      |
             |--------------------------------------------------------------------------|
         47. |                     AXIS Capital          .       .                      |
             |--------------------------------------------------------------------------|
         48. |                 AZZ Incorporated          .       .                      |
             |--------------------------------------------------------------------------|
         49. |                     Aaron's Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         50. |                      Abaxis Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         51. |                           AbbVie          .       .                      |
             |--------------------------------------------------------------------------|
         52. |              Abbott Laboratories          .       .                      |
             |--------------------------------------------------------------------------|
         53. |          Abeona Therapeutics Inc          .       .                      |
             |--------------------------------------------------------------------------|
         54. |              Abercrombie & Fitch          .       .                      |
             |--------------------------------------------------------------------------|
         55. |                    Abiomed, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         56. |          Abraxas Petroleum Corp.          .       .                      |
             |--------------------------------------------------------------------------|
         57. |      Acacia Research Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         58. |            Acadia Healthcare Co.          .       .                      |
             |--------------------------------------------------------------------------|
         59. |              Acadia Realty Trust          .       .                      |
             |--------------------------------------------------------------------------|
         60. |       Accelerate Diagnostics Inc          .       .                      |
             |--------------------------------------------------------------------------|
         61. |            Acceleron Pharma Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         62. |                    Accenture plc          .       .                      |
             |--------------------------------------------------------------------------|
         63. |             Access National Corp          .       .                      |
             |--------------------------------------------------------------------------|
         64. |             Accuray Incorporated          .       .                      |
             |--------------------------------------------------------------------------|
         65. |             Accuride Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         66. |                      Aceto Corp.          .       .                      |
             |--------------------------------------------------------------------------|
         67. |   Achillion Pharmaceuticals Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         68. |        Acorda Therapeutics, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         69. |        Activision Blizzard, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         70. |              Actuant Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         71. |              Acuity Brands, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         72. |               Acxiom Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         73. |       Adamas Pharmaceuticals Inc          .       .                      |
             |--------------------------------------------------------------------------|
         74. |    Adams Resources & Energy Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         75. |       Addus Homecare Corporation          .       .                      |
             |--------------------------------------------------------------------------|
         76. |               Adeptus Health Inc          .       .                      |
             |--------------------------------------------------------------------------|
         77. |                           Adient          .       .                      |
             |--------------------------------------------------------------------------|
         78. |               Adobe Systems Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         79. |                Aduro BioTech Inc          .       .                      |
             |--------------------------------------------------------------------------|
         80. |          Advance Auto Parts Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         81. |    Advanced Drainage Systems Inc          .       .                      |
             |--------------------------------------------------------------------------|
         82. |     Advanced Micro Devices, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         83. |                     Advaxis Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         84. |            Advent Software, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         85. |    Adverum Biotechnologies, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         86. |           Advisory Board Company          .       .                      |
             |--------------------------------------------------------------------------|
         87. |   Aegerion Pharmaceuticals, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         88. |                      Aegion Corp          .       .                      |
             |--------------------------------------------------------------------------|
         89. |        Aerie Pharmaceuticals Inc          .       .                      |
             |--------------------------------------------------------------------------|
         90. |              AeroVironment, Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         91. |            Aerohive Networks Inc          .       .                      |
             |--------------------------------------------------------------------------|
         92. | Aerojet Rocketdyne Holdings Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         93. |                       Aetna Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         94. |   Affiliated Managers Group Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         95. |                       Affimed NV          .       .                      |
             |--------------------------------------------------------------------------|
         96. |                  Affymetrix Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         97. |                      Agenus Inc.          .       .                      |
             |--------------------------------------------------------------------------|
         98. |           Agile Therapeutics Inc          .       .                      |
             |--------------------------------------------------------------------------|
         99. |        Agilent Technologies Inc.          .       .                      |
             |--------------------------------------------------------------------------|
        100. |                    Agilysys Inc.          .       .                      |
             +--------------------------------------------------------------------------+
        
        .
        Here, we have only 5 matches from the using file (highlighted). The rest of the firms have missing values. If you want to keep only the matches, change the third-to-last line of the code to

        Code:
        merge m:1 firm_name using `usingfile2', keep(match) nogen
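
        To see what the keep() option does, here is a minimal sketch on toy data. The firm names and the tempfile name using_demo are made up for illustration and are not from your datasets.

        Code:
        * toy using file (illustrative names only)
        clear
        input str10 firm_name byte alliance_count
        "Alpha Inc" 2
        "Beta Corp" 1
        end
        tempfile using_demo
        save `using_demo'

        * toy master file
        clear
        input str10 firm_name
        "Alpha Inc"
        "Gamma Ltd"
        end

        * keep(master match) retains unmatched master firms with missing values;
        * keep(match) would retain only firms present in both files
        merge m:1 firm_name using `using_demo', keep(master match) nogen
        list
        With keep(master match), the unmatched master firm stays in the data with a missing alliance_count, and the firm that appears only in the using file is dropped. With keep(match), only the firm present in both files remains.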



        • #19
          Running the code in #18 up to

          Code:
          tempfile master
          save `master'
          cross using `usingfile2'
          matchit firm_name firmname2, g(score)
          we can view matches at different similarity scores. I will start at 50%, then 75% and finally 90%. Incorrect matches in the first two lists are highlighted in blue. The last list contains no incorrect matches, but it risks omitting some correct ones (highlighted in red).

          Code:
          list if score>0.5, sep(0)
          list if score>0.75, sep(0)
          list if score>0.9, sep(0)
          Res.:

          Code:
          . list if score>0.5, sep(0)
          
                 +-----------------------------------------------------------------------------------------------+
                 |                     firm_name                        firmname2   allian~t   _freq       score |
                 |-----------------------------------------------------------------------------------------------|
            509. |              3D Systems Corp.                 A123 Systems Inc          1       1   .53333333 |
            732. |    6D Global Technologies Inc             ABC Technologies Inc          1       1   .75056834 |
            761. |    6D Global Technologies Inc         ADM Technologies Pvt Ltd          2       1   .56180065 |
            778. |    6D Global Technologies Inc             AEV Technologies Inc          2       1   .75056834 |
            786. |    6D Global Technologies Inc           AGEIA Technologies Inc          2       1   .71393289 |
            789. |    6D Global Technologies Inc    AGS Transact Technologies Ltd          1       1   .52704628 |
           1102. |              A10 Networks Inc                 A10 Networks Inc          1       1           1 |
           1173. |              A10 Networks Inc                     AEP Networks          1       1   .62279916 |
           1183. |              A10 Networks Inc                 AGN Networks Inc          1       1          .8 |
           1313. |               AAR Corporation                         AAR Corp          6       1   .75592895 |
           1426. |                ABM Industries               ABM Industries Inc          3       1    .9078413 |
           1428. |                ABM Industries      AC Industrial Tech Hldg Inc          1       1   .50636968 |
           1538. |             ACI Worldwide Inc                ACI Worldwide Inc          3       1           1 |
           1649. |                      ADT Inc.                       ADCORE Inc          1       1   .50395263 |
           1656. |                      ADT Inc.                          ADT Inc          1       1    .9258201 |
           1657. |                      ADT Inc.                       ADTRAN Inc          2       1   .62994079 |
           1660. |                      ADT Inc.                         ADVO Inc          1       1   .57142857 |
           1662. |                      ADT Inc.                       ADmore Inc          1       1   .50395263 |
           1755. |                   ADTRAN Inc.                          ADT Inc          1       1   .64549722 |
           1756. |                   ADTRAN Inc.                       ADTRAN Inc          2       1    .9486833 |
           1921. |           AEP Industries Inc.               ABM Industries Inc          3       1   .83743579 |
           1923. |           AEP Industries Inc.      AC Industrial Tech Hldg Inc          1       1   .58387421 |
           1996. |                      AES Corp                        A2iA Corp          1       1   .53452248 |
           2006. |                      AES Corp                         AAR Corp          6       1   .57142857 |
           2072. |                      AES Corp                        AGCO Corp          3       1   .53452248 |
           2369. |              AGCO Corporation                        AGCO Corp          3       1   .77174363 |
          2388. |            AGL Resources Inc.   A1 Investments & Resources Ltd          1       1   .64116714 |
           2467. |            AGL Resources Inc.                AGA Resources Inc          1       1   .86518091 |
           2983. |                  AMC Networks                 A10 Networks Inc          1       1   .62279916 |
           3054. |                  AMC Networks                     AEP Networks          1       1   .72727273 |
           3064. |                  AMC Networks                 AGN Networks Inc          1       1   .62279916 |
           3204. |  AMN Healthcare Services Inc.       ABC Financial Services Inc          3       1   .51851852 |
           3206. |  AMN Healthcare Services Inc.                 ABC Services Inc          1       1   .59628479 |
           3293. |               AMR Corporation                         AAR Corp          6       1   .56694671 |
           3580. |               APA Corporation                        A2iA Corp          1       1   .53033009 |
           3720. |    ARC Document Solutions Inc               ACOM Solutions Inc          1       1   .63059263 |
           3738. |    ARC Document Solutions Inc            ADVIZOR Solutions Inc          1       1   .58137767 |
           3742. |    ARC Document Solutions Inc     AEC Industrial Solutions Ltd          1       1   .50037023 |
           4230. |                     AT&T Inc.                          ADT Inc          1       1   .57735027 |
           4395. |           AVG Technologies NV             ABC Technologies Inc          1       1   .70295949 |
           4424. |           AVG Technologies NV         ADM Technologies Pvt Ltd          2       1   .63891514 |
           4441. |           AVG Technologies NV             AEV Technologies Inc          2       1   .70295949 |
           4449. |           AVG Technologies NV           AGEIA Technologies Inc          2       1   .66864785 |
           4452. |           AVG Technologies NV    AGS Transact Technologies Ltd          1       1   .60246408 |
           4712. |                  AXIS Capital                AD Capital US Inc          1       1   .60302269 |
           5156. |           Abbott Laboratories      A&L Canada Laboratories Inc          2       1   .60436722 |
           7338. | Adams Resources & Energy Inc.   A1 Investments & Resources Ltd          1       1    .6103001 |
           7417. | Adams Resources & Energy Inc.                AGA Resources Inc          1       1   .66666667 |
           7736. |            Adobe Systems Inc.                 A123 Systems Inc          1       1   .68884672 |
           7769. |            Adobe Systems Inc.    ACD Systems International Inc          1       1   .56591646 |
           8033. | Advanced Drainage Systems Inc                 A123 Systems Inc          1       1   .56568542 |
           8066. | Advanced Drainage Systems Inc    ACD Systems International Inc          1       1   .51729353 |
           8380. |         Advent Software, Inc.                ADAM Software Inc          1       1   .61491869 |
           8454. | Adverum Biotechnologies, Inc.             ABC Technologies Inc          1       1   .56362148 |
           8500. | Adverum Biotechnologies, Inc.             AEV Technologies Inc          2       1   .56362148 |
           8508. | Adverum Biotechnologies, Inc.           AGEIA Technologies Inc          2       1   .53611096 |
           9022. |         Aerohive Networks Inc                 A10 Networks Inc          1       1   .69282032 |
           9093. |         Aerohive Networks Inc                     AEP Networks          1       1   .53935989 |
           9103. |         Aerohive Networks Inc                 AGN Networks Inc          1       1   .69282032 |
           9840. |     Agilent Technologies Inc.             ABC Technologies Inc          1       1   .76486616 |
           9869. |     Agilent Technologies Inc.         ADM Technologies Pvt Ltd          2       1   .61339562 |
           9886. |     Agilent Technologies Inc.             AEV Technologies Inc          2       1   .76486616 |
           9894. |     Agilent Technologies Inc.           AGEIA Technologies Inc          2       1   .72753284 |
           9897. |     Agilent Technologies Inc.    AGS Transact Technologies Ltd          1       1    .5728919 |
                 +-----------------------------------------------------------------------------------------------+
          
          .
          . list if score>0.75, sep(0)
          
                 +----------------------------------------------------------------------------------+
                 |                  firm_name              firmname2   allian~t   _freq       score |
                 |----------------------------------------------------------------------------------|
            732. | 6D Global Technologies Inc   ABC Technologies Inc          1       1   .75056834 |
            778. | 6D Global Technologies Inc   AEV Technologies Inc          2       1   .75056834 |
           1102. |           A10 Networks Inc       A10 Networks Inc          1       1           1 |
           1183. |           A10 Networks Inc       AGN Networks Inc          1       1          .8 |
           1313. |            AAR Corporation               AAR Corp          6       1   .75592895 |
           1426. |             ABM Industries     ABM Industries Inc          3       1    .9078413 |
           1538. |          ACI Worldwide Inc      ACI Worldwide Inc          3       1           1 |
           1656. |                   ADT Inc.                ADT Inc          1       1    .9258201 |
           1756. |                ADTRAN Inc.             ADTRAN Inc          2       1    .9486833 |
           1921. |        AEP Industries Inc.     ABM Industries Inc          3       1   .83743579 |
           2369. |           AGCO Corporation              AGCO Corp          3       1   .77174363 |
          2467. |         AGL Resources Inc.      AGA Resources Inc          1       1   .86518091 |
           9840. |  Agilent Technologies Inc.   ABC Technologies Inc          1       1   .76486616 |
           9886. |  Agilent Technologies Inc.   AEV Technologies Inc          2       1   .76486616 |
                 +----------------------------------------------------------------------------------+
          
          .
          . list if score>0.9, sep(0)
          
                 +----------------------------------------------------------------------+
                 |         firm_name            firmname2   allian~t   _freq      score |
                 |----------------------------------------------------------------------|
           1102. |  A10 Networks Inc     A10 Networks Inc          1       1          1 |
           1426. |    ABM Industries   ABM Industries Inc          3       1   .9078413 |
           1538. | ACI Worldwide Inc    ACI Worldwide Inc          3       1          1 |
           1656. |          ADT Inc.              ADT Inc          1       1   .9258201 |
           1756. |       ADTRAN Inc.           ADTRAN Inc          2       1   .9486833 |
                 +----------------------------------------------------------------------+
          
          .

          So this illustrates that some manual input will be needed despite the fuzzy matching machinery.
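
          One way to organize that manual step, as a rough sketch only: keep the candidates above the lower cutoff, retain the best-scoring candidate for each master firm, and flag anything at or below the upper cutoff for checking by eye. The cutoffs 0.75 and 0.9 are the ones used above; the variable needs_review is a name I made up, and using a review band at all is an assumption rather than part of the code in #18.

          Code:
          * assumes the crossed data with firm_name, firmname2 and score is in memory
          keep if score > 0.75
          * keep the highest-scoring candidate per master firm (ties are broken arbitrarily)
          bysort firm_name (score): keep if _n == _N
          * flag pairs that are not near-certain so they can be checked manually
          gen byte needs_review = score <= 0.9
          list firm_name firmname2 score needs_review, sep(0)
          Note that keeping the top candidate does not make it a correct match, so the flagged rows still need to be inspected.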
          Last edited by Andrew Musau; 11 May 2022, 09:07.



          • #20
            Dear Andrew,

            In the section "*replace firm names with matches", you specified the code bys 'r(varlist)', which gives an error. What should I add here?
            I include a dataex snippet showing what usingfile2 looks like after running 'ds oldnames, not'.


            ----------------------- copy starting from the next line -----------------------
            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str30 firm_name float alliance_count str46 firm_name_3 str30 oldnames
            "01 Sys"                         1 "2C media LLC."    "01 Sys"                        
            "1 Ltd"                          1 ""                 "1 Ltd"                         
            "1&1 Ionos SE"                   1 ""                 "1&1 Ionos SE"                  
            "1-800-Flowers.com Inc"          2 ""                 "1-800-Flowers.com Inc"         
            "10-4 Systems Inc"               1 "2C media LLC."    "10-4 Systems Inc"              
            "1000watt LLC"                   1 ""                 "1000watt LLC"                  
            "100tv.com"                      1 "2C media LLC."    "100tv.com"                     
            "10art-ni Corp"                  1 ""                 "10art-ni Corp"                 
            "10zig Technology Inc"           2 "3M Corporation"   "10zig Technology Inc"          
            "123ID Inc"                      2 "2C media LLC."    "123ID Inc"                     
            "128 Technology Inc"             1 ""                 "128 Technology Inc"            
            "170 Systems Inc"                1 "3D Systems Corp." "170 Systems Inc"               
            "1776"                           1 "2C media LLC."    "1776"                          
            "180bytwo LLC"                   1 "2C media LLC."    "180bytwo LLC"                  
            "1901 Group LLC"                 1 "2C media LLC."    "1901 Group LLC"                
            "1933 Industries Inc"            1 ""                 "1933 Industries Inc"           
            "1EDISource Inc"                 1 "2C media LLC."    "1EDISource Inc"                
            "1LINK (Guarantee) Ltd"          1 "2C media LLC."    "1LINK (Guarantee) Ltd"         
            "1TouchSoftware Solutions Inc"   1 "3M Corporation"   "1TouchSoftware Solutions Inc"  
            "1Verge Info Tech (Beijing) Co"  1 "3M Corporation"   "1Verge Info Tech (Beijing) Co" 
            "1WorldSync Holdings Inc"        1 ""                 "1WorldSync Holdings Inc"       
            "1nce GmbH"                      1 "2C media LLC."    "1nce GmbH"                     
            "1oT OU"                         1 ""                 "1oT OU"                        
            "1seo.Com"                       1 "2C media LLC."    "1seo.Com"                      
            "1st Group Ltd"                  1 "3M Corporation"   "1st Group Ltd"                 
            "1stpoint Communications LLC"    1 "2C media LLC."    "1stpoint Communications LLC"   
            "1world Online Inc"              1 "2C media LLC."    "1world Online Inc"             
            "20-20 Technologies Inc"         1 "3D Systems Corp." "20-20 Technologies Inc"        
            "2020 Advisors LLC"              1 ""                 "2020 Advisors LLC"             
            "2021.ai ApS"                    2 "3D Systems Corp." "2021.ai ApS"                   
            "21LADY Co Ltd"                  1 "2C media LLC."    "21LADY Co Ltd"                 
            "21Vianet Group Inc"             3 "3M Corporation"   "21Vianet Group Inc"            
            "21st Century Fox Inc"           1 "3D Systems Corp." "21st Century Fox Inc"          
            "21st Century Technologies Ltd"  2 ""                 "21st Century Technologies Ltd" 
            "24/7 Customer"                  1 "3M Corporation"   "24/7 Customer"                 
            "24/7 Customer Inc"              1 ""                 "24/7 Customer Inc"             
            "24/7 Real Media Inc"            2 "2C media LLC."    "24/7 Real Media Inc"           
            "24by7sec Inc"                   1 "2C media LLC."    "24by7sec Inc"                  
            "2618249 Ontario Corp"           6 "2C media LLC."    "2618249 Ontario Corp"          
            "2B Wireless Inc"                1 "2C media LLC."    "2B Wireless Inc"               
            "2C Media"                       1 ""                 "2C Media"                      
            "2CRSI SA"                       2 "3D Systems Corp." "2CRSI SA"                      
            "2Hz Inc"                        1 "3D Systems Corp." "2Hz Inc"                       
            "2bPrecise LLC"                  1 "3M Corporation"   "2bPrecise LLC"                 
            "2bcreative entertainment"       1 ""                 "2bcreative entertainment"      
            "2ergo Group PLC"                1 ""                 "2ergo Group PLC"               
            "360buy Jingdong Mall"           1 "2C media LLC."    "360buy Jingdong Mall"          
            "360factors Inc"                 1 "2C media LLC."    "360factors Inc"                
            "3CInteractive Corp"             1 "3M Corporation"   "3CInteractive Corp"            
            "3Com Corp"                      1 "3M Corporation"   "3Com Corp"                     
            "3Com Korea"                     2 "3M Corporation"   "3Com Korea"                    
            "3D Eye Solutions Inc"           1 "3D Systems Corp." "3D Eye Solutions Inc"          
            "3D Printing Industry"           1 "3D Systems Corp." "3D Printing Industry"          
            "3D Results"                     1 "2C media LLC."    "3D Results"                    
            "3D Robotic Inc"                 1 "3D Systems Corp." "3D Robotic Inc"                
            "3D Systems Corp"                6 "3D Systems Corp." "3D Systems Corp"               
            "3DR Laboratories LLC"           1 "2C media LLC."    "3DR Laboratories LLC"          
            "3E Co Environmental Ecological" 1 ""                 "3E Co Environmental Ecological"
            "3GTV"                           1 "2C media LLC."    "3GTV"                          
            "3M Co"                          3 "2C media LLC."    "3M Co"                         
            "3P Networks Inc"                1 "3M Corporation"   "3P Networks Inc"               
            "3RD Ring"                       1 ""                 "3RD Ring"                      
            "3V Transaction Services Ltd"    1 "3M Corporation"   "3V Transaction Services Ltd"   
            "3VR Security Inc"               1 "3M Corporation"   "3VR Security Inc"              
            "3e Technologies Intl Inc"       1 "3M Corporation"   "3e Technologies Intl Inc"      
            "3eTI"                           1 "2C media LLC."    "3eTI"                          
            "3i Infotech Ltd"                2 "2C media LLC."    "3i Infotech Ltd"               
            "3n"                             1 "2C media LLC."    "3n"                            
            "41st Parameter Inc"             1 "3D Systems Corp." "41st Parameter Inc"            
            "42crunch"                       1 "3D Systems Corp." "42crunch"                      
            "482.Solutions"                  1 "3M Corporation"   "482.Solutions"                 
            "4C Insights Inc"                1 "3M Corporation"   "4C Insights Inc"               
            "4Home Inc"                      1 "3D Systems Corp." "4Home Inc"                     
            "4INFO Inc"                      1 "3M Corporation"   "4INFO Inc"                     
            "4Mobility SA"                   1 "3M Corporation"   "4Mobility SA"                  
            "4Voice LLC"                     1 "3D Systems Corp." "4Voice LLC"                    
            "4th Screen Advertising Ltd"     1 ""                 "4th Screen Advertising Ltd"    
            "51Job Inc"                      2 "3M Corporation"   "51Job Inc"                     
            "5LINX Enterprises Inc"          1 "2C media LLC."    "5LINX Enterprises Inc"         
            "6 Over 6 Vision Ltd"            1 "3D Systems Corp." "6 Over 6 Vision Ltd"           
            "631 Success Llc"                1 "3M Corporation"   "631 Success Llc"               
            "6788289 Canada Inc"             1 "3D Systems Corp." "6788289 Canada Inc"            
            "6Wind SA"                       2 "3D Systems Corp." "6Wind SA"                      
            "6fusion USA Inc"                1 "2C media LLC."    "6fusion USA Inc"               
            "7-Eleven Inc"                   1 "2C media LLC."    "7-Eleven Inc"                  
            "701Search Pte Ltd"              2 ""                 "701Search Pte Ltd"             
            "777Online"                      1 ""                 "777Online"                     
            "77Agenc Ltd"                    1 ""                 "77Agenc Ltd"                   
            "797738 Ontario Ltd"             1 "2C media LLC."    "797738 Ontario Ltd"            
            "7Seas Technologies Ltd"         1 "2C media LLC."    "7Seas Technologies Ltd"        
            "7h Hldg"                        1 "3D Systems Corp." "7h Hldg"                       
            "7starlake Co Ltd"               1 "3M Corporation"   "7starlake Co Ltd"              
            "80 Acres Farm"                  1 "3M Corporation"   "80 Acres Farm"                 
            "888 Holdings PLC"               2 ""                 "888 Holdings PLC"              
            "888voip"                        1 "2C media LLC."    "888voip"                       
            "8digital"                       1 ""                 "8digital"                      
            "8i Holdings Ltd"                1 "3D Systems Corp." "8i Holdings Ltd"               
            "8x8 Inc"                        6 "3M Corporation"   "8x8 Inc"                       
            "99 Wuxian Ltd"                  1 "3M Corporation"   "99 Wuxian Ltd"                 
            "999 Call Center Corp"           1 "3M Corporation"   "999 Call Center Corp"          
            end
            ------------------ copy up to and including the previous line ------------------

            Listed 100 out of 18582 observations
            Use the count() option to list more






            • #21
              What error message do you get? Copy and paste it exactly. The code is meant to drop perfect duplicates, but in your attached dataset in #20 there are none, which is fine. This is what I get:

              Code:
              . ds oldnames, not
              firm_name     alliance_c~t  firm_name_3
              
              . bys `r(varlist)': keep if _n==1
              (0 observations deleted)
              Also, why do you have a variable named firm_name_3 in the above output? My code does not generate such a variable.



              • #22
                . bys r(varlist): keep if _n==1
                variable r not found
                r(111);



                • #23
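                  It's a macro reference, written `r(varlist)' and not r(varlist). Pay attention to the opening left single quote (the backtick) and the closing right single quote (the apostrophe). Without them, Stata reads r as a variable name, hence "variable r not found".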
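
                  A minimal sketch of the same pattern on Stata's shipped auto dataset, in case it helps to try it in isolation. The local macro name vars is my own choice; copying r(varlist) into a local right away is optional, but it protects against a later r-class command overwriting the stored result.

                  Code:
                  sysuse auto, clear
                  * ds leaves the list of all variables except make in r(varlist)
                  ds make, not
                  * copy the stored result into a local macro immediately
                  local vars `r(varlist)'
                  display "`vars'"
                  * both `vars' and `r(varlist)' need the backtick ... apostrophe pair
                  bysort `vars': keep if _n==1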



                  • #24
                    When I run your code from the section *NOW MERGE WITH MASTER FILE, I get the following error when merging the data:
                    . merge m:1 firm_name using oefenusingfile1, keep(master match) nogen
                    variable firm_name does not uniquely identify observations in the using data
                    r(459);

                    end of do-file

                    ("oefen" is the direct translation of "practice"; I created this file because running the code on the real dataset takes 12 hours)



                    • #25
                      You need to check why you have duplicates in the using file.

                      Code:
                      use oefenusingfile1, clear
                      sort firm_name
                      bys firm_name: gen tag =_N>1
                      browse if tag
                      If and only if the duplicates do not contain any extra (useful) information, you can

                      Code:
                      drop tag
                      bys firm_name: keep if _n==1
                      save oefenusingfile1, replace
                      before merging.
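
                      As an alternative sketch, the built-in duplicates and isid commands do the same checks. The variable name dup is my own choice, and this assumes the file is oefenusingfile1 as in your post.

                      Code:
                      use oefenusingfile1, clear
                      * summarize how many firm_name values occur more than once
                      duplicates report firm_name
                      * dup = number of additional copies of each firm_name (0 if unique)
                      duplicates tag firm_name, generate(dup)
                      list if dup > 0, sepby(firm_name)
                      * ... drop or fix the duplicated rows here, then confirm uniqueness;
                      * isid exits with an error if firm_name is still not unique
                      isid firm_name
                      If isid runs silently, firm_name uniquely identifies the observations and merge m:1 will no longer complain about the using data.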
                      Last edited by Andrew Musau; 16 May 2022, 03:28.



                      • #26
                        It worked! Thank you so much, Andrew! I could not have done it without your help.
