Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Comparing database

    Hi there,
    I am in new in Stata and looking for a code to compare database. I want to compare between "supplier" and "customer" file to find any relationship appears in the customer file but not in the supplier file. I need to rely on target and source companies. The outcome of such comparison is if a pair of target and source companies appears in the customer file but not in the supplier file, then the pair will need to be brought over to the supplier file. Any one can please help me to get the code?

    Thanks,
    Rakib
    Last edited by Rakibul Hasan; 12 Dec 2022, 06:41.

  • #2
    Without example data to work with, there is a lot of guesswork about your data set that must be done to come up with code. And if somebody guesses wrong, your time and theirs will have been wasted. Please post back showing example data from both the supplier and customer data sets. Use the -dataex- command to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      Hi Clyde,
      Thanks for the advice. I am using such blog for the first time, so had no idea. Here is the example dataset using dataex command. Customer dataset has more number of observations than supplier dataset. The number of variables of both dataset are same.

      Customer Database:
      input long(start_ end_) str29 id str15 rel_type long(source_company_id target_company_id) str12(source_isin target_isin) strL(source_company_keyword target_company_keyword)
      20180524 20211229 "CUST-100000247" "CUSTOMER" 194505765 21997399 "SG1Y44946602" "ID1000111305" "Seroja Investments" "Maritim Barito Perkasa"
      20161103 20180524 "CUST-100000247" "CUSTOMER" 194505765 21997399 "SG1Y44946602" "ID1000111305" "Seroja Investments" "Maritim Barito Perkasa"
      20161103 20180524 "CUST-100000336" "CUSTOMER" 194505765 7267252 "SG1Y44946602" "" "Seroja Investments" "Kideco Jaya Agung"
      20161103 20211229 "CUST-100000500" "CUSTOMER" 194505765 21998335 "SG1Y44946602" "ID1000061302" "Seroja Investments" "Indocement"
      20161103 20171127 "CUST-100001093" "CUSTOMER" 22443458 3769 "NGOANDO00002" "US30231G1022" "Oando" "MPNU"
      20161103 20190115 "CUST-100001232" "CUSTOMER" 194527177 22450335 "INE522F01014" "INE081A01020" "Central Mine Planning & Design Institute Limited" "Tata Steel"
      20161103 20190821 "CUST-100001451" "CUSTOMER" 194527177 22434725 "INE522F01014" "INE584A01023" "Central Mine Planning & Design Institute Limited" "NMDC"
      20161103 20171117 "CUST-100001499" "CUSTOMER" 194527177 7022255 "INE522F01014" "" "Central Mine Planning & Design Institute Limited" ""
      20161103 20180115 "CUST-100001895" "CUSTOMER" 22447908 5758 "" "US4781601046" "Dr Ci Labo" "Johnson & Johnson"
      20161103 20190821 "CUST-100002099" "CUSTOMER" 194527177 33501820 "INE522F01014" "" "Central Mine Planning & Design Institute Limited" "Gujarat State Electricity Corporation Limited"
      20171127 20181030 "CUST-100002127" "CUSTOMER" 22443458 6657741 "NGOANDO00002" "" "OG&P" "Bua Group"
      20161103 20171127 "CUST-100002127" "CUSTOMER" 22443458 6657741 "NGOANDO00002" "" "OG&P" "Bua Group"
      20161103 20171117 "CUST-100002150" "CUSTOMER" 194527177 9554028 "INE522F01014" "" "Central Mine Planning & Design Institute Limited" "M. P. Power Generating"
      20161103 20190115 "CUST-100002227" "CUSTOMER" 194527177 9403960 "INE522F01014" "" "Central Mine Planning & Design Institute Limited" ""
      20161103 20171013 "CUST-100004171" "CUSTOMER" 7060 100004088 "US6304021057" "" "Napco" "Card Lock Company"
      20161103 20171013 "CUST-100004446" "CUSTOMER" 7060 100004331 "US6304021057" "" "Napco" "DWG Distribution"
      20161103 20171013 "CUST-100004626" "CUSTOMER" 7060 100004566 "US6304021057" "" "Napco" "DPC Distributors"
      20161103 20200724 "CUST-100004907" "CUSTOMER" 1655091 230298 "DK0010218429" "KR7066570003" "Bang & Olufsen" "LG Electronics"
      20161103 20171013 "CUST-100005274" "CUSTOMER" 7060 100005218 "US6304021057" "" "Napco" "Intermountain Lock & Security Supply"
      20161103 20171013 "CUST-100005379" "CUSTOMER" 7060 100005321 "US6304021057" "" "Napco" "Custom Electronic Supply"
      20161103 20180131 "CUST-100005611" "CUSTOMER" 75788601 9622611 "AU000000QMS6" "" "Ambient Advertising NZ" "Auckland Transport"
      20161103 20171013 "CUST-100005900" "CUSTOMER" 7060 100005863 "US6304021057" "" "Napco" "eDist Corporate"
      20161103 20171013 "CUST-100005960" "CUSTOMER" 7060 100005943 "US6304021057" "" "Napco" "MACO Security Products"
      20161103 20181127 "CUST-100006131" "CUSTOMER" 75788601 194534876 "AU000000QMS6" "NZFCGE0001S7" "qms media" "Fonterra"
      20161103 20171013 "CUST-100006208" "CUSTOMER" 7060 100006182 "US6304021057" "" "Napco" "Saf-Com Supply"
      20161103 20170928 "CUST-100006225" "CUSTOMER" 1655091 4158 "DK0010218429" "US3453708600" "Bang & Olufsen" "Ford"
      20161103 20181127 "CUST-100006256" "CUSTOMER" 75788601 22451610 "AU000000QMS6" "DK0010181759" "qms media" "Carlsberg"
      20161103 20171013 "CUST-100006273" "CUSTOMER" 7060 739 "US6304021057" "US0352901054" "Napco" "Tri-Ed"
      20161103 20181127 "CUST-100006359" "CUSTOMER" 75788601 194529595 "AU000000QMS6" "NZTMEE0003S8" "qms media" "Trade Me Group"
      20161103 20181127 "CUST-100006541" "CUSTOMER" 75788601 67706748 "AU000000QMS6" "US40434L1052" "qms media" "HP"



      Supplier Database:

      input long(start_ end_) str29 id str15 rel_type long(source_company_id target_company_id) str12(source_isin target_isin) strL(source_company_keyword target_company_keyword)
      20161103 20171102 "SUPPL-100000171" "SUPPLIER" 194503926 9366143 "MYL1503OO001" "" "Guoco" "Accor"
      20161103 20180115 "SUPPL-100001674" "SUPPLIER" 22447908 5758 "" "US4781601046" "Dr Ci Labo" "Cilag"
      20161103 20171219 "SUPPL-100001696" "SUPPLIER" 22443458 8448341 "NGOANDO00002" "NGSEPLAT0008" "Oando" "seplat petroleum development"
      20161103 20211203 "SUPPL-100002304" "SUPPLIER" 22438476 194499278 "GH0000000102" "NGGUINNESS07" "Guinness Ghana Breweries" "Guinness* (Nigeria)"
      20161103 20190208 "SUPPL-100002463" "SUPPLIER" 22438476 3139 "GH0000000102" "GB0002374006" "Guinness Ghana Breweries" "Diageo Ireland"
      20211203 20220808 "SUPPL-100002463" "SUPPLIER" 22438476 3139 "GH0000000102" "GB0002374006" "Guinness Ghana Breweries" "Diageo Ireland"
      20190208 20191213 "SUPPL-100002463" "SUPPLIER" 22438476 3139 "GH0000000102" "GB0002374006" "Guinness Ghana Breweries" "Diageo Ireland"
      20180928 20191029 "SUPPL-100004507" "SUPPLIER" 21070538 1807560 "CLP2577D1038" "CLP7847L1080" "Cia Electro Metalurgica" "Abastible"
      20180928 20191029 "SUPPL-100004507" "SUPPLIER" 21070538 1807560 "CLP2577D1038" "CLP7847L1080" "Cia Electro Metalurgica" "Abastible"
      20161103 20180928 "SUPPL-100004507" "SUPPLIER" 21070538 1807560 "CLP2577D1038" "CLP7847L1080" "Cia Electro Metalurgica" "Abastible"
      20161103 20181211 "SUPPL-100004936" "SUPPLIER" 21070538 1517885 "CLP2577D1038" "DE0006483001" "Cia Electro Metalurgica" "Linde Gas Chile"
      20161103 20170928 "SUPPL-100005232" "SUPPLIER" 1655091 9521914 "DK0010218429" "" "Bang & Olufsen" "Kvadrat A/S"
      20161103 20170428 "SUPPL-100007666" "SUPPLIER" 21070538 194512799 "CLP2577D1038" "" "Cia Electro Metalurgica" "Chilectra"
      20161103 20191029 "SUPPL-100007684" "SUPPLIER" 21070538 21070823 "CLP2577D1038" "CL0002694637" "Cia Electro Metalurgica" "AES Gener"
      20161103 20180726 "SUPPL-100007907" "SUPPLIER" 21070538 1612425 "CLP2577D1038" "" "Cia Electro Metalurgica" "CODELCO"
      20161103 20191029 "SUPPL-100008501" "SUPPLIER" 21070538 279 "CLP2577D1038" "US0091581068" "Cia Electro Metalurgica" "Indura"
      20161103 20180726 "SUPPL-100008529" "SUPPLIER" 21070538 21742215 "CLP2577D1038" "" "Cia Electro Metalurgica" "IMF"
      20161103 20191029 "SUPPL-100008537" "SUPPLIER" 21070538 21742195 "CLP2577D1038" "" "Cia Electro Metalurgica" "Dipromet"
      20161103 20191029 "SUPPL-100008586" "SUPPLIER" 21070538 6015824 "CLP2577D1038" "" "Cia Electro Metalurgica" "London Metals"
      20161103 20191029 "SUPPL-100008729" "SUPPLIER" 21070538 21742318 "CLP2577D1038" "" "Cia Electro Metalurgica" "PIMASA"
      20161103 20191029 "SUPPL-100008877" "SUPPLIER" 21070538 6709 "CLP2577D1038" "ARP6558L1178" "Cia Electro Metalurgica" "MetroGAS"
      20161103 20171031 "SUPPL-100008895" "SUPPLIER" 21070538 194505091 "CLP2577D1038" "" "Cia Electro Metalurgica" "RHI Refractories"
      20161103 20180122 "SUPPL-100009095" "SUPPLIER" 194516202 484 "GB0009186429" "" "Finsbury Food Group" "Warner Bros."
      20161103 20180619 "SUPPL-100009095" "SUPPLIER" 194516202 484 "GB0009186429" "" "Finsbury Food Group" "Warner Bros."

      Comment


      • #4
        Thank you. I have a much clearer idea of how to do this now. But I'm afraid the data raises some new questions that I cannot answer because I don't know what some of these variables mean, nor exactly how you intend to use them.
        1. Since the values of source_isin and target_isin are sometimes missing, I assume that the identity of a source and target must be taken from the respective *_company_id variables. Correct?
        2. Sometimes, the same source_company_id target_company_id appears more than once (in either dataset), but associated with different values of the variables start_ and end_. Should such situations be considered to be a single source-target "relationship" or should they be considered separate "relationships." For example source_id 22443458 and target_id 6657741 appear together twice in the customer data, once with start = 20161103 and end = 201771127, and again with start = 20171127 and end = 20181030.
        3. More generally, suppose that a source_company_id target_company_id appears in the customer data base and also in the supplier data base but the values of start_ and end_ are different in the two data sets, does the version in the customer data base belong in the output as not appearing in the supplier data set, or can we ignore the start_ and end_ differences. (In your example, this happens for the pair source_company_id = 21070538 and target_company_id = 1807560.)
        4. In the suppliers example data, the 7th and 8th observations are exact duplicates of each other. (That is, they agree on all variables shown.) Is this true in the full data set, or are there other variables that distinguish them in the full data set. If there are such variables, I will need a new data set that contains them so that I can distinguish the two observations. If there are no such variables, it suggests that errors were made in putting this data set together, as it is rare that a correct data set contains exactly duplicate observations. Please review how this data set was created and fix any errors that may have led to this anomaly. If the data set is, in fact, correct, and is supposed to contain exact duplicates, please explain how to use the duplicate information. For example, if the customer data set contains two identical observations (this does not actually happen in the example shown), and the supplier data set contained only one that matched it, should the second copy from the customer data set be included in the output as not occurring in the supplier data. Or should complete duplicates be ignored?
        5. start_ and end_ look like they are dates, except that they are not properly represented as date variables for Stata. Are they, in fact, supposed to be dates?

        Comment


        • #5
          Thanks lot for raising the questions. It opened my eyes to clean the dataset again. I cleaned the master file buy surprisingly had to run the same code again. Please find the new dataset below and answers of your questions:
          1. Yes, you are correct
          2. I have merged the rows again to eliminate this issue. There may still exists id source_company_id target_company_id etc (it is due to subsidiary or other variables not mentioned in the list) that appears more than once. Therefore, it will be considered as separate “relationship”
          3. we can ignore the start_ and end_ differences
          4. I have eliminated the duplicates. There may still exists same id source_company_id target_company_id etc but it is due to subsidiary or other variables not mentioned in the list.
          5. start_ and end_ are dates. It means when the company enters and end their relationship.
          Thanks again for your help.

          Suppliers:
          input long(start_ end_) str29 id str15 rel_type long(source_company_id target_company_id) str12(source_isin target_isin) strL(source_company_keyword target_company_keyword)
          20211203 20220808 "SUPPL-100002463" "SUPPLIER" 22438476 3139 "GH0000000102" "GB0002374006" "Guinness Ghana Breweries" "Diageo Ireland"
          20190208 20191213 "SUPPL-100002463" "SUPPLIER" 22438476 3139 "GH0000000102" "GB0002374006" "Guinness Ghana Breweries" "Diageo Ireland"
          20180928 20191029 "SUPPL-100004507" "SUPPLIER" 21070538 1807560 "CLP2577D1038" "CLP7847L1080" "Cia Electro Metalurgica" "Abastible"
          20161103 20180928 "SUPPL-100004507" "SUPPLIER" 21070538 1807560 "CLP2577D1038" "CLP7847L1080" "Cia Electro Metalurgica" "Abastible"
          20161103 20180619 "SUPPL-100009095" "SUPPLIER" 194516202 484 "GB0009186429" "" "Finsbury Food Group" "Warner Bros."
          20161103 20171215 "SUPPL-100016344" "SUPPLIER" 194500869 86521585 "PK0024501014" "" "Attock Refinery" "Attock Information Technology Services"
          20161103 20180726 "SUPPL-100016626" "SUPPLIER" 194501087 22435718 "IL0002580129" "JP3190000004" "Tadiran Holdings" "Toshiba Air conditioning"
          20161103 20200715 "SUPPL-100028912" "SUPPLIER" 194526723 99845705 "VN000000DNM6" "" "DANAMECO Medical" "Sonoscanner"
          20161104 20180215 "SUPPL-100046509" "SUPPLIER" 1880950 1319 "KYG463AM1037" "US0846707026" "Hop Hing" "Dairy Queen"
          20181128 20191106 "SUPPL-100055188" "SUPPLIER" 6695190 194528961 "AU000000BAL8" "AU000000BGA8" "bellamys australia" "Tatura Milk Industries"
          20161104 20190711 "SUPPL-100066048" "SUPPLIER" 1893093 1792200 "KYG6141X1079" "DE0007657231" "Ming Fai International" "Villeroy & Boch"
          20161104 20191230 "SUPPL-100068979" "SUPPLIER" 127318 75653840 "CNE100003NL9" "" "Bloomage BioTechnology" "Laboratoires Vivacy"
          20161104 20190711 "SUPPL-100073260" "SUPPLIER" 1893093 19655394 "KYG6141X1079" "" "Ming Fai International" "Jeanne Lanvin"
          20190607 20200505 "SUPPL-100090411" "SUPPLIER" 22445384 6068620 "KR7002020006" "US24703L2025" "Kolon Benit" "Dell Technologies"
          20161104 20170516 "SUPPL-100090472" "SUPPLIER" 194506005 6068620 "TH0376010Z04" "US24703L2025" "Loxley" "Dell Technologies"
          20161104 20201029 "SUPPL-100090494" "SUPPLIER" 7864 6068620 "US7043261079" "US24703L2025" "Paychex" "Dell EMC"
          20161104 20180323 "SUPPL-100090496" "SUPPLIER" 7870 6068620 "US69318J1007" "US24703L2025" "PC Connection" "EMC"
          20180305 40000101 "SUPPL-100090516" "SUPPLIER" 9297012 6068620 "BMG9019D1048" "US24703L2025" "Travelport" "EMC Corp"
          20161104 40000101 "SUPPL-100090595" "SUPPLIER" 17049 6068620 "GB0022569080" "US24703L2025" "Amdocs" "Dell EMC"
          20161104 20210224 "SUPPL-100090619" "SUPPLIER" 2089 6068620 "US1567821046" "US24703L2025" "Cerner" "Dell EMC"
          20170306 20190306 "SUPPL-100090621" "SUPPLIER" 2254 6068620 "US1718715022" "US24703L2025" "Cincinnati Bell" "Dell Technologies"
          20161104 20170306 "SUPPL-100090621" "SUPPLIER" 2254 6068620 "US1718715022" "US24703L2025" "Cincinnati Bell" "Dell Technologies"
          20161104 20170609 "SUPPL-100090629" "SUPPLIER" 1625707 6068620 "GB00BV9FP302" "US24703L2025" "Computacenter" "RSA Security"
          20161104 20190712 "SUPPL-100090631" "SUPPLIER" 1625707 6068620 "GB00BV9FP302" "US24703L2025" "Computacenter" "VCE"
          20161104 20161107 "SUPPL-100090635" "SUPPLIER" 194513391 6068620 "AU000000DTL4" "US24703L2025" "Data3" "EMC"
          20161104 20190710 "SUPPL-100090657" "SUPPLIER" 194512681 6068620 "KYG369921088" "US24703L2025" "Futong Technology Development" "EMC"
          20161104 20170905 "SUPPL-100090669" "SUPPLIER" 22442146 6068620 "JP3217200009" "US24703L2025" "Kanematsu Electronics" "EMC"



          Customers:

          input long(start_ end_) str29 id str15 rel_type long(source_company_id target_company_id) str12(source_isin target_isin) strL(source_company_keyword target_company_keyword)
          20161103 20180524 "CUST-100000247" "CUSTOMER" 194505765 21997399 "SG1Y44946602" "ID1000111305" "Seroja Investments" "Maritim Barito Perkasa"
          20161103 20171127 "CUST-100002127" "CUSTOMER" 22443458 6657741 "NGOANDO00002" "" "OG&P" "Bua Group"
          20161103 20180728 "CUST-100006986" "CUSTOMER" 75788601 15402 "AU000000QMS6" "" "qms media" "Transport Accident Commission"
          20161103 20170922 "CUST-100009372" "CUSTOMER" 3087 100009044 "US2477481061" "" "Delta Natural Gas" "Greystone"
          20161103 20181206 "CUST-100009646" "CUSTOMER" 1919323 241907 "VGG456671053" "JP3900000005" "HollySys Automation Technologies" "Mitsubishi Heavy Industries"
          20060531 20070530 "CUST-10001-1604308" "CUSTOMER" 10001 1604308 "US8790801091" "" "" ""
          20051109 20060531 "CUST-10001-1604308" "CUSTOMER" 10001 1604308 "US8790801091" "" "" ""
          20180202 20181127 "CUST-100010247" "CUSTOMER" 75788601 1835724 "AU000000QMS6" "KR7005930003" "qms media" "Samsung Electronics"
          20161103 20180131 "CUST-100012134" "CUSTOMER" 75788601 4158 "AU000000QMS6" "US3453708600" "qms media" "Ford"
          20161103 20161115 "CUST-100012244" "CUSTOMER" 7954 15532 "US71361F1003" "JP3657400002" "Perceptron" "Nikon Metrology"
          20180110 20200929 "CUST-100012266" "CUSTOMER" 7954 16011792 "US71361F1003" "" "Perceptron" "Zeiss"
          20161115 20180110 "CUST-100012266" "CUSTOMER" 7954 16011792 "US71361F1003" "" "Perceptron" "Zeiss"
          20180202 20181127 "CUST-100012641" "CUSTOMER" 75788601 10916 "AU000000QMS6" "GB00BH4HKS39" "qms media" "Vodafone"
          20161103 20161117 "CUST-100013457" "CUSTOMER" 22246107 93427701 "AU000000EGO3" "US4432011082" "Empire Oil & Gas" "Alcoa of Australia"
          20161103 20200930 "CUST-100014673" "CUSTOMER" 22434273 22433940 "INE059B01024" "INE019A01038" "Simplex Infrastructures" "JSW Ispat Steel"
          20161103 20180119 "CUST-100016375" "CUSTOMER" 1919323 1880745 "VGG456671053" "CNE1000002R0" "HollySys Automation Technologies" "Shenhua Information"
          20161103 20181212 "CUST-100016755" "CUSTOMER" 194500869 86521585 "PK0024501014" "" "Attock Refinery" "Attock Information Technology Services"
          20050608 20050712 "CUST-10002-1253" "CUSTOMER" 10002 1253 "" "CA05534B7604" "" ""
          20050608 20060622 "CUST-10002-1290" "CUSTOMER" 10002 1290 "" "" "" ""
          20040519 20070531 "CUST-10002-1418691" "CUSTOMER" 10002 1418691 "" "" "" ""
          20060622 20080416 "CUST-10002-317" "CUSTOMER" 10002 317 "" "FR0000130007" "" ""
          20030403 20050712 "CUST-10002-4205" "CUSTOMER" 10002 4205 "" "FR0000133308" "" ""
          20161103 20171027 "CUST-100020273" "CUSTOMER" 22450519 1478913 "INE767A01016" "FR0000120578" "Aarti Drugs" "Sanofi-Aventis"
          20161103 20191018 "CUST-100024822" "CUSTOMER" 16049343 218152 "GB00BKF1YD83" "GB00BLGZ9862" "eagle eye solutions group plc" "Tesco PLC"
          20161103 20201023 "CUST-100024822" "CUSTOMER" 16049343 218152 "GB00BKF1YD83" "GB00BLGZ9862" "eagle eye solutions group plc" "Tesco PLC"
          20050712 20080401 "CUST-10004-20685" "CUSTOMER" 10004 20685 "US8789192080" "" "Teknowledge" ""
          20050504 20060602 "CUST-10004-5123" "CUSTOMER" 10004 5123 "US8789192080" "GB0005405286" "Teknowledge" ""
          20050712 20060602 "CUST-10004-6413" "CUSTOMER" 10004 6413 "US8789192080" "" "" ""
          20050504 20050712 "CUST-10004-6413" "CUSTOMER" 10004 6413 "US8789192080" "" "" ""
          20050504 20060602 "CUST-10004-8992" "CUSTOMER" 10004 8992 "US8789192080" "" "" ""
          20030518 20050712 "CUST-10004-9855" "CUSTOMER" 10004 9855 "US8789192080" "" "" ""
          20161104 20200925 "CUST-100048599" "CUSTOMER" 1922595 9612493 "FR0014004QZ9" "" "Dietswell Engineering" "GSPS"
          20060909 20071120 "CUST-10005-10916" "CUSTOMER" 10005 10916 "" "GB00BH4HKS39" "Tektronix" "Vodafone"
          20041006 20050928 "CUST-10005-1602" "CUSTOMER" 10005 1602 "" "GB0030913577" "Tektronix" "BT"
          20060909 20070925 "CUST-10005-1602" "CUSTOMER" 10005 1602 "" "GB0030913577" "Tektronix" "BT"

          Comment


          • #6
            OK, so this should do it:
            Code:
            use customer_data_set, clear
            by source_company_id target_company_id (start_ end_), sort: gen seq = _n
            tempfile holding
            save `holding'
            
            use supplier_data_set, clear
            by source_company_id target_company_id (start_ end_), sort: gen seq = _n
            merge 1:1 source_company_id target_company_id seq using `holding'
            
            //  AT THIS POINT THE DATA IN MEMORY IS AN UPDATED SUPPLIERS DATA SET THAT INCLUDES
            //  ANY NEW RELATIONSHIPS FOUND IN THE CUSTOMERS DATA SET BUT NOT ORIGINALLY IN
            //  THE SUPPLIERS DATA SET.  SO YOU MIGHT WANT TO SAVE IT.
            
            //  TO NOW REDUCE IT TO JUST THOSE OBSERVATIONS WHICH WERE ADDED TO THE SUPPLIERS
            //  DATA SET RUN:
            keep if _merge == 2
            I'm not sure what you mean by subsidiary variables "not mentioned in the list." If you mean that they are in your data set but you chose not include them in the data example, then I would say that you should skip the commands that create the seq variables, and in the -merge- command replace seq by the variable(s) that identifies(y) the subsidiaries instead. The purpose of the seq variable is just to ensure that any additional mentions of the same relationship were brought in to the supplier data base.

            Since we are not actually using the start_ and end_ variables here, I did not bother with any code to convert them to Stata internal format date variables. If you will be using these dates for any kind of calculations, you should do that yourself. Any attempted calculations with those variables in their current form will lead to trouble.
            Last edited by Clyde Schechter; 12 Dec 2022, 23:21.

            Comment


            • #7
              Perfectly worked. Thanks a lot.

              Comment

              Working...
              X