Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • merge command: How to do a merge with a date that is in-between two other dates from another dataset?

    Hi everyone,

    I need some help with the merging process. Basically, I have two datasets.
    (1) One includes the households ID, the contracts ID, and the starting and ending of each contract, in date %td format. Let's call it dataset1.

    (2) The other one is a monthly file with only households ID, and the date of electricity consumption, also in date %td format. Let's call it dataset2, for simplicity. Those datasets are monthly files that go from July 2021 to July 2023. For the moment, I just want to look in-depth at May and June 2021.
    Later, it will be necessary to extend this analysis to other months. The name of these files are always the same, i.e. "export_telemedida_yyyymm.dta"

    I want to merge as follows. I want to merge all the variables of the dataset1 into the dataset2:
    • If the date of electricity consumption of the dataset2 is in-between the start and the end of the contract, the merge should operate.
    Here are two dataex:

    Dataset1:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id double(date_contract_start date_contract_end) long idcontrato
    1001 18887 21700    1001
    1001 21701 22431  451697
    1001 22432 22645 1236132
    1001 22646 22676 1730454
    1001 22677 22735 2082075
    end
    format %td date_contract_start
    format %td date_contract_end
    Dataset2:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id double date
    1001 22281
    1001 22282
    1001 22283
    1001 22284
    1001 22285
    end
    format %td date

    Could anyone please give me advices?
    I have been stuck for a while...

    Many thanks in advance.
    Best,

    Michael

    Last edited by Michael Duarte Goncalves; 16 Nov 2023, 07:32.

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id double date
    1001 22281
    1001 22282
    1001 22283
    1001 22284
    1001 22285
    end
    format %td date
    tempfile using
    save `using'
    
    
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id double(date_contract_start date_contract_end) long idcontrato
    1001 18887 21700    1001
    1001 21701 22431  451697
    1001 22432 22645 1236132
    1001 22646 22676 1730454
    1001 22677 22735 2082075
    end
    format %td date_contract_start
    format %td date_contract_end
    
    joinby id using `using'
    keep if inrange(date, date_contract_start, date_contract_end)
    Res.:

    Code:
    . l, sepby(id)
    
         +-----------------------------------------------------+
         |   id   date_co~t   date_co~d   idcont~o        date |
         |-----------------------------------------------------|
      1. | 1001   01jun2019   31may2021     451697   05jan2021 |
      2. | 1001   01jun2019   31may2021     451697   04jan2021 |
      3. | 1001   01jun2019   31may2021     451697   03jan2021 |
      4. | 1001   01jun2019   31may2021     451697   01jan2021 |
      5. | 1001   01jun2019   31may2021     451697   02jan2021 |
         +-----------------------------------------------------+

    Comment


    • #3
      Hi Andrew Musau,

      I tried and works nicely! Thank you so much for this beautiful suggestion.
      Lovely day.

      Best,
      Michael

      Comment

      Working...
      X