Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to perform a merge when the date variable in the "using" dataset is between two dates in the "master" dataset?

    Hi everyone,

    I would like to merge two dataset, but with a special condition:
    1. The master dataset contains the start and the end of contract, in date format
    2. The using dataset contains hourly consumption date, for each day of a given month.
    My question is:
    • How could I perform a merge in this case?
    • Is -rangejoin- better to use in these cases?
    Here is a small -dataex- from my master:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id double(date_contract_start date_contract_end) long idcontrato str5 tariff_ekon_id str13 product_classification
    1001 18887 21700 1001 "20A" "Clasico"
    1001 21701 22431 451697 "20DHA" "Clasico"
    1001 22432 22645 1236132 "20TD" "Clasico"
    1001 22646 22676 1730454 "20TD" "Clasico"
    1001 22677 22735 2082075 "20TD" "Clasico"
    end
    format %td date_contract_start
    format %td date_contract_end

    And here is a small -dataex- from my using for January 2021:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id float(date_elec_consumption h_0 h_1 h_2 h_3 h_4 h_5 h_6 h_7 h_8 h_9 h_10 h_11 h_12 h_13 h_14 h_15 h_16 h_17 h_18 h_19 h_20 h_21 h_22 h_23)
    1001 22281 .179 .113 .149 .115 .179 .114 .148 .115 .179 .113 .146 .113 .171 .117 .137 .125 .159 .134 .125 .136 .137 .157 .113 .148
    1001 22282 .115 .181 .114 .149 .115 .168 .131 .127 .134 .125 .166 .111 .142  .11 .175 .114 .148 .113 .143 .138 .146 .112 .146 .113
    end
    format %td date_elec_consumption
    So, for ID "1001", I would like to add at the end of my using (after -h_23-), the variables -product_classification- , -idcontrato- and -tariff_ekon_id-.
    How could I operate that, please?

    If I am not mistaken, the order in -rangejoin- (i.e. what dataset I use as a master/using) can affect the final output.

    Thank you for your help!
    Best,
    Last edited by Michael Duarte Goncalves; 20 Dec 2023, 07:58.

  • #2
    Using rangejoin (from SSC), I believe you want:

    Code:
    use master
    rangejoin date_elec_consumption date_contract_start date_contract_end using using , by(id)

    Comment


    • #3
      Thank you so much for your answer, daniel klein.

      Perhaps one more question, please:
      • Will the result be the same if I reverse the master / using, or not?
      Thank you for your help and time!

      Comment


      • #4
        I am not using rangejoin a lot. The help tells us

        low or high can be specified using a numeric variable in the data in memory.
        which seems to imply that you cannot reverse the order of the datasets because you do not have date_contract_start and date_contract_end in your using data.

        However, there is no need to guess. Just try it yourself.

        Comment


        • #5
          Hi daniel klein,

          Thank you very much for your help and time. I will have a look at -help rangejoin-.
          Lovely day.

          All the best,
          Michael

          Comment

          Working...
          X