Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Split dataset after every second variable

    I have a dataset consisting of different exchange rate time series, each starting at different date. First always comes the date column and then the respective exchange rate column. I want to save each exchange rate with its date column as a separate dataset, so I want to separate the dataset after every second column. The names of the date columns might Change, so I would prefer not to use them, but the structure will always be the same.

    The goal is to then create one continuous date variable and merge the individual exchange rates back together using the date variable to create one dataset with all exchange rates ordered by a single date variable. I guess it shouldnt be hard to do, but I found no concise way to do it.

    Input:
    input str15 date1 AED_USD str15 date2 AFN_USD str15 date3 BEL_USD
    "01.01.1999" 0.8 "01.03.1990" 0.005 "01.01.1970" 1.5
    "02.01.1999" 0.81 "02.03.1990" 0.004 "02.01.1970" 1.5
    "03.01.1999" 0.82 "03.03.1990" 0.005 "03.01.1970" 1.5
    "04.01.1999" 0.79 "04.03.1990" 0.005 "04.01.1970" 1.5
    end
    list

  • #2
    The following example doesn't directly address your question, but perhaps will start you toward your ultimate objective.
    Code:
    clear
    input str15 date1 AED_USD str15 date2 AFN_USD str15 date3 BEL_USD
    "01.01.1999" 0.8 "01.03.1990" 0.005 "01.01.1970" 1.5
    "02.01.1999" 0.81 "02.03.1990" 0.004 "02.01.1970" 1.5
    "03.01.1999" 0.82 "03.03.1990" 0.005 "03.01.1970" 1.5
    "04.01.1999" 0.79 "04.03.1990" 0.005 "04.01.1970" 1.5
    end
    
    unab vars : *
    tokenize `vars'
    while "`1'" != "" {
        display "`1' `2'"
        rename `1' d_`2'
        rename `2' r_`2'
        macro shift 2
    }
    
    generate id = _n
    reshape long d_ r_, i(id) j(exch) string
    drop id
    generate date = daily(d_,"MDY")
    format date %td
    drop d_
    rename r_ rate
    order date exch rate
    sort exch date rate
    
    list, sepby(exch) noobs
    Code:
    . list, sepby(exch) noobs
    
      +----------------------------+
      |      date      exch   rate |
      |----------------------------|
      | 01jan1999   AED_USD     .8 |
      | 01feb1999   AED_USD    .81 |
      | 01mar1999   AED_USD    .82 |
      | 01apr1999   AED_USD    .79 |
      |----------------------------|
      | 03jan1990   AFN_USD   .005 |
      | 03feb1990   AFN_USD   .004 |
      | 03mar1990   AFN_USD   .005 |
      | 03apr1990   AFN_USD   .005 |
      |----------------------------|
      | 01jan1970   BEL_USD    1.5 |
      | 01feb1970   BEL_USD    1.5 |
      | 01mar1970   BEL_USD    1.5 |
      | 01apr1970   BEL_USD    1.5 |
      +----------------------------+

    Comment


    • #3




      Thanks for the clear explanation.

      This looks like a fairly standard reshape problem, but for some fooling around with variable names to make them consistent, and that in turn is written up within https://www.stata.com/support/faqs/d...-with-reshape/

      There may be a shorter way than this, but no choreography with different datasets is obviously needed.

      Code:
      clear
      input str15 date1 AED_USD str15 date2 AFN_USD str15 date3 BEL_USD
      "01.01.1999" 0.8 "01.03.1990" 0.005 "01.01.1970" 1.5
      "02.01.1999" 0.81 "02.03.1990" 0.004 "02.01.1970" 1.5
      "03.01.1999" 0.82 "03.03.1990" 0.005 "03.01.1970" 1.5
      "04.01.1999" 0.79 "04.03.1990" 0.005 "04.01.1970" 1.5
      end
      
      gen long obs = _n
      
      unab curr : *_USD
      local curr : subinstr local curr "_USD" "", all
      
      unab which : date*
      local which : subinstr local which "date" "", all
      
      foreach w of local which {
          gettoken this curr : curr
          rename date`w' date`this'
          rename `this'_USD USD`this'
      }
      
      reshape long date USD, i(obs) j(currency) string
      drop obs
      gen ndate = daily(date, "DMY")
      format ndate %td
      rename USD rate
      sort currency date
      list, sepby(currency)
      
           +------------------------------------------+
           | currency         date   rate       ndate |
           |------------------------------------------|
        1. |      AED   01.01.1999     .8   01jan1999 |
        2. |      AED   02.01.1999    .81   02jan1999 |
        3. |      AED   03.01.1999    .82   03jan1999 |
        4. |      AED   04.01.1999    .79   04jan1999 |
           |------------------------------------------|
        5. |      AFN   01.03.1990   .005   01mar1990 |
        6. |      AFN   02.03.1990   .004   02mar1990 |
        7. |      AFN   03.03.1990   .005   03mar1990 |
        8. |      AFN   04.03.1990   .005   04mar1990 |
           |------------------------------------------|
        9. |      BEL   01.01.1970    1.5   01jan1970 |
       10. |      BEL   02.01.1970    1.5   02jan1970 |
       11. |      BEL   03.01.1970    1.5   03jan1970 |
       12. |      BEL   04.01.1970    1.5   04jan1970 |
           +------------------------------------------+
          
      reshape wide rate, i(ndate) j(currency) string
      
      list, sep(4)
      
           +------------------------------------------------------+
           |     ndate   rateAED   rateAFN   rateBEL         date |
           |------------------------------------------------------|
        1. | 01jan1970         .         .       1.5   01.01.1970 |
        2. | 02jan1970         .         .       1.5   02.01.1970 |
        3. | 03jan1970         .         .       1.5   03.01.1970 |
        4. | 04jan1970         .         .       1.5   04.01.1970 |
           |------------------------------------------------------|
        5. | 01mar1990         .      .005         .   01.03.1990 |
        6. | 02mar1990         .      .004         .   02.03.1990 |
        7. | 03mar1990         .      .005         .   03.03.1990 |
        8. | 04mar1990         .      .005         .   04.03.1990 |
           |------------------------------------------------------|
        9. | 01jan1999        .8         .         .   01.01.1999 |
       10. | 02jan1999       .81         .         .   02.01.1999 |
       11. | 03jan1999       .82         .         .   03.01.1999 |
       12. | 04jan1999       .79         .         .   04.01.1999 |
           +------------------------------------------------------+


      .EDIT: Didn't see @WIlliam Lisowski's #2 before posting this. His approach to renaming is much neater than mine.
      .
      .
      Last edited by Nick Cox; 04 Apr 2019, 09:38.

      Comment


      • #4
        Thanks a lot, both approaches worked !

        Comment

        Working...
        X