Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Allocating monthly observation to daily data

    Hello,

    I am trying to allocate an observation which is recorded at the last day of a month to a dataset which contains daily observations in a way such that the observation which is recorded at 29jan1993 is allocated to all daily observations I have in jan1993 in my second dataset.
    My monthly dataset as well as my daily dataset range both from January 1993 to December 2014.
    The second dataset contains different stocks, therefore each date is listed several times in the second dataset.

    The monthly dataset looks like this:

    date Sentiment
    29jan1993 0,72
    26feb1993 0,83
    31mar1993 -0,88
    ...
    31dec2014. 0,01

    The daily dataset looks like this:

    date var1 var2 var3 var4
    04jan1993
    04jan1993
    04jan1993
    04jan1993
    04jan1993
    05jan1993
    05jan1993
    ​​​​​​​05jan1993
    ​​​​​​​05jan1993
    ​​​​​​​05jan1993
    ...
    31dec2014
    ​​​​​​​31dec2014
    ​​​​​​​31dec2014
    ​​​​​​​31dec2014
    ​​​​​​​31dec2014

    My question how it is possible to transfer the observations from one dataset to another and how to allocate the one observation per month in the first dataset to all daily observations I have during a month in the second dataset.

    Thank you very much!

  • #2
    In short, what you want to do are to (a) create Stata monthly dates in each of your datasets and (b) merge your datasets matching observations on month.

    Before showing the technique, let me offer some advice that will help you understand a key concept. Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

    I'll also add, if you are not familiar with merging data, reviewing the output of help merge will also give you important background to this code.

    With that said, here is some sample code. In the first part I create two temporary datasets that stand in for your actual datasets. That is, this is just to set up data to use in the second part of my code, which does the actual work.
    Code:
    // create datasets with example data
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float date byte(var1 var2 var3 var4)
    12057  6 6  5  4
    12057  5 7  3  6
    12057  3 2  5  3
    12057  1 8  4  6
    12057  6 2  1  3
    12098  2 6  6  5
    12098  6 1 10 10
    12098  9 8  6  2
    12098  4 3  3  7
    12098 10 8  5  1
    end
    format %td date
    tempfile master
    save `master'
    
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(date sentiment)
    12082  .72
    12110  .83
    12143 -.88
    end
    format %td date
    tempfile sent
    save `sent'
    
    // the work starts here
    use `sent', clear
    generate month = mofd(date)
    format month %tm
    list, clean
    save `sent', replace
    
    use `master', clear
    generate month = mofd(date)
    format month %tm
    merge m:1 month using `sent'
    list, clean
    Here is the output when this is run.
    Code:
    . use `sent', clear
    
    . generate month = mofd(date)
    
    . format month %tm
    
    . list, clean
    
                date   sentim~t    month  
      1.   29jan1993        .72   1993m1  
      2.   26feb1993        .83   1993m2  
      3.   31mar1993       -.88   1993m3  
    
    . save `sent', replace
    file /var/folders/xr/lm5ccr996k7dspxs35yqzyt80000gp/T//S_34312.000002 saved
    
    . 
    . use `master', clear
    
    . generate month = mofd(date)
    
    . format month %tm
    
    . merge m:1 month using `sent'
    
        Result                           # of obs.
        -----------------------------------------
        not matched                             1
            from master                         0  (_merge==1)
            from using                          1  (_merge==2)
    
        matched                                10  (_merge==3)
        -----------------------------------------
    
    . list, clean
    
                date   var1   var2   var3   var4    month   sentim~t           _merge  
      1.   04jan1993      6      6      5      4   1993m1        .72      matched (3)  
      2.   04jan1993      5      7      3      6   1993m1        .72      matched (3)  
      3.   04jan1993      3      2      5      3   1993m1        .72      matched (3)  
      4.   04jan1993      1      8      4      6   1993m1        .72      matched (3)  
      5.   04jan1993      6      2      1      3   1993m1        .72      matched (3)  
      6.   14feb1993      2      6      6      5   1993m2        .83      matched (3)  
      7.   14feb1993      6      1     10     10   1993m2        .83      matched (3)  
      8.   14feb1993      9      8      6      2   1993m2        .83      matched (3)  
      9.   14feb1993      4      3      3      7   1993m2        .83      matched (3)  
     10.   14feb1993     10      8      5      1   1993m2        .83      matched (3)  
     11.   31mar1993      .      .      .      .   1993m3       -.88   using only (2)

    Comment


    • #3
      Hello William,

      thanks for your explanations they helped me a lot and I was able to solve the problem.

      Comment

      Working...
      X