Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Missing Values From Lagged Variables

    Good Afternoon,

    I attempted to create a new 1-lag variable. I succeeded, however the new variable contains numerous missing values and gaps. Can anyone advise?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(deviationfromaverage for10_deviationfromaverage)
             .          .
         -3.19          .
         3.655      -3.19
      25.05667      3.655
       14.5425          .
     4.5140004    14.5425
      9.951667  4.5140004
         23.28   9.951667
          19.8      23.28
         15.04          .
        17.596      15.04
     17.246365     17.596
     13.559168  17.246365
     14.956153  13.559168
     12.947857          .
     2.0246668  12.947857
     12.028126  2.0246668
      17.19059          .
     12.105555          .
      11.28842  12.105555
        11.724   11.28842
    -2.0114992          .
     -4.651665          .
      8.343159  -4.651665
      9.365998   8.343159
      8.319001   9.365998
        27.036   8.319001
      25.46889          .
      29.80842   25.46889
        27.508   29.80842
        21.585     27.508
     18.501001     21.585
     15.225554          .
     16.174212  15.225554
      20.65579  16.174212
     12.868947   20.65579
      5.702222          .
     11.092105   5.702222
     -5.462499  11.092105
     -9.178498  -5.462499
     -5.430999  -9.178498
     -6.068946          .
     1.1044981  -6.068946
     1.0519049  1.1044981
     -8.238095  1.0519049
     1.5066677  -8.238095
      2.359474          .
    -2.6984994   2.359474
      9.439999 -2.6984994
     8.2047615   9.439999
      13.02762  8.2047615
     13.314736          .
        12.609  13.314736
      12.63857     12.609
          7.07   12.63857
      12.04421          .
     13.882002   12.04421
     15.780953  13.882002
     12.751904  15.780953
     15.024763  12.751904
     10.166843          .
     4.6884966  10.166843
     3.7152376  4.6884966
       5.16714  3.7152376
         24.17    5.16714
      25.21211          .
     25.761503   25.21211
     22.974766  25.761503
     21.550955  22.974766
     19.037146  21.550955
      14.89737          .
     21.532503   14.89737
     20.067144  21.532503
     14.965004  20.067144
        12.395  14.965004
     15.245503          .
      15.45952  15.245503
     15.936815   15.45952
     13.708185  15.936815
     3.5645444  13.708185
     -2.966001          .
    -3.8947656  -2.966001
     2.0422745 -3.8947656
     -5.029996  2.0422745
    -32.546818  -5.029996
       -37.346          .
     -30.43762    -37.346
     -28.68409  -30.43762
     -26.93818  -28.68409
    -26.796316          .
    -35.396503 -26.796316
     -45.02095 -35.396503
     -42.09809  -45.02095
     -30.77857  -42.09809
     -30.82263          .
    -34.281498  -30.82263
     -40.20905 -34.281498
    -31.298094  -40.20905
     -41.35714 -31.298094
    -33.602634          .
    end

    Here is my do-file (go straight to the end)

    Code:
    *Eurodollar Futures Analysis
    
    
    clear // clear all
    version 15 // uses Stata version 15
    set more off // runs commands at one go
    capture log close //closes any preexisting logs
    log using eurodollarprojectlog , replace // creates new log
    
    
    import excel "C:\stata\fedhistoricaldata.xls", firstrow
    
    duplicates list Date
    count if missing(Date)
    list if missing(Date)
    
    //Should I use gen double Date1 = clock(Date, "DMY")?
    drop in 3871/3883
    
    
    save fedhistoricaldata , replace
    clear
    
    import excel "C:\stata\bundlefutureshistoricalpricing.xlsx", firstrow 
    
    duplicates list Date
    count if missing(Date)
    list if missing(Date)
    
    drop in 2685/2698
    
    
    
    
    save  "C:\stata\bundlefutureshistoricalpricing.dta", replace
    
    
    // drop if Date == . // deletes missing variables
    
    
    
    
    
    merge 1:1 Date using fedhistoricaldata
    
    
    * delete missing values
    
    drop if stBU2 == . 
    
    *Start the analysis
    
    gen cumbasispoint=sum(stBU2)
    gen tendaybasispointreturn =cumbasispoint-cumbasispoint[_n-10]
    
    *Generate volatility of daily returns
    
    gen dayvol = stBU2 ^ 2
    
    *Set TSSET
    
    gen assetclass = 1
    
    tsset assetclass  Date , daily
    
    
    *Partial Autocorrelation graph
    
    pac dayvol, lag(10)
    
    *Create daily lagged variables
    
    foreach i of num 1/4 {
    
    gen lag`i'_dayvol = l`i'.dayvol
    
    
    
    }
    drop if stBU2 == . 
    
    regress dayvol lag1_dayvol lag2_dayvol lag3_dayvol lag4_dayvol, robust
    
    * create variable to measure deviation from reference point
    
    tssmooth  ma  averageind =sum(stBU2), window(30 0 0) // 30 period window
    
    gen deviationfromaverage = sum(stBU2) - averageind
    
    gen for10_deviationfromaverage = L.deviationfromaverage // create new variable

  • #2
    Your sample data does not include your Date variable, and my guess is that your data include only weekdays, not weekends, so the lagged value for Monday would be the non-existent value from Sunday, rather than the value from the previous Friday.

    Comment


    • #3
      Hi William,

      You're right! How could I not have thought of that? I'm actually using stock exchange data and it does leave out weekend data.

      Do you know how I can make Stata take the value on Friday instead of the value on Sunday instead? I'm not sure why Stata behaves like this because the lagged value is supposed to be that of the preceding row, isn't it?

      Linus

      Comment


      • #4
        Further to your advice, I tried to apply a tip on the Stata website on how the program can be coached to ignore weekend dates. Unfortunately, I still seem to be facing some difficulty with the below mentioned code not really working.

        Code:
        * "Converting daily financial data into business calendar dates"
        version 15
        dateformat dmy
        range 12feb2003 08jan2014
        centerdate 01jan2008
        omit dayofweek (Sa Su)
        The whole do. file:

        Code:
        *Eurodollar Futures Analysis
        
        
        clear // clear all
        version 15 // uses Stata version 15
        set more off // runs commands at one go
        capture log close //closes any preexisting logs
        log using eurodollarprojectlog , replace // creates new log
        
        
        import excel "C:\stata\fedhistoricaldata.xls", firstrow
        
        duplicates list Date
        count if missing(Date)
        list if missing(Date)
        
        //Should I use gen double Date1 = clock(Date, "DMY")?
        drop in 3871/3883
        
        
        save fedhistoricaldata , replace
        clear
        
        import excel "C:\stata\bundlefutureshistoricalpricing.xlsx", firstrow 
        
        duplicates list Date
        count if missing(Date)
        list if missing(Date)
        
        drop in 2685/2698
        
        
        
        
        save  "C:\stata\bundlefutureshistoricalpricing.dta", replace
        
        
        // drop if Date == . // deletes missing variables
        
        
        
        
        
        merge 1:1 Date using fedhistoricaldata
        
        
        * "Converting daily financial data into business calendar dates"
        version 15
        dateformat dmy
        range 12feb2003 08jan2014
        centerdate 01jan2008
        omit dayofweek (Sa Su)
        
        
        
        * delete missing values
        
        drop if stBU2 == . 
        
        *Start the analysis
        
        gen cumbasispoint=sum(stBU2)
        gen tendaybasispointreturn =cumbasispoint-cumbasispoint[_n-10]
        
        *Generate volatility of daily returns
        
        gen dayvol = stBU2 ^ 2
        
        *Set TSSET
        
        gen assetclass = 1
        
        tsset assetclass  Date , daily
        
        
        *Partial Autocorrelation graph
        
        pac dayvol, lag(10)
        
        *Create daily lagged variables
        
        foreach i of num 1/4 {
        
        gen lag`i'_dayvol = l`i'.dayvol
        
        
        
        }
        drop if stBU2 == . 
        
        regress dayvol lag1_dayvol lag2_dayvol lag3_dayvol lag4_dayvol, robust
        
        * create variable to measure deviation from reference point
        
        tssmooth  ma  averageind =sum(stBU2), window(30 0 0) // 30 period window
        
        gen deviationfromaverage = sum(stBU2) - averageind
        
        gen for10_deviationfromaverage = L.deviationfromaverage // create new variable

        Comment


        • #5
          The error message is

          . dateformat dmy
          command dateformat is unrecognized

          Comment


          • #6
            The Date variable:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int Date
            16040
            16041
            16042
            16043
            16044
            16047
            16048
            16049
            16050
            16051
            16054
            16055
            16056
            16057
            16058
            16061
            16062
            16063
            16065
            16068
            16069
            16070
            16072
            16075
            16076
            16077
            16078
            16079
            16082
            16083
            16084
            16085
            16086
            16090
            16091
            16092
            16093
            16096
            16097
            16098
            16099
            16100
            16103
            16104
            16105
            16106
            16107
            16110
            16111
            16112
            16113
            16114
            16118
            16119
            16120
            16121
            16124
            16125
            16126
            16127
            16128
            16131
            16132
            16133
            16134
            16135
            16138
            16139
            16140
            16141
            16142
            16145
            16146
            16147
            16148
            16149
            16152
            16153
            16154
            16155
            16156
            16159
            16160
            16161
            16162
            16163
            16166
            16167
            16168
            16169
            16173
            16174
            16175
            16176
            16177
            16180
            16181
            16182
            16183
            16184
            end
            format %tdnn/dd/CCYY Date

            Comment


            • #7
              This is the site which I referred to: https://blog.stata.com/2016/02/04/ha...ess-calendars/

              Comment


              • #8
                You misunderstand the instructions in the blog post. The commands described, including the dateformat command, are to be used to create a business calendar file saved with the .stbcal suffix. They are not meant to be run in your do-file; there you include the code given in the FAQ section "Using a Business Calendar".

                Comment

                Working...
                X