  • rangestat and the implementation of lagged variables

    Dear all,

    I am facing a problem using rangestat and the implementation of lagged variables.
    Please see below an extract of my dataset. It includes the company code (isin), the date, and dm (a monthly date variable). dailyreturn is the daily return per company and marketreturn is the daily market return.
    I want to run two regressions for which I want to obtain the R^2 of each regression, as well as the sum of the absolute values of the coefficients and the standard errors. The regressions are run per company per month.

    First I group company codes and months to be able to run regression per company per month:
    egen cus=group(isin dm)

    Then I run the regression I want:
    rangestat (reg) dailyreturn marketreturn , interval(dm 0 0) by(cus)

    I think this one works fine (not sure), but the problem lies in the second regression.
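
    One way to check is to compare a single company-month against plain regress. This is only a sketch: it assumes rangestat (reg) stored its results under its usual names (reg_nobs, reg_r2, b_marketreturn, b_cons, se_marketreturn, se_cons), and group 4 is just one group taken from the data extract below.

    Code:
    * run after the rangestat call above; cus == 4 is one company-month from the extract
    regress dailyreturn marketreturn if cus == 4

    * the stored results are constant within a group, so min and max should agree
    * with each other and with e(N), e(r2), _b[] and _se[] from regress
    tabstat reg_nobs reg_r2 b_marketreturn b_cons se_marketreturn se_cons if cus == 4, statistics(min max)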

    In the second regression I want to include up to five lagged days for the market return. The regression should look like this:
    π‘Ÿπ‘—,𝑑=π‘Žπ‘—+π›½π‘—π‘…π‘š,𝑑+π‘…π‘š,π‘‘βˆ’1+π‘…π‘š,π‘‘βˆ’2+π‘…π‘š,π‘‘βˆ’3+π‘…π‘š,π‘‘βˆ’4+π‘…π‘š,π‘‘βˆ’5+ πœ€π‘—,𝑑

    So I create:
    bysort cus: gen lag1 = marketreturn[_n-1]
    bysort cus: gen lag2 = marketreturn[_n-2]
    bysort cus: gen lag3 = marketreturn[_n-3]
    bysort cus: gen lag4 = marketreturn[_n-4]
    bysort cus: gen lag5 = marketreturn[_n-5]

    Next, I run the following regression:

    rangestat (reg) dailyreturn marketreturn lag1 lag2 lag3 lag4 lag5 , interval(dm 0 0) by(cus)

    This reduces the number of observations, which I did not expect, and the R-squared in this model is sometimes lower than in the first, which should not be possible.
    Does anyone know a solution for this?

    Kind regards,
    Philip

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str12 isin float(date dailyreturn marketreturn dm cus)
    "AU000000AGL7" 16477   -.01261533    .000595743 541 3
    "AU000000AGL7" 16478 -.0014199505 -.00010046188 541 3
    "AU000000AGL7" 16481  -.013514671   -.005666091 541 3
    "AU000000AGL7" 16482  .0028546855  -.0013095372 541 3
    "AU000000AGL7" 16483   .007835613    .001578935 541 3
    "AU000000AGL7" 16484    -.0135641    .003574669 541 3
    "AU000000AGL7" 16485   .012851356  -.0008343064 541 3
    "AU000000AGL7" 16488   .010584725   .0011386069 541 3
    "AU000000AGL7" 16489 -.0014119764   -.003667351 541 3
    "AU000000AGL7" 16490  -.006335591   -.008238011 541 3
    "AU000000AGL7" 16491  -.013535118   -.005374756 541 3
    "AU000000AGL7" 16492   -.01008541    .008050131 541 3
    "AU000000AGL7" 16495   .031368095    .010065738 541 3
    "AU000000AGL7" 16496   .002796667   .0028134254 542 4
    "AU000000AGL7" 16497  .0020976982    .001524116 542 4
    "AU000000AGL7" 16498   .006964237   .0031767096 542 4
    "AU000000AGL7" 16499   .002763744    .003583379 542 4
    "AU000000AGL7" 16502   -.03089478   .0042426297 542 4
    "AU000000AGL7" 16503   .007101903    .001182646 542 4
    "AU000000AGL7" 16504   .001422966  -.0001883758 542 4
    "AU000000AGL7" 16505  -.002837157   -.006134503 542 4
    "AU000000AGL7" 16506  -.006405033   -.002806948 542 4
    "AU000000AGL7" 16509    .02397743    .007013269 542 4
    "AU000000AGL7" 16510  -.003487733  -.0018562608 542 4
    "AU000000AGL7" 16511   .005576667   .0044506625 542 4
    "AU000000AGL7" 16512  .0013866764  -.0028893785 542 4
    "AU000000AGL7" 16513   .003474114    .003830276 542 4
    "AU000000AGL7" 16516  .0006849496    .004003018 542 4
    "AU000000AGL7" 16517  -.002763744   -.007644499 542 4
    "AU000000AGL7" 16518  -.017486261   -.014975418 542 4
    "AU000000AGL7" 16519  .0042156815   -.008481762 542 4
    "AU000000AGL7" 16520            0             0 542 4
    "AU000000AGL7" 16523            0             0 542 4
    "AU000000AGL7" 16524  .0042086435   -.010048994 542 4
    "AU000000AGL7" 16525  -.005611897   -.000637074 542 4
    "AU000000AGL7" 16526 -.0007077293    .003736099 542 4
    "AU000000AGL7" 16527   .001414958    .006933059 543 5
    "AU000000AGL7" 16530   .005607941  -.0043020505 543 5
    "AU000000AGL7" 16531   .002797532    .002706375 543 5
    "AU000000AGL7" 16532  -.002797532   -.003143182 543 5
    "AU000000AGL7" 16533  -.024782626   .0019956238 543 5
    "AU000000AGL7" 16534  .0035750915    .008961359 543 5
    "AU000000AGL7" 16537  -.013660502   -.007893652 543 5
    "AU000000AGL7" 16538  .0014438685   .0020474845 543 5
    "AU000000AGL7" 16539    .00289248  -.0018369828 543 5
    "AU000000AGL7" 16540  -.006511595   -.012853763 543 5
    end
    format %td date
    format %tm dm

  • #2
    I am not sure, since I am new to Stata, but maybe the problem is that you are using bysort cus to create the lagged values, and cus combines company and month. It might be better to use, for example, bysort isin: gen lag1 = L.marketreturn, where L. is the official way to create lagged values (it needs the data to be xtset first). Hope it helps!!


    • #3
      The number of observations reported is, as with regress and any similar model fitting command, the number of observations used in the regression or model fit. That can easily go down for the same dataset depending on what variables appear in the model and what values are missing.

      The differences between your regressions are marked:

      1. The lagged variables are not all available for the first 5 observations of each group, because lag5 is missing until the sixth observation.

      2. For a model with 6 predictors not 1, you need more observations anyway to get a fit at all. This won't bite, I presume, with a full month's data but could bite with incomplete months and/or missing values.

      What happens to R-square is empirical.
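
      To separate the two effects, both models can be fitted to the same observations for one group. The sketch below assumes the poster's lag1-lag5 and cus variables already exist, and uses group 4 from the data extract purely as an illustration. On a common sample, adding predictors cannot lower R-square, so a lower value for the larger model indicates that the samples differ.

      Code:
      * larger model first: its estimation sample excludes observations with missing lags
      regress dailyreturn marketreturn lag1 lag2 lag3 lag4 lag5 if cus == 4
      display "N = " e(N) "   R-square = " e(r2)

      * smaller model restricted to exactly the same observations
      regress dailyreturn marketreturn if e(sample)
      display "N = " e(N) "   R-square = " e(r2)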

      A quite different issue is that you have daily data with no observations at weekends. Your model doesn't adjust for that, although you might not care and you are using subscripts, not explicit lags.

      I don't think this is anything peculiar to rangestat (SSC, as you are asked to explain). I am fond of that command for various reasons but note that statsby or runby (SSC) should give similar results.
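
      A minimal statsby sketch of the same per-company, per-month regressions follows. It assumes the lag1-lag5 variables from the first post; the file name monthly_fits and the result names r2 and nobs are made up for illustration.

      Code:
      * one regression per company-month; coefficients, standard errors,
      * R-square and N are saved to a new dataset, one row per group
      statsby _b _se r2=e(r2) nobs=e(N), by(isin dm) saving(monthly_fits, replace): ///
          regress dailyreturn marketreturn lag1 lag2 lag3 lag4 lag5

      * inspect the saved results
      use monthly_fits, clear
      list isin dm nobs r2 in 1/5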


      • #4
        Dear Philip, You can try
        Code:
        * sequential trading-day counter within company, so lags ignore weekend gaps
        bys isin (date): gen t = _n
        egen id = group(isin)
        xtset id t
        
        * lags are taken within company and run across month boundaries
        gen marketreturn1 = L1.marketreturn
        gen marketreturn2 = L2.marketreturn
        gen marketreturn3 = L3.marketreturn
        gen marketreturn4 = L4.marketreturn
        gen marketreturn5 = L5.marketreturn
        
        * one regression per company and month
        rangestat (reg) dailyreturn marketreturn marketreturn1 marketreturn2 marketreturn3 marketreturn4 marketreturn5, interval(dm 0 0) by(isin dm)
        Ho-Chuan (River) Huang
        Stata 19.0, MP(4)
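
        The sum of the absolute values of the coefficients asked for in the first post could then be built from the stored results. This is a sketch that assumes rangestat named the coefficients b_marketreturn and b_marketreturn1 to b_marketreturn5, and that the constant (b_cons) is meant to be left out; sum_abs_b is a made-up variable name.

        Code:
        * start with the contemporaneous market return, then add the five lags
        gen double sum_abs_b = abs(b_marketreturn)
        forvalues k = 1/5 {
            replace sum_abs_b = sum_abs_b + abs(b_marketreturn`k')
        }

        The standard errors are stored with the prefix se_ and can be handled in the same way.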


        • #5
          Dear all,

          Thank you for responding, I really appreciate it.
          Mr Cox, I agree, rangestat is a very powerful tool and it has solved many problems for me.

          Sorting by isin indeed seems to solve the problem. I will have to take a closer look at whether the lagged variables really don’t have to be restarted each month.

          Mr Huang, the code works fine!

          Thank you very much.

          Kind regards,
          philip


          • #6
            Indeed, even if you want separate regressions for each month, using the last days of the previous month to define lagged predictors for the first part of each month seems defensible.

            But back to rangestat: much of the point of this command is that it allows moving windows. Unless there is a substantive reason to think that your system changes because the month is called something else, have you thought about say 30 day moving windows?
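
            As a rough sketch of that idea, a 30-day moving window per company can be requested through interval() alone. The window here counts calendar days on date, so it holds fewer than 30 trading days. If results from an earlier rangestat (reg) call are still in memory, drop or rename them first, because the new result variables may clash with them.

            Code:
            * regression over the current date and the 29 calendar days before it, per company
            rangestat (reg) dailyreturn marketreturn, interval(date -29 0) by(isin)

            With a trading-day counter such as t in #4, interval(t -29 0) would give windows of 30 observations instead.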

            (Best not to use titles like Mr Mrs Ms Dr Professor here unless people so describe themselves. It is easy to get them wrong and there seems overwhelming preference for informality.)
