
  • Residual Expanded Market Model

    Dear Stata Community!

    I'm very new to Stata and am now working on my thesis. I want to calculate stock price crash risk (NSKEW) using this equation:

    $$\mathrm{NSKEW} = -\frac{n(n-1)^{3/2} \sum W_{j,\tau}^{3}}{(n-1)(n-2)\left(\sum W_{j,\tau}^{2}\right)^{3/2}} \tag{1}$$
    According to the paper I read, before calculating NSKEW I need to calculate the firm-specific weekly return for firm j in week τ with this equation:

    $$W_{j,\tau} = \ln(1 + e_{j,\tau}) \tag{2}$$
    The problem is that, to get the firm-specific weekly return, I first need the residual return for each firm, and I don't know how to obtain it from this equation:

    $$r_{j,\tau} = \alpha_j + \gamma_{1,j}\, r_{m,\tau-2} + \gamma_{2,j}\, r_{m,\tau-1} + \gamma_{3,j}\, r_{m,\tau} + \gamma_{4,j}\, r_{m,\tau+1} + \gamma_{5,j}\, r_{m,\tau+2} + e_{j,\tau} \tag{3}$$
    I've seen a tutorial on YouTube that solves this problem, but it calculates the residual returns in R. Here is the link: https://www.youtube.com/watch?v=sjCs6eAdzn0

    This is the code he uses:
    * First, initialize the residual matrix to a null matrix; the size is based on the size of the input data

    Residuals_expanded_model <- matrix(0, 517, 100)

    * Then, generate the residuals matrix using the expanded market model

    for (i in 1:100) {
      Residuals_expanded_model[,i] <- resid(Im(crash_data[,i+1] ~ crash_data[,102] + crash_data[,103] + crash_data[,104] + crash_data[,105] + crash_data[,106]))
    }
    Can anyone help me write the equivalent commands in Stata?


    Note: I have also attached my Stata data.


    Best regards,

    Yusuf

  • #2
    This R code gives the regression formula for company i, based on the last five variables in the dataset:

    Code:
    crash_data[,i+1] ~ crash_data[,102] + crash_data[,103] + crash_data[,104] + crash_data[,105] + crash_data[,106]
    The "linear model" or lm() function (that's lm with an L, not Im with a capital i as you have above) will actually fit the linear regression, and the resid() function will generate residuals based on the predicted values.

    To do this in Stata, I'm just going to quickly rename your variables company1-company88. This should make the example easier to follow. I also don't really know what the naming convention is for these variables, and renaming them makes it easier to write a loop.

    Code:
    // There are 93 variables. Need to exclude the last 5.
    local counter = 1
    foreach var of varlist _all{
        if `counter' > 88{
            continue, break
        }
        rename `var' company`counter'
        local counter = `counter' + 1
    }
    Now we can get the same loop in Stata that the youtuber has in R. We could store the residuals in a matrix, but I think it is more straightforward in Stata to just store these values in new variables.

    Code:
    foreach var of varlist company*{
        reg `var' rm_0 rm_lag1 rm_lag2 rm_lead1 rm_lead2
        predict res_`var', residuals
    }
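The loop above assumes the lag and lead variables (rm_lag1, rm_lag2, rm_lead1, rm_lead2) already exist in the data. If only the contemporaneous market return is available, they could be built with Stata's time-series operators. The following is a minimal sketch on simulated data, not the poster's dataset: the `week` variable, the seed, and all simulated values are assumptions made purely for illustration.

```stata
* Toy data: 60 weeks of simulated returns (all values are made up)
clear
set seed 12345
set obs 60
gen week = _n                      // hypothetical weekly time index
tsset week

gen double rm_0     = rnormal(0, 0.02)                       // market return in week t
gen double company1 = 0.001 + 0.9*rm_0 + rnormal(0, 0.03)    // one simulated firm

* Lags and leads of the market return via time-series operators
gen double rm_lag1  = L1.rm_0
gen double rm_lag2  = L2.rm_0
gen double rm_lead1 = F1.rm_0
gen double rm_lead2 = F2.rm_0

* Same regression as in the loop above, for a single firm
quietly regress company1 rm_0 rm_lag1 rm_lag2 rm_lead1 rm_lead2
predict double res_company1, residuals
display "Observations used: " e(N)
```

Because two lags and two leads are needed, the first two and last two weeks have missing regressors and drop out of the regression, so their residuals are missing as well.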
    I notice he also calculates the "firm specific weekly returns" like this:

    Code:
    firm_spe_week_retur <- log(matrix(1, 517, 100) + Residuals_expanded_model)
    Which literally just means "add one to each residual, then take the log." We can easily modify the Stata loop above to produce these values like so:

    Code:
    foreach var of varlist company*{
        reg `var' rm_0 rm_lag1 rm_lag2 rm_lead1 rm_lead2
        predict res_`var', residuals
        gen fswr_`var' = log(res_`var' + 1)
    }
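Once a firm-specific weekly return variable exists, equation (1) from the original question can be evaluated from the sums of its squares and cubes. The following is only a sketch under stated assumptions: the program name `nskew_sketch` is hypothetical, the data are simulated stand-ins, and NSKEW is computed over every non-missing week in memory rather than per firm-year, which the paper may require.

```stata
* Hypothetical helper: NSKEW for one variable, following equation (1)
capture program drop nskew_sketch
program define nskew_sketch, rclass
    syntax varname
    tempvar w2 w3
    tempname n s2 s3
    quietly {
        gen double `w2' = `varlist'^2
        gen double `w3' = `varlist'^3
        summarize `w2'
        scalar `n'  = r(N)         // number of non-missing weeks
        scalar `s2' = r(sum)       // sum of squared W
        summarize `w3'
        scalar `s3' = r(sum)       // sum of cubed W
    }
    return scalar N     = `n'
    return scalar nskew = -(`n'*(`n'-1)^(3/2)*`s3') / ((`n'-1)*(`n'-2)*`s2'^(3/2))
end

* Usage on simulated data standing in for one fswr_ variable
clear
set seed 2023
set obs 52
gen double fswr_company1 = rnormal(0, 0.03)
nskew_sketch fswr_company1
display "NSKEW = " r(nskew)
```

To cover every firm, the call could be placed inside a `foreach var of varlist fswr_*` loop, storing `r(nskew)` however is most convenient.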
    Suppose you now want a dataset with only the fswr variables. You can just drop everything else with:

    Code:
    keep fswr_*
    And you can save the data with the -save- command. Finally, suppose you don't like that I've renamed your companies company1, company2, and so on, and you want to preserve the company names or whatever these variable names indicate. You can do this by running the regressions directly in the first loop instead of renaming, like so:

    Code:
    local counter = 1
    foreach var of varlist _all{
        if `counter' > 88{
            continue, break
        }
        reg `var' rm_0 rm_lag1 rm_lag2 rm_lead1 rm_lead2
        predict res_`var', residuals
        gen fswr_`var' = log(res_`var' + 1)
        local counter = `counter' + 1
    }
    Last edited by Daniel Schaefer; 26 Sep 2023, 16:23.



    • #3
      You could also put the -quietly- command in front of each reg command like so:

      Code:
      quietly reg `var' rm_0 rm_lag1 rm_lag2 rm_lead1 rm_lead2
      That should stop the output from being printed to the console. That can be good because it prevents your console buffer from filling up, but it might be best to scroll through these regression results to make sure there aren't any obvious problems. It's up to you.
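One middle ground between full output and none is to run each regression quietly and print a one-line summary per firm. The sketch below uses simulated data and, for brevity, a single regressor. The 30-observation threshold is an arbitrary assumption for illustration, not a recommendation.

```stata
* Toy data: 40 weeks, three simulated firms (all values are made up)
clear
set seed 42
set obs 40
gen double rm_0 = rnormal(0, 0.02)
forvalues i = 1/3 {
    gen double company`i' = 0.001 + 0.8*rm_0 + rnormal(0, 0.03)
}

* Suppress the regression tables but keep a short summary per firm
foreach var of varlist company* {
    quietly regress `var' rm_0
    display "`var': N = " e(N) ", R-squared = " %5.3f e(r2)
    if e(N) < 30 {
        display as error "  warning: only " e(N) " observations for `var'"
    }
}
```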
      Last edited by Daniel Schaefer; 26 Sep 2023, 16:16.



      • #4
        Yusuf,

        Welcome to Statalist. You will find that you get quicker (and better) answers to your questions if they are posed in a way that conforms to the excellent FAQs. In particular, you need to explain your question and your data more clearly. Use dataex to display a representative sample, not an attachment that most listers won't open. If you can explain your problem using a small publicly available dataset, even better. Finally, a search of the forum will often give you the answer you seek.

        All that said, the code below, almost identical to what you will find in this post by Richard Williams, might get you partway there. It saves coefficients and RMSEs by group. You can use the saved coefficients to calculate predictions and residuals if that's what you want.

        Code:
        sysuse auto
        statsby _b _cons e(rmse), by(foreign) noisily nodots: regress price weight length mpg
        list
        Devra Golbe
        Professor Emerita, Dept. of Economics
        Hunter College, CUNY
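As a rough illustration of using coefficients saved by group to calculate residuals, the sketch below writes the statsby results to a temporary file, merges them back onto the original observations, and builds fitted values by hand. It is one possible approach under stated assumptions, not necessarily what the linked post does. The names `fitted` and `resid_statsby` are made up for this example, and it relies on statsby naming the saved coefficients `_b_weight`, `_b_length`, `_b_mpg`, and `_b_cons`.

```stata
sysuse auto, clear

* Save one row of coefficients per group, leaving the data in memory untouched
tempfile coefs
statsby _b, by(foreign) saving(`coefs', replace) nodots: regress price weight length mpg

* Attach each group's coefficients to its observations
merge m:1 foreign using `coefs', nogenerate

* Fitted values and residuals from the group-specific coefficients
gen double fitted        = _b_cons + _b_weight*weight + _b_length*length + _b_mpg*mpg
gen double resid_statsby = price - fitted

summarize resid_statsby
```

With many groups this avoids an explicit loop. Note that statsby stores coefficients as floats unless told otherwise, so the residuals can differ from -predict- output by small rounding amounts.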



        • #5
          Daniel Schaefer, I'm crying. Thank you so much for your help, it means a lot. I've just tried it and it works.



          • #6
            Originally posted by Devra Golbe
            Thank you for the reminder. Sorry for my mistake; I'll follow the rules next time. Thanks too for the code, I'll learn from it.
