
  • Test for heteroskedasticity and robust ols regression

    Hello all,

    I would like to find out whether I need to make my OLS regressions robust because of heteroskedasticity. I have some conceptual questions first, followed by a question about the commands.

    My dataset is like this:

    fy | firmid | earning | dvpsx_f | prior_dividend | deltadividend
    1  | 5      | .66     | .48     | .48            | 1.07e-08
    1  | 5      | 1       | .48     | .48            | 1.07e-08
    1  | 5      | 1.38    | .48     | .48            | 1.07e-08
    1  | 5      | 1.29    | .33     | .48            | -.15
    1  | 5      | 1.51    | .34     | .33            | .01
    1  | 5      | 1.3     | .34     | .34            | -3.58e-09
    1  | 6      | .69     | .34     | .34            | -3.58e-09
    1  | 6      | -2.08   | .16     | .34            | -.18
    1  | 6      | -.39    | .025    | .16            | -.135
    2  | 5      | .11     | 0       | .025           | -.025
    2  | 5      | 2.04    | 0       | 0              | 0
    2  | 5      | 2.12    | 0       | 0              | 0
    2  | 5      | 1.17    | 0       | 0              | 0
    2  | 6      | 1.85    | .075    | 0              | .075
    2  | 6      | 1.68    | .3      | .075           | .225
    3  | 8      | 1.38    | .3      | .3             | -1.19e-08
    3  | 8      | 1.85    | .3      | .3             | -1.19e-08
    3  | 9      | -1.4    | .3      | .3             | -1.19e-08


    I run a regression in a loop and want to do a cross-sectional analysis of the firms for each fy (subperiod).


    After running the loop, I test my regressions for homoskedasticity with the Breusch-Pagan test and White's test.
    Code:
    egen newid = group(firmid),  
    sum newid, d
    return list
    
    local max=r(max)
    
    forvalues id = 1/`max' {
       forvalues p = 1/2 {
       capture noisily regress deltadividend prior_dividend earning if newid ==`id' &  fy == `p'
        }
    }
    
    
    . hettest
    
    Breusch–Pagan/Cook–Weisberg test for heteroskedasticity
    Assumption: Normal error terms
    Variable: Fitted values of deltadividend
    
    H0: Constant variance
    
        chi2(1) =   0.09
    Prob > chi2 = 0.7663
    
    . imtest, white
    
    White's test
    H0: Homoskedasticity
    Ha: Unrestricted heteroskedasticity
    
        chi2(5) =   2.05
    Prob > chi2 = 0.8417
    Is it even correct to run the tests on my whole dataset with all 3 fy as one, or should I run them for each fy?

    For my own understanding: the p-value of the Breusch-Pagan test in my case is 0.7663, which is larger than the 0.05 significance level, so I fail to reject the null hypothesis of homoskedasticity. This is supported by White's test (0.8417 > 0.05, fail to reject the null hypothesis). Am I interpreting this correctly?
    Also, can it happen that only one of these two tests rejects its null hypothesis, so that one test points to homoskedasticity and the other to heteroskedasticity? What would be the next step in that case?

    Let's assume that for another regression in my research I get a p-value below 0.05 in both the Breusch-Pagan test and White's test. Then I would have to use robust standard errors for that regression, am I right?

    I had previously set up my command for normal regression analysis like this:
    Code:
    statsby _b _se SOA = (-(_b[prior_dividend]))  TP = (-((_b[earning])/(_b[ prior_dividend]))) adj_r_squared = e(r2_a), by(newid fy) clear : regress deltadividend prior_dividend earning
    
    list, sep(0)
    
    
    rename _eq2_TP Target_Payout
    rename _eq2_SOA Speed_of_adjustment
    rename _eq2_adj_r_squared Adj_R_squared
    rename _b_cons Constant
    
    replace Target_Payout = 0 if Target_Payout == .
    replace Speed_of_adjustment = 0 if Speed_of_adjustment == .
    
    //Summarize
    
    summarize Constant Speed_of_adjustment Target_Payout Adj_R_squared if fy==1, detail
    summarize Constant Speed_of_adjustment Target_Payout Adj_R_squared if fy==2, detail
    
    
    collect dims
    
    collect clear
    
    foreach i in Constant Speed_of_adjustment Target_Payout Adj_R_squared {
     
        collect: summarize `i' if fy == 1 , detail
    
    }
    
    collect title  Subperiod 1984-2002 (N=743)
    collect label values result mean "Average" sd "Standard Deviation" p25 "25th Percentile" p50 "Median" p75 "75th Percentile", modify
    collect style column, extraspace(1)
    collect label levels cmdset 1 "Constant" 2 "Speed of adjustment" 3 "Target Payout" 4 "Adjusted R²" , modify
    collect style cell result[mean sd p25 p50 p75], nformat(%9.3f)
    collect stars mean p25 p50 p75 sd 0.01 "***" 0.05 "**" 0.1 "*", attach(mean median p25 p75 sd) nformat(%9.7g) shownote
    
    collect layout (cmdset) (result[mean sd p25 p50 p75]) (), name(default)
    With this code I get a table with the standard deviation as one of the values.


    Subperiod 1984-2002 (N=743)
    ----------------------------------------------------------------------------------------------
                        | Average | Standard Deviation | 25th Percentile | Median | 75th Percentile
    --------------------+---------+--------------------+-----------------+--------+----------------
    Constant            | 0.199** | 1.092**            | 0.000**         | 0.025  | 0.185**
    Speed of adjustment | 0.389*  | 0.371*             | 0.063*          | 0.324  | 0.637*
    Target Payout       | 0.096*  | 1.643*             | 0.000*          | 0.071  | 0.242*
    Adjusted R²         | 0.324   | 0.265              | 0.108           | 0.294  | 0.497
    ----------------------------------------------------------------------------------------------
    *** p<.01, ** p<.05, * p<.1


    Now, if I want to run a robust regression, how do I rewrite this? For the robust regression I use vce(robust) in the loop (correct?), but how do I extract the robust standard error and list it at the end as a robust standard deviation? I tried _rse (for robust standard error), but I just get the message "_rse not found". What do I have to use to get "robust standard deviation" in my table instead of "standard deviation"?

    I hope someone can enlighten me about this.
    Thanks for the help.
    Last edited by Steffen Scheifele; 28 Oct 2022, 05:36.

  • #2
    Cross-posted at https://www.reddit.com/r/stata/comme...oskedasticity/ Please note our policy about cross-posting, which is that you should tell us about it. Reddit have the same policy in essence.

    I guess you didn't get any response there for a mix of reasons, such as your post being long and complicated and asking quite a lot.

    You need someone with econometrics expertise to answer, which rules me out. Otherwise I can't speak on anyone else's behalf but from experience of this forum I wouldn't be surprised if anyone on that side regarded this, frankly, as too much like textbook stuff or as questions you should be directing at your local teachers -- especially if this is for an assignment.



    • #3
      Steffen:
      some comments about your (a tad too long) post:
      1) assuming that your questions do not pertain to assignments, you seem to be dealing with an N>T panel dataset.
      Hence, why go with -regress- instead of -xtreg-?
      2) usually, -estat hettest- and -estat imtest- outcomes go in the same direction; when this is not the case, you should study the null of the two tests to find out why they differ (and this is a good habit for each and every test);
      3) if you detect heteroskedasticity after -regress-, you can simply call the -robust- option for your standard errors (-vce(robust)- works, too, but I'd avoid the -vce()- form, which is mandatory only when you go with clustered standard errors instead: -vce(cluster clusterid)-);
      4) -rreg- (if this is the command you implicitly refer to) has nothing in common with -regress, robust-.
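
      Points 2) and 3) can be sketched as follows. This is only a minimal sketch, assuming the variable names from post #1 and that it is run from a do-file (the /// line continuation does not work interactively). -estat hettest- and -estat imtest- apply to the most recent estimation only, so calling them once after the loop tests just the last firm-period regression; here they are called after each fit instead.

      ```stata
      * Sketch: test each firm-by-period regression, not only the last one.
      * Variable names are taken from post #1.
      egen newid = group(firmid)
      summarize newid, meanonly
      local max = r(max)

      forvalues id = 1/`max' {
          forvalues p = 1/2 {
              capture noisily regress deltadividend prior_dividend earning ///
                  if newid == `id' & fy == `p'
              * Skip firm-period cells where the regression could not be fitted
              if _rc {
                  continue
              }
              * Both tests refer to the regression just fitted above
              capture noisily estat hettest
              capture noisily estat imtest, white
          }
      }
      ```

      With very few observations per firm and period, either test may fail or have little power, which is why both calls are wrapped in -capture noisily-.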
      Kind regards,
      Carlo
      (Stata 19.0)
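
      On the -_rse- question in post #1: as far as I know, -statsby- has no -_rse- shorthand; -_se- collects whatever standard errors the estimation command stored, so adding -vce(robust)- to -regress- makes -_se- robust. A sketch reusing the names from post #1 follows. The -robust_se_*- variable names are made up for illustration, and the -_se_*- names are the ones I would expect -statsby- to create, so check them with -describe- first.

      ```stata
      * Same regression as in post #1, but with robust (Huber/White) standard errors.
      * With vce(robust), the _se collected by statsby are the robust ones.
      * Assumes newid already exists (egen newid = group(firmid)).
      statsby _b _se SOA = (-_b[prior_dividend]) ///
          TP = (-_b[earning]/_b[prior_dividend]) ///
          adj_r_squared = e(r2_a), by(newid fy) clear: ///
          regress deltadividend prior_dividend earning, vce(robust)

      * Check the names statsby actually created before renaming
      describe

      * Hypothetical new names, assuming statsby created _se_prior_dividend etc.
      rename _se_prior_dividend robust_se_prior_dividend
      rename _se_earning        robust_se_earning
      rename _se_cons           robust_se_cons
      ```

      Note that the "Standard Deviation" column in the table from post #1 is the spread of the coefficient estimates across firms (from -summarize-), not a standard error. -vce(robust)- changes only the standard errors, not the coefficients, so that column will not change.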
