Random Forest R squared

John Schawrz

Join Date: Nov 2019
Posts: 30


17 Jul 2023, 07:51

Hi everyone,
I am running a random forest with a categorical dependent variable and need a (pseudo) R-squared to compare its fit with that of a multinomial logit. As far as I can tell, the rforest command in Stata does not directly report a fit measure of this kind. Below is the code I am using; it fits the model, displays the stored results, and graphs the variable importance.

Code:

*** Random Forest ***

rforest y x1 x2 x3 x4 x5 x6 x7 x8, type(class) iterations(2000)
 
*Output the statistics computed so far (note that the OOB error is computed at this stage)
ereturn list

*Predict the class of y for each observation
predict pred

*List the first five entries of variables
list y x1 x2 x3 x4 x5 x6 x7 x8 in 1/5

*Create a copy of the variable-importance matrix stored in e()
matrix importance = e(importance)

*Convert the matrix to a variable
svmat importance

*List the first five entries in the variable importance
list importance in 1/5

*Generate new variable id to be used for labeling
generate id=""

*Attach unique labels to individual columns in the chart
local mynames : rownames importance
local k : word count `mynames'
// If there are more variables than observations
if `k'>_N {
    set obs `k'
}
forvalues i = 1(1)`k' {
    local aword : word `i' of `mynames'
    local alabel : variable label `aword'
    if ("`alabel'"!="") quietly replace id= "`alabel'" in `i'
    else quietly replace id= "`aword'" in `i'
}

*Graph the results
graph hbar (mean) importance, over(id, sort(1)) ytitle(Importance)
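
To make concrete what I am after: the sketch below computes McFadden's pseudo R-squared by hand from predicted class probabilities, as 1 - LL(model)/LL(null). It uses only official Stata commands and the sysdsn1 example dataset with mlogit, so the hand-computed value can be checked against e(r2_p). The names p1-p3, p_obs, ll_i, ll0_i and the scalars are purely illustrative. Whether the same steps carry over to rforest depends on being able to obtain class probabilities from its predict command, which is an assumption on my part and not something I have verified.

```stata
* Sketch only: McFadden's pseudo R-squared computed from predicted probabilities
webuse sysdsn1, clear
mlogit insure age male nonwhite
display "mlogit pseudo R2        = " e(r2_p)

* Predicted probability of each of the three outcomes (1, 2, 3)
predict double p1 p2 p3 if e(sample), pr

* Probability the model assigns to the outcome that was actually observed
generate double p_obs = .
forvalues j = 1/3 {
    quietly replace p_obs = p`j' if insure == `j'
}

* Guard against log(0); a forest could assign probability 0 to the observed class
quietly replace p_obs = max(p_obs, 1e-6) if !missing(p_obs)

* Model log likelihood: sum of the log probabilities of the observed outcomes
generate double ll_i = ln(p_obs)
quietly summarize ll_i
scalar ll_model = r(sum)

* Null log likelihood: every observation gets the sample share of its class
quietly count if !missing(p_obs)
scalar n_used = r(N)
generate double ll0_i = .
forvalues j = 1/3 {
    quietly count if insure == `j' & !missing(p_obs)
    scalar share_j = r(N) / scalar(n_used)
    quietly replace ll0_i = ln(scalar(share_j)) if insure == `j' & !missing(p_obs)
}
quietly summarize ll0_i
scalar ll_null = r(sum)

display "hand-computed pseudo R2 = " 1 - scalar(ll_model)/scalar(ll_null)
```

If probabilities from the random forest could be stored in variables like p1-p3, only the first block would need to change. In-sample probabilities from a forest are likely to be overly optimistic, so out-of-bag or hold-out probabilities would presumably make for a fairer comparison with the multinomial logit.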

Thanks for any help!
