  • h2o changes data types?

    Hello everyone,

    I am trying to run a simple prediction model and save the result in the original dataframe. The first regression runs fine, but after sending the data to H2O, predicting the class probabilities, and loading the frame back into Stata, the same regression fails with "no observations r(2000)". A quick look tells me that the variables still exist and that a probability column was added. The code itself is only a slight modification of the example code.

    ----
    h2o clear
    use "$temp\Pre_reg", clear
    gen d_time = (time_tuli > 90)
    label define dtime 0 "Short Stay" 1 "Long Stay"
    label values d_time dtime

    reg age pre_20

    _h2oframe put, into(auto)
    _h2oframe change auto

    _h2oframe split auto, into(train valid) split(0.8 0.2) rseed(19)
    _h2oframe change train

    global predictors sexe age
    h2oml rfbinclass d_time $predictors, ///
    cv(2) h2orseed(123) ///
    ntrees(100) maxdepth(6) ///
    balanceclasses minobsleaf(5)

    _h2oframe change auto
    h2omlpredict foreignhat_tpr, pr
    clear
    _h2oframe get auto

    reg age pre_20
    ----

    Thanks in advance for your help!

    Best,
    Gregor

  • #2
    Hi Gregor Lauth,
    Is either of the variables age or pre_20 of the H2O data type enum? You can check the data types by using the command
    . _h2oframe describe
    For definitions of data types in H2O, please see https://www.stata.com/h2o/h2o19/h2of...ame-intro-what
    If this is the case, the error after regress is likely because Stata cannot interpret the H2O enum type directly and converts it to a string variable when the frame is loaded back. You can use the Stata command encode to convert this variable back to numeric.
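
    As a rough sketch of that check on the Stata side (not a tested fix for this particular dataset): the lines below assume they run right after "_h2oframe get auto" and that age and pre_20 are the affected variables, as in the original post. They use destring rather than encode, on the assumption that the returned strings hold the original numbers; encode would instead assign integer codes 1, 2, 3, ... with value labels, which suits genuinely categorical variables.

    ```stata
    * Sketch only: run after "_h2oframe get auto" has loaded the frame back into Stata.
    * Assumption: age and pre_20 are the variables that changed type.
    describe age pre_20    // a storage type of str# means the variable is now a string

    foreach v of varlist age pre_20 {
        * confirm returns _rc == 0 only if the variable is a string
        capture confirm string variable `v'
        if _rc == 0 {
            * Assumption: the strings contain only numbers, so destring recovers the values.
            * If they contain other characters, destring leaves the variable unchanged.
            destring `v', replace
        }
    }

    regress age pre_20
    ```

    If describe shows both variables as numeric, the type conversion is not the cause and the problem lies elsewhere.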

    I hope this helps.
