Hi - I am using Stata 18 on macOS 13.4.1 and have a problem with the machine learning routine c_ml_stata_cv.
I am trying to run the iris random forest example from the Stata Journal, volume 22, number 4 (pr0076). I get a result with the "default" option, but when I use cross_validation with tree_depth(2 4 6 8), Python raises errors saying that it expects integers but is receiving the non-integer values 2.0, 4.0, 6.0, 8.0.
Part of the error message:
sklearn.utils._param_validation.InvalidParameterError: The 'max_depth' parameter of RandomForestClassifier must be an int in the range [1, inf) or None. Got 2.0 instead.
The command I am running:
c_ml_stata_cv $y $X, mlmodel(randomforest) data_test("iris_test") ///
n_estimators(50 100 150) tree_depth(2 4 6 8) max_features(3 6) ///
prediction("pred") cross_validation("cv") n_folds(5) seed(10)
Any ideas out there as to why my integers are morphing into non-integers?
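For what it's worth, here is a minimal Python sketch of the kind of cast I suspect is missing somewhere between Stata and scikit-learn. It assumes the grid values reach Python as floats, as the error message suggests. I have not checked the c_ml_stata_cv source, so `to_int_grid` and `param_grid` are hypothetical names of my own, not anything from the package.

```python
def to_int_grid(values):
    """Cast whole-number floats (e.g. 2.0) to int; reject fractional values."""
    out = []
    for v in values:
        if float(v) != int(v):
            raise ValueError(f"{v!r} is not a whole number")
        out.append(int(v))
    return out


# Values as reported in the error message (floats), not as typed in Stata.
tree_depth = [2.0, 4.0, 6.0, 8.0]

# Hypothetical grid entry; 'max_depth' is the parameter named in the error.
param_grid = {"max_depth": to_int_grid(tree_depth)}
print(param_grid)
```

If something like this were applied to the tree_depth values before the grid is handed to scikit-learn, I would expect the validation error to go away, but that is a guess on my part.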
Many thanks, Martin