I've been using gbm through caret without problems, but after I removed some variables from my data frame it started to fail. I've tried both the GitHub and CRAN versions of both packages.
This is the error:
> fitRF = train(my_data[trainIndex,vars_for_clust], clusterAssignment[trainIndex], method = "gbm", verbose=T)
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :9 NA's :9
Error in train.default(my_data[trainIndex, vars_for_clust], clusterAssignment[trainIndex], :
Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In eval(expr, envir, enclos) :
model fit failed for Resample01: shrinkage=0.1, interaction.depth=1, n.minobsinnode=10, n.trees=150 Error in gbm.fit(x = structure(list(relatedness_cottle = c(0, 0, 8, 6, :
unused arguments (x = list(relatedness_cottle = c(0, 0, 8, 6, 0, 6, 8, 10, 10, 6, 6, 4, 4, 4, 0, 0, 0, 0, 18, 18, 18, 0, 0, 6, 6, 0, 18, 12, 0, 4, 4, 4, 0, 0, 0, 18, 18, 6, 4, 4, 4, 6, 8, 6, 6, 0, 14, 2, 0, 8, 6, 6, 0, 4, 0, 0, 0, 0, 0, 4, 8, 8, 8, 4, 18, 0, 0, 4, 10, 18, 6, 0, 0, 18, 10, 10, 6, 2, 4, 4, 10, 10, 10, 2, 8, 0, 0, 0, 0, 10, 6, 6, 0, 4, 4, 0, 0, 0, 0, 8, 0, 0, 4, 4, 6, 6, 10, 6, 0, 0, 6, 4, 4, 8, 0, 12, 6, 2, 2, 8, 8, 4, 4, 4, 4, 6, 2, 2, 4, 0, 6, 0, 0, 0, 12, 18, 8, 0, 0, 4, 4, 2, 0, 0, 0, 0, 18,
12, 6, 6, 4, 4, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 18, 0, 0, 18, 6, 4, 2, 2, 0, 0, 10, 0, 0, 0, 12, 4, 4, 4, 4, 4, 8, 18, 6, 18, 18, 12, 12, 12, 0, 0, 0, 0, 10, 12, 12, 12, 12, 12, 4, 4, 4, 6, 6, 6, 6, 12, 0, 6, 0, 0, 4, 4, 18, 18, 18, 0, 0, 4, 6, 6, 0, 0, 2, 0, 0, 0, 18, 12, 12, 0, 0, 0, 0, 0, 0, 18 [... truncated]
There are no missing values, the response is a 4-level factor, and the inputs are the following:
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1165 obs. of 14 variables:
$ relatedness_cottle : num 0 0 8 8 0 6 0 6 6 0 ...
$ dominance_cottle : int 4 6 0 6 6 6 6 4 4 4 ...
$ time_spent : num 26832 20822 18893 13107 25406 ...
$ num_color_changes : num 3.33 2.33 1.33 1 1 ...
$ num_selects : num 1 0.667 2 0.667 1.667 ...
$ show_select_match : num 1 0.667 0.333 1 1 ...
$ default_size : num 0.667 0 0.667 0 0 ...
$ select_order : Factor w/ 6 levels "future_past_present",..: 1 4 4 2 5 1 4 6 6 4 ...
$ order_x : Factor w/ 6 levels "future_past_present",..: 4 4 4 4 4 3 4 4 4 4 ...
$ color_past : Factor w/ 8 levels "black","blue",..: 5 1 6 8 5 7 1 6 6 5 ...
$ color_present : Factor w/ 8 levels "black","blue",..: 1 4 4 4 6 8 4 4 1 4 ...
$ color_future : Factor w/ 8 levels "black","blue",..: 2 2 2 2 2 2 1 2 8 2 ...
$ dominance_cottle_future : int 0 4 0 4 2 0 4 2 2 0 ...
$ relatedness_cottle_future: int 0 2 4 4 0 4 0 2 4 0 ...
But if I call gbm directly with the dataframe, it works:
summary(gbm(clusterAssignment[trainIndex] ~ ., data = my_data[trainIndex,vars_for_clust]))
Distribution not specified, assuming multinomial ...
var rel.inf
color_present color_present 33.533673
dominance_cottle dominance_cottle 33.170138
default_size default_size 25.321566
dominance_cottle_future dominance_cottle_future 5.674563
color_future color_future 2.300060
relatedness_cottle relatedness_cottle 0.000000
time_spent time_spent 0.000000
num_color_changes num_color_changes 0.000000
num_selects num_selects 0.000000
show_select_match show_select_match 0.000000
select_order select_order 0.000000
order_x order_x 0.000000
color_past color_past 0.000000
relatedness_cottle_future relatedness_cottle_future 0.000000
Edit: to reproduce, run the script found here.
For now, converting a data frame that came from plyr/dplyr (class tbl_df) to a plain data frame with as.data.frame() fixes the problem.
train(as.data.frame(issueDataframe), issueResponse, method="gbm")
See this issue.
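To make the workaround a bit more explicit, here is a minimal sketch using base R only. The helper name to_plain_df is hypothetical (it is not part of caret or dplyr), and the small tbl_like object just imitates the class vector shown in the str() output above so the snippet runs without dplyr installed. The commented-out train() call reuses the names from the question and assumes caret is loaded.

```r
# Hypothetical helper: strip tibble classes so the object is a plain data.frame.
to_plain_df <- function(x) {
  if (inherits(x, "tbl_df") || inherits(x, "tbl")) {
    x <- as.data.frame(x)
  }
  x
}

# Imitate a dplyr-style data frame (same class vector as in the question).
tbl_like <- structure(
  data.frame(a = 1:3, b = c(0.5, 1.5, 2.5)),
  class = c("tbl_df", "tbl", "data.frame")
)

plain <- to_plain_df(tbl_like)
print(class(plain))  # "data.frame"

# With caret loaded and the objects from the question, the call would then be:
# fit <- train(to_plain_df(my_data[trainIndex, vars_for_clust]),
#              clusterAssignment[trainIndex], method = "gbm")
```

Checking class(my_data) before calling train() is a quick way to see whether this applies: if "tbl_df" appears in the result, the conversion is worth trying.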
I had the same problem with the glm method. It was solved when I removed the verbose option.