 

Improving model training speed in caret (R)

I have a dataset consisting of 20 features and roughly 300,000 observations. I'm using caret to train models with doParallel and four cores. Even training on 10% of my data takes well over eight hours for the methods I've tried (rf, nnet, adabag, svmPoly). I'm resampling with bootstrapping 3 times and my tuneLength is 5. Is there anything I can do to speed up this agonizingly slow process? Someone suggested that using the underlying library directly can speed up the process as much as 10x, but before I go down that route I'd like to make sure there is no other alternative.

Asked by Alexander David on Oct 02 '15

People also ask

What does train() do in R?

The train function can generate a candidate set of parameter values and the tuneLength argument controls how many are evaluated. In the case of PLS, the function uses a sequence of integers from 1 to tuneLength . If we want to evaluate all integers between 1 and 15, setting tuneLength = 15 would achieve this.
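A minimal sketch of how tuneLength drives the candidate grid, using the built-in mtcars data (the dataset and seed here are illustrative, not from the question; assumes the caret and pls packages are installed):

```r
library(caret)

# With method = "pls", tuneLength = 5 makes train() evaluate
# ncomp = 1, 2, 3, 4, 5 and keep the best-performing value.
set.seed(42)
fit <- train(mpg ~ ., data = mtcars,
             method = "pls",
             tuneLength = 5)

fit$results$ncomp  # the candidate values train() generated
fit$bestTune       # the winning ncomp
```

Setting tuneLength = 15 on a dataset with enough predictors would evaluate ncomp = 1 through 15 in the same way.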

What is tuneLength?

tuneLength lets the system tune the algorithm automatically: it sets the number of different values to try for each tuning parameter, for example mtry for randomForest. With tuneLength = 5, caret tries 5 different mtry values and finds the optimal mtry based on those 5.

What is tuneGrid R?

The tuneGrid parameter lets us decide exactly which values each tuning parameter will take, while tuneLength only controls how many default values are tried.
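A sketch of the difference, again with illustrative data and mtry values of my own choosing (assumes caret and randomForest are installed):

```r
library(caret)

# tuneGrid: we pick the exact mtry values to evaluate,
# instead of letting tuneLength generate defaults.
grid <- expand.grid(mtry = c(1, 2, 4))

set.seed(42)
fit <- train(Species ~ ., data = iris,
             method = "rf",
             tuneGrid = grid,
             trControl = trainControl(method = "boot", number = 3))

fit$results$mtry  # only the values we listed: 1, 2, 4
```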


1 Answer

What people forget when comparing the underlying model to using caret is that caret does a lot of extra work on top of it.

Take your random forest as an example: bootstrap resampling with number = 3, and tuneLength = 5. You resample 3 times, and because of the tuneLength caret tries 5 candidate values to find a good mtry. In total you fit 3 × 5 = 15 random forests and compare them to get the best one for the final model, versus only 1 if you use the basic random forest model directly.

Also, you are running in parallel on 4 cores, and random forest needs all the observations available, so all your training observations will be in memory 4 times. That probably leaves very little memory for training the model itself.

My advice is to start scaling down to see if you can speed things up: set the bootstrap number to 1 and tuneLength back to the default of 3, or even set the trainControl method to "none", just to get an idea of how fast the model fits with minimal settings and no resampling.
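The minimal-settings timing run above can be sketched like this (the data, mtry value, and seed are placeholders; assumes caret and randomForest are installed). Note that method = "none" requires a one-row tuneGrid, since no tuning can happen without resampling:

```r
library(caret)

# No resampling, no tuning: a single model fit, to time the bare cost.
ctrl <- trainControl(method = "none")

set.seed(42)
fit <- train(Species ~ ., data = iris,
             method = "rf",
             trControl = ctrl,
             tuneGrid = data.frame(mtry = 2))  # exactly one candidate

fit$finalModel  # the single fitted randomForest
```

Wrapping the train() call in system.time() gives a baseline to compare against the full bootstrap-plus-tuning run.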

Answered by phiver on Nov 15 '22