
How to use all features in rpart?

I'm using the rpart package for decision tree classification. My data frame has around 4000 features (columns), and I want to use all of them in rpart() for my model. How can I do that? As far as I can tell, rpart() expects the features to be listed in the formula like this:

dt <- rpart(class ~ feature1 + feature2 + ....)

My features are the words in a set of documents, one feature per word, which is why there are more than 4k of them. Is there a way to use all features without writing them out?

user3430235 asked Sep 23 '14 19:09




1 Answer

I figured it out:

dt <- rpart(class ~ ., data)

In a formula, "." stands for every column of the data frame other than the response (here, class), so all features are used.

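For illustration, here is a minimal sketch of the dot formula on a toy data frame. The data frame docs and its word_* columns are made up for this example and only stand in for the real document-term data. The sketch also shows an equivalent formula built from the column names with reformulate(), which may help if some columns need to be left out.

```r
library(rpart)

# Toy stand-in for a document-term data frame.
# The column names below are hypothetical, not from the original question.
set.seed(1)
docs <- data.frame(
  class        = factor(rep(c("spam", "ham"), each = 20)),
  word_free    = c(rpois(20, 3),   rpois(20, 0.3)),
  word_meeting = c(rpois(20, 0.3), rpois(20, 3)),
  word_the     = rpois(40, 5)
)

# "." expands to every column of docs except the response, class.
dt <- rpart(class ~ ., data = docs, method = "class")

# Equivalent formula built from the column names.
predictors <- setdiff(names(docs), "class")
f <- reformulate(predictors, response = "class")
dt2 <- rpart(f, data = docs, method = "class")

# Both fits should report the same predictor terms.
print(attr(dt$terms, "term.labels"))
print(attr(dt2$terms, "term.labels"))
```

Naming the data argument explicitly (data = docs) behaves the same as passing it positionally, as in the answer above. The reformulate() variant assumes the column names are syntactically valid R names.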
user3430235 answered Sep 22 '22 01:09
