
Why am I getting bad results with Keras vs. random forest or k-NN?

I'm learning deep learning with Keras and trying to compare its results (accuracy) with classical machine learning algorithms from sklearn (e.g. random forest, k-nearest neighbors).

It seems that with Keras I'm getting the worst results. I'm working on a simple classification problem, the iris dataset. My Keras code looks like this:

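# imports (not shown in the original post; assumed from the names used below)
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from keras.models import Sequential
from keras.layers import Dense

samples = datasets.load_iris()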
X = samples.data
y = samples.target
df = pd.DataFrame(data=X)
df.columns = samples.feature_names
df['Target'] = y

# prepare data
X = df[df.columns[:-1]]
y = df[df.columns[-1]]

# hot encoding
encoder = LabelEncoder()
y1 = encoder.fit_transform(y)
y = pd.get_dummies(y1).values

# split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)

# build model
model = Sequential()
model.add(Dense(1000, activation='tanh', input_shape = ((df.shape[1]-1),)))
model.add(Dense(500, activation='tanh'))
model.add(Dense(250, activation='tanh'))
model.add(Dense(125, activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(32, activation='tanh'))
model.add(Dense(9, activation='tanh'))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train)
score, acc = model.evaluate(X_test, y_test, verbose=0)

#results:
#score = 0.77
#acc = 0.711

I have tried adding layers, changing the number of units per layer, and changing the activation function (to relu), but the accuracy never gets higher than 0.85.

With sklearn's random forest or k-nearest neighbors I get results above 0.95 on the same dataset.

  1. What am I missing?

  2. With sklearn I got good results with little effort, while with Keras I made many changes and still could not match the sklearn results. Why is that?

  3. How can I get the same results with Keras?

Boom asked Apr 15 '20 16:04


2 Answers

Adding some dropout might help you improve accuracy. See Tensorflow's documentation for more information.

You add a Dropout layer the same way you added the Dense() layers (Dropout is imported from the same layers module as Dense):

model.add(Dropout(0.2))

Note: The parameter 0.2 means that, during training, 20% of the units feeding into the next layer are randomly set to zero at each update. This reduces the interdependencies between units, which in turn reduces overfitting.
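To make the effect of the rate concrete, here is a minimal NumPy sketch of inverted dropout at training time. It is an illustration only, not Keras's own code; the helper name `dropout` and the all-ones input are assumptions made for the example. Each unit is zeroed with probability 0.2 and the survivors are scaled by 1 / (1 - 0.2), so the expected sum of the activations stays the same.

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout as applied at training time (illustrative sketch)."""
    keep_mask = rng.random(x.shape) >= rate   # True for units that survive
    return x * keep_mask / (1.0 - rate)       # rescale the survivors

rng = np.random.default_rng(0)
activations = np.ones((1000, 64))             # hypothetical layer output
dropped = dropout(activations, rate=0.2, rng=rng)

zero_fraction = float(np.mean(dropped == 0.0))
print(f"fraction of units zeroed: {zero_fraction:.3f}")   # close to 0.2
print(f"value of surviving units: {dropped.max():.2f}")   # 1.25
```

At inference time dropout is switched off, so the layer passes its input through unchanged.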

Cheo Kee Jin answered Dec 20 '22 17:12


In short, you need:

  1. ReLU activations
  2. Simpler model
  3. Data normalization
  4. More epochs

In detail:

The first issue here is that nowadays we never use activation='tanh' for the intermediate network layers. In such problems, we practically always use activation='relu'.

The second issue is that you have built quite a large Keras model, and it may well be that with only about 100 iris samples in your training set (105, with test_size = 0.3) you have too little data to train such a large model effectively. Try drastically reducing both the number of layers and the number of nodes per layer. Start simpler.

Large neural networks really thrive when we have lots of data, but in cases of small datasets, like here, their expressiveness and flexibility may become a liability instead, compared with simpler algorithms, like RF or k-nn.
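A quick way to see the mismatch is to count trainable parameters. The sketch below uses a hypothetical helper, `dense_param_count`, which applies the standard count for a fully connected layer (inputs times outputs, plus one bias per output) to the layer sizes in the question and to the smaller network suggested further down in this answer.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

original = [4, 1000, 500, 250, 125, 64, 32, 9, 3]   # architecture from the question
simpler = [4, 150, 150, 3]                          # two hidden layers of 150 units

original_params = dense_param_count(original)
simpler_params = dense_param_count(simpler)
print(original_params)   # 672596
print(simpler_params)    # 23853
```

Both counts are far above the 105 training samples, but the original network has roughly 28 times as many parameters to fit from the same data.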

The third issue is that, in contrast to tree-based models like Random Forests, neural networks generally require normalizing the data, which you don't do. The truth is that k-nn also requires normalized data, but in this special case, since all iris features are on the same scale, it does not affect the performance negatively.
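As a rough sketch of what this normalization involves, the NumPy snippet below standardizes each feature to zero mean and unit variance. The data here is a hypothetical stand-in (two random features on very different scales), not the iris set. The important detail is that the mean and standard deviation are computed on the training set only and then reused for the test set.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in data: two features on very different scales
X_train = rng.normal(loc=[5.0, 50.0], scale=[1.0, 20.0], size=(105, 2))
X_test = rng.normal(loc=[5.0, 50.0], scale=[1.0, 20.0], size=(45, 2))

# Statistics come from the training set only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)   # population std (ddof=0), the convention StandardScaler follows

X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std   # reuse the training statistics

print(X_train_scaled.mean(axis=0).round(6))   # approximately 0 per feature
print(X_train_scaled.std(axis=0).round(6))    # approximately 1 per feature
```

The scaled test set will not have exactly zero mean and unit variance, which is expected: it is transformed with the training statistics so that no information from the test data leaks into preprocessing.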

Last but not least, you seem to run your Keras model for only one epoch (the default value if you don't specify anything in model.fit); this is somewhat equivalent to building a random forest with a single tree (which, BTW, is still much better than a single decision tree).
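A back-of-the-envelope calculation shows how little training a single epoch amounts to here. It assumes the 70/30 split from the question (105 training samples) and the Keras default batch_size of 32, which model.fit uses when no batch size is given.

```python
import math

n_train = 105      # 70% of the 150 iris samples
batch_size = 32    # Keras default when batch_size is not passed to model.fit

updates_per_epoch = math.ceil(n_train / batch_size)
print(updates_per_epoch)         # 4 weight updates in one epoch
print(updates_per_epoch * 100)   # 400 weight updates in 100 epochs
```

With one epoch, the optimizer adjusts the weights only four times before the model is evaluated.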

All in all, with the following changes in your code:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

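# smaller model: two hidden layers with ReLU
model = Sequential()
model.add(Dense(150, activation='relu', input_shape = ((df.shape[1]-1),)))
model.add(Dense(150, activation='relu'))
model.add(Dense(y.shape[1], activation='softmax'))

# same compile call as in the question; it must be repeated for the new model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# train for 100 epochs instead of the default single epoch
model.fit(X_train, y_train, epochs=100)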

and everything else as is, we get:

score, acc = model.evaluate(X_test, y_test, verbose=0)
acc
# 0.9333333373069763

We can do better: use slightly more training data and stratify the split, so that each class keeps the same proportion in the training and test sets, i.e.

X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size = 0.20, # a few more samples for training
                                                    stratify=y)

And with the same model & training epochs you can get a perfect accuracy of 1.0 on the test set:

score, acc = model.evaluate(X_test, y_test, verbose=0)
acc
# 1.0

(Details may differ from run to run, since the train/test split and the weight initialization are random by default.)
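For readers wondering what stratifying the split does, here is a minimal NumPy sketch. It is not scikit-learn's implementation; the helper `stratified_split` is hypothetical and the labels simply mimic the iris class balance (50 samples per class). The split is made class by class, so every class contributes the same share to the test set.

```python
import numpy as np

def stratified_split(labels, test_size, rng):
    """Return (train_idx, test_idx), preserving each class's share in both parts."""
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))   # shuffle within the class
        n_test = int(round(test_size * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

labels = np.repeat([0, 1, 2], 50)   # same class balance as iris
rng = np.random.default_rng(0)
train_idx, test_idx = stratified_split(labels, test_size=0.2, rng=rng)

test_counts = np.bincount(labels[test_idx])
print(test_counts)   # [10 10 10]
```

Without stratification, a test set of 30 samples can by chance over- or under-represent a class, which makes the accuracy figure noisier on a dataset this small.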

desertnaut answered Dec 20 '22 17:12