Text classification using Keras: How to add custom features?

Tags: machine-learning, neural-network, deep-learning, nlp, keras

I'm writing a program to classify texts into a few classes. Right now, the program loads the train and test samples of word indices, applies an embedding layer and a convolutional layer, and classifies them into the classes. I'm trying to add handcrafted features for experimentation, as in the following code. `features` is a list of two elements: the first holds the feature vectors for the training data, and the second holds the feature vectors for the test data. Each training/test sample has one corresponding feature vector (i.e. the features are per-sample, not per-word).

model = Sequential()
model.add(Embedding(params.nb_words,
                    params.embedding_dims,
                    weights=[embedding_matrix],
                    input_length=params.maxlen,
                    trainable=params.trainable))
model.add(Convolution1D(nb_filter=params.nb_filter,
                        filter_length=params.filter_length,
                        border_mode='valid',
                        activation='relu'))
model.add(Dropout(params.dropout_rate))
model.add(GlobalMaxPooling1D())

# Adding hand-picked features
model_features = Sequential()
nb_features = len(features[0][0])

model_features.add(Dense(1,
                         input_shape=(nb_features,),
                         init='uniform',
                         activation='relu'))

model_final = Sequential()
model_final.add(Merge([model, model_features], mode='concat'))

model_final.add(Dense(len(citfunc.funcs), activation='softmax'))
model_final.compile(loss='categorical_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])

print model_final.summary()
model_final.fit([x_train, features[0]], y_train,
                nb_epoch=params.nb_epoch,
                batch_size=params.batch_size,
                class_weight=data.get_class_weights(x_train, y_train))

y_pred = model_final.predict([x_test, features[1]])

My question is, is this code correct? Is there any conventional way of adding features to each of the text sequences?

asked Mar 27 '17 07:03

hsiaomijiou

1 Answer

Try:

# Keras 1.x functional API
from keras.layers import (Input, Embedding, Convolution1D, Dropout,
                          GlobalMaxPooling1D, Dense, merge)
from keras.models import Model

# Sequence branch: word indices -> embedding -> convolution -> pooled vector
seq_input = Input(shape=(params.maxlen,))
embedding = Embedding(params.nb_words,
                      params.embedding_dims,
                      weights=[embedding_matrix],
                      input_length=params.maxlen,
                      trainable=params.trainable)(seq_input)
conv = Convolution1D(nb_filter=params.nb_filter,
                     filter_length=params.filter_length,
                     border_mode='valid',
                     activation='relu')(embedding)
drop = Dropout(params.dropout_rate)(conv)
seq_features = GlobalMaxPooling1D()(drop)

# Adding hand-picked features
nb_features = len(features[0][0])
other_features = Input(shape=(nb_features,))

# Concatenate the pooled sequence features with the raw custom features
merged = merge([seq_features, other_features], mode='concat')

output = Dense(len(citfunc.funcs), activation='softmax')(merged)

model_final = Model([seq_input, other_features], output)

model_final.compile(loss='categorical_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])

In this case, you merge the features from the sequence analysis directly with the custom features, without first squashing all the custom features into a single feature with a Dense(1) layer.
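The snippet above is written against the Keras 1.x API. As a rough sketch of the same two-input idea in a more recent release, the version below assumes TensorFlow 2.x with its bundled Keras, where `Conv1D` and `Concatenate` are the current layer names. All sizes (`maxlen`, `nb_words`, `nb_features`, `nb_classes`, the filter count) are made-up values for illustration, and the random arrays only stand in for real data.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes, chosen only for illustration.
maxlen, nb_words, embedding_dims = 20, 100, 8
nb_features, nb_classes = 5, 3

# Branch 1: word-index sequence -> embedding -> convolution -> pooled vector.
seq_input = keras.Input(shape=(maxlen,), dtype="int32", name="word_indices")
x = layers.Embedding(nb_words, embedding_dims)(seq_input)
x = layers.Conv1D(filters=16, kernel_size=3, padding="valid", activation="relu")(x)
x = layers.Dropout(0.2)(x)
seq_features = layers.GlobalMaxPooling1D()(x)

# Branch 2: one handcrafted feature vector per sample, passed through unchanged.
feat_input = keras.Input(shape=(nb_features,), name="handcrafted")

# Concatenate both branches, then classify.
merged = layers.Concatenate()([seq_features, feat_input])
outputs = layers.Dense(nb_classes, activation="softmax")(merged)

model = keras.Model(inputs=[seq_input, feat_input], outputs=outputs)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random stand-in data, only to show the expected input shapes.
x_seq = np.random.randint(0, nb_words, size=(4, maxlen))
x_feat = np.random.rand(4, nb_features).astype("float32")
probs = model.predict([x_seq, x_feat], verbose=0)
print(probs.shape)  # (4, 3)
```

As in the original question, training and prediction take a list of two arrays, e.g. `model.fit([x_train, features[0]], y_train, ...)`. If the handcrafted features live on very different scales, normalizing them before concatenation may help, though that is a general suggestion rather than something the answer states.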

answered Sep 28 '22 08:09

Marcin Możejko
