 

How to use a Keras Embedding layer when there is more than one text feature

I understand how to use the Keras Embedding layer when there is a single text feature, as in IMDB review classification. However, I am confused about how to use Embedding layers for a classification problem with more than one text feature. For example, I have a dataset with two text features, Diagnosis Text and Requested Procedure, and a binary label (1 for approved, 0 for not approved). In the example below, x_train has two columns, Diagnosis and Procedure, unlike the IMDB dataset. Do I need to create two Embedding layers, one for Diagnosis and one for Procedure? If so, what code changes would be required?

from tensorflow.keras import preprocessing
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

x_train = preprocessing.sequence.pad_sequences(x_train, maxlen=20)
x_test = preprocessing.sequence.pad_sequences(x_test, maxlen=20)
model = Sequential()
model.add(Embedding(10000, 8, input_length=20))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
asked Apr 02 '18 by user2202866


People also ask

How does the embedding layer in Keras work?

Keras provides an embedding layer that converts each word into a fixed-length vector of defined size. The one-hot-encoding technique generates a large sparse matrix to represent a single word, whereas in an embedding layer every word is represented by a real-valued vector of fixed length.
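
For instance, a minimal sketch (hypothetical sizes, using TensorFlow's bundled Keras) of how the layer maps integer word indices to dense vectors:

import numpy as np
from tensorflow.keras.layers import Embedding

# hypothetical: 1000-word vocabulary, 8-dimensional vectors
emb = Embedding(input_dim=1000, output_dim=8)
word_ids = np.array([[4, 20, 7]])   # one sequence of 3 integer word indices
vectors = emb(word_ids)             # shape (1, 3, 8): one dense vector per word
print(vectors.shape)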

Is an embedding layer just a dense layer?

While a Dense layer treats W as an actual weight matrix, an Embedding layer treats W as a simple lookup table: each integer in the input array is the index of the row to fetch from that table.
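
This equivalence can be checked directly. The following sketch (hypothetical tiny vocabulary) shows that looking up a row in the Embedding table gives the same result as multiplying a one-hot vector through a bias-free Dense layer carrying the same weights:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Embedding, Dense

# hypothetical tiny vocabulary: 5 words, 3-dimensional vectors
emb = Embedding(input_dim=5, output_dim=3)
idx = np.array([2])
lookup = emb(idx).numpy()          # builds the layer and fetches row 2 of the table
W = emb.get_weights()[0]           # the (5, 3) lookup table itself

# a bias-free linear Dense layer holding the same W, applied to a one-hot vector
dense = Dense(3, use_bias=False)
dense.build((None, 5))
dense.set_weights([W])
one_hot = tf.one_hot(idx, depth=5) # shape (1, 5)
matmul = dense(one_hot).numpy()

print(np.allclose(lookup, matmul)) # True: row lookup and one-hot matmul agree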

Which function in Keras is used to add the embedding layer to the model?

Again, we can do this with a built-in Keras function, in this case the pad_sequences() function. We are then ready to define our Embedding layer as part of our neural network model. In that example the Embedding has a vocabulary of 50 and an input length of 4.
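
As a hedged illustration of those two steps, with made-up integer-encoded documents:

from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding

docs = [[6, 2], [3, 1, 7], [9]]              # hypothetical integer-encoded documents
padded = pad_sequences(docs, maxlen=4)       # zero-pad every sequence to length 4
print(padded)                                # [[0 0 6 2] [0 3 1 7] [0 0 0 9]]

model = Sequential()
model.add(Embedding(input_dim=50, output_dim=8, input_length=4))  # vocab 50, input length 4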

Is the embedding layer in Keras trainable?

The Embedding layer is one of the available layers in Keras. It is mainly used in Natural Language Processing applications such as language modeling, but it can also be used for other tasks that involve neural networks. When dealing with NLP problems, we can also use pre-trained word embeddings such as GloVe.
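
A common sketch for loading such pre-trained vectors looks like the following; the GloVe file path, vocabulary size, and word_index mapping are all assumptions standing in for your own tokenizer output and a local download:

import numpy as np
from tensorflow.keras.layers import Embedding

vocab_size = 10000                              # hypothetical vocabulary size
embedding_dim = 100                             # must match the GloVe file's dimensionality
word_index = {'diagnosis': 1, 'procedure': 2}   # normally tokenizer.word_index

embedding_matrix = np.zeros((vocab_size, embedding_dim))
with open('glove.6B.100d.txt', encoding='utf8') as f:   # assumes a local GloVe download
    for line in f:
        values = line.split()
        word, coefs = values[0], np.asarray(values[1:], dtype='float32')
        i = word_index.get(word)
        if i is not None and i < vocab_size:
            embedding_matrix[i] = coefs

# trainable=False freezes the pre-trained vectors during training
emb = Embedding(vocab_size, embedding_dim, weights=[embedding_matrix], trainable=False)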


1 Answer

You have some choices. You could concatenate the two features into one and create a single embedding for both of them. Here is the logic:

from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 40  # e.g., enough room for the two joined token sequences
all_features = [list(d) + list(p) for d, p in zip(X['diag'], X['proc'])]
X_pad = pad_sequences(all_features, maxlen=max_len)
# build the model as usual; only a single embedding layer is needed
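
To make "as usual" concrete, here is one way the rest of that model could look, carrying over the 10000-word vocabulary and layer sizes from the question:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

model = Sequential()
model.add(Embedding(10000, 8, input_length=max_len))  # one shared embedding for both features
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])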

Or you can use the Functional API and build a multiple-input model:

from tensorflow.keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from tensorflow.keras.models import Model

diag_inp = Input(shape=(20,))
diag_emb = Embedding(10000, 8)(diag_inp)      # separate embedding for diagnosis text
proc_inp = Input(shape=(20,))
proc_emb = Embedding(10000, 8)(proc_inp)      # separate embedding for procedure text

# flatten and concatenate them to make a single vector per sample
merged = Concatenate()([Flatten()(diag_emb), Flatten()(proc_emb)])
out = Dense(1, activation='sigmoid')(merged)  # binary label: approved / not approved
model = Model(inputs=[diag_inp, proc_inp], outputs=out)
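
Training such a two-input model then takes a list of arrays, one per Input; x_diag, x_proc, and y_train below are placeholders for your padded sequences and labels:

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit([x_diag, x_proc], y_train, epochs=10, batch_size=32, validation_split=0.2)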

That is, you can either learn a single embedding for the concatenated features, or learn separate embeddings and concatenate them during training.

answered Jan 02 '23 by parsethis