
Train neural network with sentences of different length in a QA system

I'm trying to implement a QA system following the instructions in this paper. I've imported some datasets and converted the words into vectors with word2vec. After the word embedding, the questions and answers have to be fed into a CNN. What should the size of the input tensor be, given that each question/answer has a different length? (Each question/answer is an array of vectors.)

Excerpt from paper:

[Image: excerpt from the paper defining q_emb; not reproduced here]

q_emb is the question after the word embedding and r_w_k is a word vector of length d.

What is the right value of M (the length of the Q/A) to use? Could you show me some ways to solve this, or point me in the right direction? Thank you.

Luca Di Liello asked Oct 16 '22 14:10



1 Answer

Find the maximum question/answer length (the number of word vectors in the longest one) across your dataset and give your input tensor the shape (num_samples, max_qa_length, word_embedding_size). That maximum then plays the role of M. Pad any question or answer shorter than max_qa_length with zero vectors at the end.
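A minimal sketch of that padding step, assuming NumPy and assuming each question/answer is already an array of shape (length, d) after the word2vec lookup. The helper name `pad_sequences`, its parameters, and the random toy data are made up for illustration; they are not from the paper or from any particular library.

```python
import numpy as np

def pad_sequences(sequences, embedding_size, max_len=None):
    """Zero-pad variable-length sequences of word vectors into one tensor.

    sequences: list of arrays, each of shape (length_i, embedding_size).
    Returns (batch, lengths) where batch has shape
    (num_samples, max_len, embedding_size) and lengths holds the number
    of real (non-padding) word vectors kept for each sample.
    """
    if max_len is None:
        # Assumption: use the longest sequence in the data as M.
        max_len = max(len(seq) for seq in sequences)
    batch = np.zeros((len(sequences), max_len, embedding_size), dtype=np.float32)
    lengths = np.zeros(len(sequences), dtype=np.int64)
    for i, seq in enumerate(sequences):
        seq = np.asarray(seq, dtype=np.float32).reshape(-1, embedding_size)
        n = min(len(seq), max_len)  # anything longer than max_len is truncated
        batch[i, :n] = seq[:n]      # remaining rows stay as zero vectors
        lengths[i] = n
    return batch, lengths

# Toy data: three "questions" with 3, 5 and 2 words, embedding size d = 4.
rng = np.random.default_rng(0)
d = 4
questions = [rng.normal(size=(n, d)) for n in (3, 5, 2)]

batch, lengths = pad_sequences(questions, embedding_size=d)
print(batch.shape)  # (3, 5, 4)
print(lengths)      # [3 5 2]
```

Returning the original lengths alongside the padded tensor is a design choice of this sketch: it lets you build a mask later if you want the network to ignore the padded positions.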

warpri81 answered Oct 21 '22 03:10
