I'm trying to implement a QA system following the instructions shown in this paper. I've correctly imported some datasets and converted the words into vectors with the word2vec method. After the word embedding, the questions and the answers need to be fed into a CNN. What should the size of the input tensor be, given that each question/answer has a different length? (Each question/answer is an array of vectors.)
Excerpt from paper:
q_emb is the question after the word embedding and r_w_k is a word vector of length d.
What value of M (the length of the Q/A) should be used? Could you please show me some methods to solve this issue, or simply give me some help? Thank you.
Determine the maximum question/answer vector array length and make your input tensor of shape (num_samples, max_qa_length, word_embedding_size). For questions shorter than max_qa_length, pad them with zero vectors at the end.
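The padding step described above could be sketched roughly as follows with NumPy. This is only an illustration, not the paper's code: the function name pad_qa_batch, the random stand-in embeddings, and the extra lengths array (handy if you later want to mask the padded positions) are all assumptions made for the example.

```python
import numpy as np

def pad_qa_batch(sequences, embedding_size, max_qa_length=None):
    """Zero-pad variable-length sequences of word vectors into one tensor.

    `sequences` is a list where each item has shape (length_i, embedding_size).
    Returns (batch, lengths); `batch` has shape
    (num_samples, max_qa_length, embedding_size).
    """
    if max_qa_length is None:
        # Use the longest question/answer in this batch as M.
        max_qa_length = max(len(seq) for seq in sequences)

    batch = np.zeros((len(sequences), max_qa_length, embedding_size),
                     dtype=np.float32)
    lengths = np.zeros(len(sequences), dtype=np.int64)

    for i, seq in enumerate(sequences):
        arr = np.asarray(seq, dtype=np.float32).reshape(-1, embedding_size)
        # Truncate anything longer than max_qa_length (only matters when
        # the caller passes a fixed cap smaller than the longest sample).
        n = min(len(arr), max_qa_length)
        batch[i, :n] = arr[:n]   # real word vectors first
        lengths[i] = n           # the rest of the row stays zero (padding)
    return batch, lengths

# Stand-in data: three "questions" of 3, 5 and 2 words, embedding size d = 4.
rng = np.random.default_rng(0)
qa = [rng.standard_normal((n, 4)) for n in (3, 5, 2)]
batch, lengths = pad_qa_batch(qa, embedding_size=4)
print(batch.shape)
```

If the longest sample in the dataset is an outlier, passing a smaller max_qa_length truncates the long samples instead of padding everything up to the outlier's length; whether that trade-off is acceptable depends on your data.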