
What is the position embedding in the convolutional sequence to sequence learning model?

I don't understand the position embedding in the paper Convolutional Sequence to Sequence Learning. Can anyone help me?

asked Jun 18 '17 by 高源伯


1 Answer

From what I understand, for each word to translate, the input contains both the word itself and its position in the input sequence (say, 0, 1, ..., m).

Now, encoding such data by simply having a single cell with the value pos (in 0..m) would not perform very well (for the same reason we use one-hot vectors to encode words rather than raw indices). So, basically, the position is encoded across a number of input cells, with a one-hot representation (or something similar; a binary representation of the position could also be used).

Then, an embedding layer is used (just as it is for word encodings) to transform this sparse, discrete representation into a continuous one.

The paper chooses the same dimension for the word embedding and the position embedding, and simply sums the two.
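Here is a minimal PyTorch sketch of that idea (illustrative only, not the paper's actual fairseq code; the class name and sizes are made up):

```python
import torch
import torch.nn as nn

class WordAndPositionEmbedding(nn.Module):
    """Sums a learned word embedding with a learned position embedding,
    as described for the ConvS2S encoder/decoder inputs."""

    def __init__(self, vocab_size, max_positions, embed_dim):
        super().__init__()
        # Lookup tables; an embedding lookup is equivalent to multiplying
        # a one-hot vector by a weight matrix, just more efficient.
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.pos_embed = nn.Embedding(max_positions, embed_dim)

    def forward(self, tokens):
        # tokens: (batch, seq_len) tensor of word indices
        batch_size, seq_len = tokens.shape
        # positions 0, 1, ..., seq_len-1 for every sentence in the batch
        positions = torch.arange(seq_len, device=tokens.device)
        positions = positions.unsqueeze(0).expand(batch_size, seq_len)
        # both embeddings have the same dimension, so we can simply add them
        return self.word_embed(tokens) + self.pos_embed(positions)

# Example: batch of 2 sentences, 5 tokens each, 256-dim embeddings
embed = WordAndPositionEmbedding(vocab_size=10000, max_positions=100, embed_dim=256)
x = torch.randint(0, 10000, (2, 5))
print(embed(x).shape)  # torch.Size([2, 5, 256])
```

The key point is that the position table is learned jointly with the rest of the model, exactly like the word table, so the network can decide for itself how to use positional information.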

answered Nov 10 '22 by khaemuaset