I don't understand the position embedding in the paper Convolutional Sequence to Sequence Learning. Can anyone help me?
From what I understand, for each word to translate, the input contains both the word itself and its position in the input sequence (say, 0, 1, ..., m).
Now, encoding this position as a single cell holding the raw value pos (in 0..m) would not perform well, for the same reason we use one-hot vectors rather than raw indices to encode words. So the position is instead spread across several input cells, with a one-hot representation (or something similar; one could also imagine a binary representation of the position being used).
Then an embedding layer is used (just as it is for word encodings) to transform this sparse, discrete representation into a continuous one.
The paper chooses to give the word embedding and the position embedding the same dimension, and to simply sum the two.
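If it helps, here is a minimal sketch of that idea in PyTorch. The sizes, variable names, and the use of nn.Embedding are my own illustration, not taken from the paper's code; the point is just that both lookups produce vectors of the same dimension, so they can be added element-wise:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only (not from the paper).
vocab_size = 10000     # number of words in the vocabulary
max_positions = 100    # maximum sequence length m
embed_dim = 512        # shared dimension for word and position embeddings

word_embed = nn.Embedding(vocab_size, embed_dim)
pos_embed = nn.Embedding(max_positions, embed_dim)

# A toy batch of token indices, shape (batch, seq_len).
tokens = torch.tensor([[4, 17, 256, 9]])
positions = torch.arange(tokens.size(1)).unsqueeze(0)  # [[0, 1, 2, 3]]

# Both lookups return (batch, seq_len, embed_dim); because the
# dimensions match, the two embeddings can simply be summed.
x = word_embed(tokens) + pos_embed(positions)
print(x.shape)  # torch.Size([1, 4, 512])
```

Under this view, the position embedding table is learned during training just like the word embedding table; the only difference is that it is indexed by position rather than by word identity.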