What is the positional encoding in the transformer model?
01-11-2019
Question
I'm new to ML and this is my first question here, so sorry if my question is silly.
I'm trying to read and understand the paper *Attention Is All You Need*, and in it there is a figure showing the model architecture.
I don't know what positional encoding is. From some YouTube videos I've gathered that it is an embedding that carries both the meaning and the position of a word, and that it has something to do with $\sin(x)$ or $\cos(x)$,
but I couldn't understand what exactly it is or how exactly it does that. So I'm here for some help. Thanks in advance.
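From what I've pieced together so far, the sinusoidal encoding in the paper is defined as $PE_{(pos, 2i)} = \sin(pos / 10000^{2i/d_{model}})$ and $PE_{(pos, 2i+1)} = \cos(pos / 10000^{2i/d_{model}})$. Here is my rough attempt at sketching it in NumPy (the function name and shapes are just my own guesses, not from the paper):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # pos: position of the word in the sequence; i: index of the embedding dimension
    pe = np.zeros((seq_len, d_model))
    position = np.arange(seq_len)[:, None]                        # shape (seq_len, 1)
    div_term = np.power(10000.0, np.arange(0, d_model, 2) / d_model)
    pe[:, 0::2] = np.sin(position / div_term)                     # even dimensions use sin
    pe[:, 1::2] = np.cos(position / div_term)                     # odd dimensions use cos
    return pe

pe = positional_encoding(seq_len=50, d_model=16)
print(pe.shape)  # (50, 16)
```

Is this roughly right, and if so, why does adding these values to the word embeddings tell the model anything about position?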
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange