What is the positional encoding in the transformer model?
01-11-2019
Question
I'm new to ML and this is my first question here, so sorry if my question is silly.
I'm trying to read and understand the paper *Attention Is All You Need*, and in it there is a figure showing the model architecture.
I don't know what positional encoding is. From some YouTube videos I've gathered that it is an embedding that carries both the meaning and the position of a word, and that it has something to do with $\sin(x)$ or $\cos(x)$,
but I couldn't understand what exactly it is or how exactly it does that. So I'm here for some help. Thanks in advance.
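From what I've pieced together so far, the sinusoidal encoding in the paper is defined as $PE_{(pos, 2i)} = \sin(pos / 10000^{2i/d_{model}})$ and $PE_{(pos, 2i+1)} = \cos(pos / 10000^{2i/d_{model}})$. Here is my rough attempt at sketching it in NumPy (the function name and shapes are just my own guesses, not from the paper):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # pos: position of the word in the sequence; i: index of the embedding dimension
    pe = np.zeros((seq_len, d_model))
    position = np.arange(seq_len)[:, None]                        # shape (seq_len, 1)
    div_term = np.power(10000.0, np.arange(0, d_model, 2) / d_model)
    pe[:, 0::2] = np.sin(position / div_term)                     # even dimensions use sin
    pe[:, 1::2] = np.cos(position / div_term)                     # odd dimensions use cos
    return pe

pe = positional_encoding(seq_len=50, d_model=16)
print(pe.shape)  # (50, 16)
```

Is this roughly right, and if so, why does adding these values to the word embeddings tell the model anything about position?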
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange