Originally introduced in a 2017 paper, “Attention Is All You Need” from researchers at Google, the transformer was introduced as an encoder-decoder architecture specifically designed for ...