Transformer-XL is also up to 1,800+ times faster than vanilla Transformers “during evaluation on language modeling tasks, because no re-computation is needed.”
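To make the "no re-computation" point concrete, here is a rough counting sketch rather than a benchmark. It assumes a vanilla model is evaluated by sliding a fixed-length window one token at a time and re-encoding the whole window for every prediction, while a model that caches earlier hidden states encodes each token only once. The function names and the example sizes are hypothetical, chosen for illustration, and the sketch does not attempt to reproduce the 1,800+ figure, which depends on the attention length and hardware used.

```python
def vanilla_eval_positions(num_tokens: int, context_len: int) -> int:
    """Positions encoded when the full window is rebuilt for every prediction.

    Simplifying assumption: every prediction sees a full window of
    `context_len` tokens (the shorter windows at the start are ignored).
    """
    return num_tokens * context_len


def cached_eval_positions(num_tokens: int) -> int:
    """Positions encoded when earlier hidden states are reused from a cache."""
    # Each token's hidden states are computed once and then kept in memory.
    return num_tokens


def recomputation_factor(num_tokens: int, context_len: int) -> float:
    """Ratio of encoded positions: sliding-window evaluation vs. cached evaluation."""
    return vanilla_eval_positions(num_tokens, context_len) / cached_eval_positions(num_tokens)


if __name__ == "__main__":
    # Hypothetical sizes: 10,000 evaluation tokens, 512-token context.
    print(recomputation_factor(10_000, 512))
```

Under these assumptions the redundant work grows linearly with the context length, which is why the gap widens as the evaluation context gets longer.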