Reading 11: Self-Attention and Transformers
Readings
• FCA - Section 8.5.4
• PML - Section 15.4
• D2L - Sections 11.1 through 11.6
• Attention (HdM)
Videos
• Harvard CS50 (35:40 - 54:15)
• 3b1b
• Attention (StatQuest)
• Matrix Math (StatQuest)
Notebooks
• Simple Self-Attention (see the sketch after this list)
• Attention (rasbt)
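
As a companion to the notebooks above, here is a minimal NumPy sketch of scaled dot-product self-attention. The toy shapes, random weights, and function names are illustrative assumptions, not taken from any linked notebook.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_k) -- hypothetical shapes
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)    # each row sums to 1
    return weights @ V                    # weighted mix of the value vectors

# Toy example: 4 tokens, model dim 8, head dim 4
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

Each output row is a convex combination of the value vectors, weighted by query-key dot products; this is the core operation the Vaswani et al. paper below scales up with multiple heads.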
Blogposts
• Attention (Lilian)
• Annotated Transformer (Harvard)
• Illustrated Transformer (Alammar)
• Transformers (Lilian)
• Programming Self-Attention (Raschka)
Papers
• Attention Is All You Need (Vaswani et al.)
• Transformer Family Tree (Amatriain et al.)
• BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al.)
• Improving Language Understanding by Generative Pre-Training (Radford et al.)
Quiz