L11: Advanced NLP: Attention, BERT and Transformers

Lecture Goals

  • Be able to explain self-attention and how it differs from the simpler attention mechanisms seen in sequence-to-sequence models
  • Be able to reason about queries, keys, and values in self-attention (see the sketch after this list)
  • Be able to recall the key characteristics of BERT and how pre-trained models can be used for NLP tasks
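
As a concrete reference point for the second goal, below is a minimal NumPy sketch of single-head scaled dot-product self-attention. The projection matrices `W_q`, `W_k`, `W_v` and the toy dimensions are illustrative assumptions, not values from the lecture; the point is only to show how queries, keys, and values arise from the same input sequence.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) token embeddings.
    W_q, W_k, W_v: (d_model, d_k) projection matrices (hypothetical weights).
    """
    Q = X @ W_q  # queries: what each token is looking for
    K = X @ W_k  # keys: what each token offers for matching
    V = X @ W_v  # values: the content each token contributes
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) similarities
    # Softmax over the key dimension gives attention weights per query.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output is a weighted mix of all values

# Toy usage: 4 tokens, model dimension 8, head dimension 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```

Unlike encoder-decoder attention in sequence-to-sequence models, where queries come from the decoder and keys/values from the encoder, here all three are projections of the same input, so every token attends to every other token in its own sequence.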