L07: Online Learning: A/B Testing, Multi-armed Bandits, Contextual Bandits

Lecture Goals

  • What is online learning? How is it different from supervised learning?
  • Relation between forecasting and decision making
  • The multi armed bandit problem and solutions
  • Contextual bandits