Schedule :: MLOps: Operationalizing Machine Learning

Schedule

Textbook

Data Science in Production by Ben Weber (2020, $5 for the ebook/PDF). A sample of the first three chapters is available on the publisher's page.

Lecture Schedule

Lecture 1: Serving ML Models Using Web Servers

Reference: Chapter 2

Learning Goals:
- Be able to set up a Python environment
- Be able to set up a Jupyter session with SSH tunneling
- Be able to secure a web server
- Be able to use Flask to serve an ML model

Lecture 2: Serving ML Models Using Serverless Infrastructure

Reference: Chapter 3

Learning Goals:
- Be able to differentiate hosted vs managed solutions
- Be able to assess the DevOps effort for web server vs serverless deployments
- Be able to deploy an ML model using Google Cloud Functions and AWS Lambda functions

Lecture 3: Serving ML Models Using Docker

Reference: Chapter 4, up to Section 4.2

Learning Goals:
- Be able to reason about the pros and cons of container technologies
- Be able to differentiate containers from virtual machines
- Be able to create a new Docker image using a Dockerfile
- Be able to upload the image to a remote registry

Lecture 4: Kubernetes for Orchestrating ML Deployments

Reference: Chapter 4, Section 4.3 onwards

Learning Goals:
- Understand the uses of Kubernetes
- Be able to set up a single-node Kubernetes cluster using kubectl and minikube
- Be able to serve a prediction model in a container in the Kubernetes cluster
- Be able to deploy a prediction model on Google Kubernetes Engine (GKE)

Lecture 5: ML Model Pipelines

Reference: Chapter 5

Learning Goals:
- Learn how to manage a model building workflow
- Learn how to set up automated jobs using cron
- Learn the basics of Apache Airflow
- Learn a managed workflow tool (Google Cloud Composer)

Lecture 6: PySpark Ecosystem

Reference: Chapter 6

Learning Goals:
- Understand the components of a Spark cluster
- Be able to use PySpark and Spark DataFrames
- Be able to use models from MLlib
- Be able to work with a managed solution such as Databricks

Lecture 7: Streaming Model Deployments

Reference: Chapter 8

Learning Goals:
- Understand the difference between a streaming model deployment workflow and a batch model deployment workflow
- Learn the basics of streaming with Apache Kafka
- Be able to differentiate between a batch PySpark workflow and a PySpark streaming workflow

Lecture 8: Online Experimentation

Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/model-ab-testing.html

Learning Goals:
- Know the considerations for A/B testing of models before full rollouts
- Be acquainted with a few statistical hypothesis tests and how sample sizes are determined
- Be able to create simple experiments using PlanOut and a Flask-based deployment setup
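
As a preview of the sample-size goal above, here is a minimal sketch of one common calculation: the approximate number of users needed per variant for a two-sided z-test comparing two conversion rates. This is an illustration only, not necessarily the method used in the lecture. The function name `sample_size_per_group`, the default significance level and power, and the example rates are all assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate users per variant for a two-sided two-proportion z-test.

    Uses the normal approximation with unpooled variances:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    if p_control == p_treatment:
        raise ValueError("the two rates must differ to define an effect size")

    # Critical value for the chosen significance level (two-sided).
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired statistical power.
    z_power = NormalDist().inv_cdf(power)

    # Sum of the Bernoulli variances of the two variants.
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control

    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Hypothetical rates: 10% baseline, 12% hoped-for rate under the new model.
    print(sample_size_per_group(0.10, 0.12))
```

Halving the detectable difference roughly quadruples the required sample, which is why small expected lifts call for long-running experiments.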