Data Science in Production by Ben Weber (2020, $5 for the ebook/PDF). A sample of the first three chapters is available on the publisher's page linked here.
Lecture Schedule
Lecture 1: Serving ML Models Using Web Servers
Reference: Chapter 2
Learning Goals:
Be able to set up a Python environment
Be able to set up a Jupyter session with SSH tunneling
Be able to secure a web server
Be able to use Flask to serve an ML model
Lecture 2: Serving ML Models Using Serverless Infrastructure
Reference: Chapter 3
Learning Goals:
Be able to differentiate between hosted and managed solutions
Be able to assess the DevOps effort for web server vs serverless deployments
Be able to deploy an ML model using Google Cloud Functions and AWS Lambda functions
Lecture 3: Serving ML Models Using Docker
Reference: Chapter 4, up to Section 4.2
Learning Goals:
Be able to reason about the pros and cons of container technologies
Be able to differentiate containers from virtual machines
Be able to create a new Docker image using a Dockerfile
Be able to upload the image to a remote registry
Lecture 4: Kubernetes for Orchestrating ML Deployments
Reference: Chapter 4, Section 4.3 onwards
Learning Goals:
Understand the uses of Kubernetes
Be able to set up a single-node Kubernetes cluster using kubectl and minikube
Be able to serve a prediction model in a container in the Kubernetes cluster
Be able to deploy a prediction model on Google Kubernetes Engine (GKE)
Lecture 5: ML Model Pipelines
Reference: Chapter 5
Learning Goals:
Learn how to manage a model building workflow
Learn how to set up automated jobs using cron
Learn the basics of Apache Airflow
Learn a managed workflow tool (Google Cloud Composer)
Lecture 6: PySpark Ecosystem
Reference: Chapter 6
Learning Goals:
Understand the components of a Spark cluster
Be able to use PySpark and Spark DataFrames
Be able to use models from MLlib
Be able to work with a managed solution such as Databricks
Lecture 7: Streaming Model Deployments
Reference: Chapter 8
Learning Goals:
Understand the difference between a streaming model deployment workflow and a batch model deployment workflow
Learn the basics of streaming with Apache Kafka
Be able to differentiate between a batch PySpark workflow and a PySpark streaming workflow