Course Logistics
Online Learning Details
Schedule
Project
Lecture 1
1. Basics
2. SSH and Firewall
3. Setting up Python
4. Remote Jupyter Server
5. Recommendation Models
    Recommendation (SVD) Training
    Recommendation (SVD) Inference
    Recommendation (PyTorch) Training
    Recommendation (PyTorch) Inference
6. Serving ML Models Using Web Servers
7. Flask App
Exercises
Lecture 2
1. Serverless Deployments
2. Cloud Functions
3. GCP Serverless Model Serving
4. Lambda Functions
5. AWS Serverless Model Serving
Exercises
Lecture 3
1. Introduction
2. Docker
3. Orchestration using ECS and ECR - Part I
4. Orchestration using ECS and ECR - Part II
Exercises
Lecture 4
1. Kubernetes
2. Model Serving using Kubernetes
3. Orchestration using GKE
Exercises
Lecture 5
1. Data Science Workflows
2. Training Workflows
3. Cron Jobs
4. Apache Airflow
Exercises
Lecture 6
1. Spark-based Pipelines
2. Spark Clusters
3. Spark on Databricks
4. PySpark
5. MLlib - ML Library for Spark
Exercises
Lecture 7
1. Streaming Workflows
2. Apache Kafka
3. Spark Streaming
Exercises
Lecture 8
1. A/B Testing
2. Statistical Tests
3. Testing Models
GitHub repo
Serving ML Models Using Web Servers
Model Serving
Sharing results with others (humans, web services, applications)
Batch approach: dump predictions to a database (quite popular)
Real-time approach: send a feature vector and get the prediction back immediately; the computation happens on demand
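The batch approach above can be sketched with the standard library's sqlite3. The scoring rule, table name, and schema here are placeholders; in practice the model would be a trained sklearn/keras estimator and the database a shared production store.

```python
# Sketch of the batch approach: score all rows offline and dump the
# predictions to a database table that downstream consumers can query.
import sqlite3

def predict(features):
    # placeholder "model": classify by the sum of the features
    return 1 if sum(features) > 1.0 else 0

rows = [(1, [0.2, 0.1]), (2, [0.9, 0.8]), (3, [0.4, 0.9])]  # (user_id, features)

conn = sqlite3.connect(":memory:")  # a file path or DB server in a real pipeline
conn.execute("CREATE TABLE predictions (user_id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO predictions VALUES (?, ?)",
    [(uid, predict(f)) for uid, f in rows],
)
conn.commit()

# A web service or application later reads the precomputed results:
result = dict(conn.execute("SELECT user_id, score FROM predictions"))
print(result)  # {1: 0, 2: 1, 3: 1}
```

Consumers never wait on model computation; they only pay the cost of a database lookup, which is why this pattern is so popular.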
How to consume from prediction services?
Via web requests (e.g., with a JSON payload)
How to output predictions?
We will set up a server to serve predictions
It will respond to web requests (GET, POST)
We pass some inputs (image, text, vector of numbers), and get some outputs (just like a function)
The environment from which we pass inputs may be very different from the environment where the prediction happens (e.g., different hardware)
Our Objective
Use scikit-learn/Keras with Flask, Gunicorn, and Heroku to set up a prediction server
Part 1: Making API Calls
Using the requests module from a Jupyter notebook (a programmatic approach)
Alternatively, using curl or Postman (these are more versatile)
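A minimal end-to-end sketch of Part 1, using only the standard library so it runs anywhere: a tiny in-process prediction server, then a client POST with a JSON payload. The endpoint name (/predict), the payload shape, and the toy "model" (summing the features) are all illustrative; with the requests library the client side would be roughly requests.post(url, json={"features": [...]}).json().

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # parse the JSON payload from the request body
        body = self.rfile.read(int(self.headers["Content-Length"]))
        features = json.loads(body)["features"]
        reply = json.dumps({"prediction": sum(features)}).encode()  # toy model
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: POST a feature vector, get a prediction back
url = f"http://127.0.0.1:{server.server_port}/predict"
req = urllib.request.Request(
    url,
    data=json.dumps({"features": [1.0, 2.0, 0.5]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
print(result)  # {'prediction': 3.5}
```

The same call from a shell would look roughly like curl -X POST -H "Content-Type: application/json" -d '{"features": [1.0, 2.0, 0.5]}' followed by the URL, which is what makes curl and Postman convenient for poking at a prediction service.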
Part 2: Simple Flask App
Flask uses function decorators to map routes to view functions.
Integrating the model with the app is relatively easy if the model can be read from disk.
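The two ideas above can be sketched without Flask itself: a hand-rolled route decorator that records the URL-to-function mapping (conceptually what @app.route does), and a pickled model loaded from disk once at start-up. The ToyModel class, route table, and /predict rule are hypothetical stand-ins, not Flask's actual internals.

```python
import os
import pickle
import tempfile

class ToyModel:
    """Stand-in for a trained sklearn estimator."""
    def predict(self, features):
        return [sum(f) for f in features]

# Persist the "trained" model, as a training script would
fd, path = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(fd, "wb") as fh:
    pickle.dump(ToyModel(), fh)

routes = {}  # URL rule -> view function

def route(rule):
    def decorator(func):
        routes[rule] = func  # register the mapping, like @app.route
        return func
    return decorator

# Load the model from disk once at app start-up, not per request
with open(path, "rb") as fh:
    model = pickle.load(fh)

@route("/predict")
def predict_view(payload):
    return {"prediction": model.predict(payload["features"])}

# Dispatch a fake request through the route table
result = routes["/predict"]({"features": [[1, 2, 3]]})
print(result)  # {'prediction': [6]}
os.remove(path)
```

Loading the model at start-up rather than inside the view function is the key design choice: deserialization happens once, and every request reuses the in-memory model.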