Cloud Functions

Intro

  • Cloud Functions (CFs) are a solution from GCP for serverless deployments.
  • They require very little boilerplate beyond what we would write for simple offline model inference.
  • In any such deployment, we need to be concerned about:
    • where the model is stored (recall pickle and mlflow), and
    • what python packages are available.

Empty Deployment

  • We will set up a trigger (in particular, an HTTP request) that will invoke our serving function.
  • We will specify the requirements needed for our python function to work.
  • The function we deploy here, similar to lecture 1, produces weather forecasts given a location.

Setting up using UI

  • Sign up with GCP if you haven’t already (you typically get a $300 credit)

  • Get to the console and find the Cloud Functions page.

[Screenshot: Cloud Functions landing page in the GCP console]

  • Go through the UI for creating a function.

[Screenshot: the create-function page]

  • We will choose the HTTP trigger and unauthenticated access option.

[Screenshot: choosing the HTTP trigger with unauthenticated access]

  • We may have to enable the Cloud Build API.

[Screenshot: enabling the Cloud Build API]

  • Finally, we choose the Python environment. You can see two default example files (main.py and requirements.txt). We will be modifying these two.

[Screenshot: choosing the Python runtime with the default main.py and requirements.txt]
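
  • For reference, the default main.py is a small hello-world HTTP handler roughly of the following form (reproduced here only as an illustration; the exact template can differ across runtime versions), and we will replace it entirely:

    def hello_world(request):
        # Echo back a 'message' parameter if present, otherwise return a greeting.
        request_json = request.get_json()
        if request.args and 'message' in request.args:
            return request.args.get('message')
        elif request_json and 'message' in request_json:
            return request_json['message']
        else:
            return 'Hello World!'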

Python Files and Requirements

  • We will specify the following requirements:

    flask
    geopy
    requests
    
    
  • Our main file is the following:

    def weather(request):
        from flask import jsonify
        from geopy.geocoders import Nominatim
        import requests

        data = {"success": False}
        # https://pypi.org/project/geopy/
        geolocator = Nominatim(user_agent="cloud_function_weather_app")
        params = request.get_json()
        if "msg" in params:
            location = geolocator.geocode(str(params['msg']))
            # https://www.weather.gov/documentation/services-web-api
            # Example query: https://api.weather.gov/points/39.7456,-97.0892
            result1 = requests.get(f"https://api.weather.gov/points/{location.latitude},{location.longitude}")
            # Example query: https://api.weather.gov/gridpoints/TOP/31,80
            result2 = requests.get(f"{result1.json()['properties']['forecast']}")
            data["response"] = result2.json()
            data["success"] = True
        return jsonify(data)
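
  • Before deploying, one can optionally run such a function locally with Google's Functions Framework for Python (a sketch, assuming main.py is in the current directory and the function's dependencies are installed locally):

    pip install functions-framework flask geopy requests
    functions-framework --target weather --port 8080
    # in another terminal:
    curl -X POST -H "Content-Type: application/json" -d '{"msg": "Chicago"}' http://localhost:8080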
    
  • Once the function is deployed, we can test it (click the actions menu on the far right in the dashboard).

[Screenshot: the Cloud Functions dashboard with the actions menu]

  • We can pass the JSON string {"msg":"Chicago"} and see that we indeed get the JSON output for the weather of Chicago.

[Screenshot: testing the function with the JSON input]

  • We can also access the function from the web endpoint https://us-central1-authentic-realm-276822.cloudfunctions.net/function-1 (you will have a different endpoint). Note that unlike previous times, the request to this endpoint is a JSON payload.
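
  • For example, a quick command-line check of this endpoint with curl looks as follows (substitute your own endpoint URL):

    curl -X POST \
      -H "Content-Type: application/json" \
      -d '{"msg": "Chicago"}' \
      https://us-central1-authentic-realm-276822.cloudfunctions.net/function-1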

  • Below is a screenshot of querying the weather of Chicago using the Postman tool. The steps are as follows:

    1. Insert the URL of the API
    2. Set the method type to POST.
    3. Navigate to Body, choose raw, and then choose JSON from the dropdown menu.
    4. Now add the relevant parameters as a JSON string.

[Screenshot: querying the endpoint from Postman]

  • Finally, here is a query you can use from a Jupyter notebook.

    
    import requests
    result = requests.post(
        "https://us-central1-authentic-realm-276822.cloudfunctions.net/function-1",
        json={'msg': 'Chicago'})
    print(result.json())
    #should match with https://forecast.weather.gov/MapClick.php?textField1=41.98&textField2=-87.9
    

Saving Model on the Cloud

  • For our original task of deploying a trained ML model, we need a way to read it from somewhere when the function is triggered.

  • One way is to dump the model onto Google Cloud Storage (GCS).

  • GCS is similar to AWS's S3 (Simple Storage Service).

  • We will use the command line to dump our model onto the cloud.

GCP access via the commandline

  • First we need to install the Google Cloud SDK from https://cloud.google.com/sdk/docs/downloads-interactive

    curl https://sdk.cloud.google.com | bash
    gcloud init
    
  • There are two types of accounts you can work with: a user account or a service account (see https://cloud.google.com/sdk/docs/authorizing?authuser=2).
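
  • After gcloud init, you can check which accounts are currently authorized (and which project is active) with:

    gcloud auth list
    gcloud config list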

  • Among others, [this page] gives a brief idea of why such an account is needed. In particular, we will create a service account (so that it can be used programmatically by an application running anywhere) and store its credentials on disk for programmatic access through python. To do so, we run the following commands:

    1. We create the service account as follows, and can then check that it is active using the command gcloud iam service-accounts list.

      gcloud iam service-accounts create idsservice \
      --description="IDS service account" \
      --display-name="idsservice-displayed"
      
    2. We then assign a role to this service account in the project. The account can later be disabled using the command gcloud iam service-accounts disable idsservice@authentic-realm-276822.iam.gserviceaccount.com (change idsservice and authentic-realm-276822 to your specific names).

      gcloud projects add-iam-policy-binding authentic-realm-276822 \
      --member=serviceAccount:idsservice@authentic-realm-276822.iam.gserviceaccount.com \
      --role=roles/owner
      
    3. Finally, we can download the credentials

      gcloud iam service-accounts keys create ~/idsservice.json \
      --iam-account idsservice@authentic-realm-276822.iam.gserviceaccount.com
      
    4. Once the credentials are downloaded, they can be programmatically accessed using python running on that machine. We just have to export the location of the file (a quick sanity check is sketched after this list):

      export GOOGLE_APPLICATION_CREDENTIALS=/Users/theja/idsservice.json
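
  • As a quick sanity check that the credentials are picked up from the environment variable, one can run the following snippet (a minimal sketch using google.auth, which is installed alongside the google-cloud-storage package below):

    import google.auth

    # loads the key file pointed to by GOOGLE_APPLICATION_CREDENTIALS
    credentials, project_id = google.auth.default()
    print(project_id)  # should print your project id, e.g. authentic-realm-276822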
      
  • Next we will install a python module to access GCS, so that we can write our model to the cloud:

    pip install google-cloud-storage
    
  • The following code creates a bucket called theja_model_store and lists all buckets in the project:

    from google.cloud import storage
    bucket_name = "theja_model_store"
    storage_client = storage.Client()
    storage_client.create_bucket(bucket_name)
    for bucket in storage_client.list_buckets():
        print(bucket.name)
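
  • Note that bucket names are globally unique, and create_bucket will raise an error if the bucket already exists. A slightly more defensive sketch (assuming the Conflict exception from google.api_core is what gets raised in that case):

    from google.cloud import storage
    from google.api_core.exceptions import Conflict

    storage_client = storage.Client()
    try:
        storage_client.create_bucket("theja_model_store")
    except Conflict:
        # the bucket already exists (or the name is taken); safe to continue
        pass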
    
  • We can upload the model we used previously using the following snippet. Here, v1a and v1b are just new names for our surprise_model and movies.dat files; we could have kept the original names instead. These are the two files we are uploading into the theja_model_store bucket: the trained model and the movie metadata file. One can upload any files in general, and if the dataset itself is needed to make new predictions, it should be uploaded as well.

    
    from google.cloud import storage
    bucket_name = "theja_model_store"
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob("serverless/surprise_model/v1a")
    blob.upload_from_filename("surprise_model")
    blob = bucket.blob("serverless/surprise_model/v1b")
    blob.upload_from_filename("movies.dat")
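
  • One can also verify the upload from the command line with the gsutil tool that ships with the Cloud SDK:

    gsutil ls -l gs://theja_model_store/serverless/surprise_model/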
    
    

[Screenshot: the uploaded files in the GCS browser]

After running the above, the surprise-based recommendation model and the helper data file will be available at gs://theja_model_store/serverless/surprise_model/v1a and gs://theja_model_store/serverless/surprise_model/v1b, as seen below.

[Screenshot: the gs:// URIs of the uploaded files]

We can either use the URIs above or access the files programmatically with the storage client. For example, here is how to download the file v1b:

    from google.cloud import storage
    bucket_name = "theja_model_store"
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob("serverless/surprise_model/v1b")
    blob.download_to_filename("movies.dat.from_gcp")

We can diff it against the original within the Jupyter notebook itself using !diff movies.dat movies.dat.from_gcp.

We will use this programmatic way of reading external data/model in the cloud function next.