Aim

  • The aim of the project is to simulate the real-world process of deploying machine learning models. More specifically, the project component of this course allows you to explore a technology that assists in model deployment, either directly or indirectly, and asks you to report your experience working with that technology (or multiple technologies) to achieve some overall deployment goal.

Group

  • You should form groups of 4 students for this project component (this is a strict requirement). Reach out to your classmates early. Because this is a group project, a commensurate effort is expected, and each member's contributions need to be reported in the final submission.

Project Outcomes

  • There is one due date for the project deliverables. See the course logistics page for the exact date. The deliverables are as follows.
  • Project Report: In at most 8 pages (12-point, single column; you can have an appendix for supplementary material that may or may not be checked), you should explain your contributions to the project.
  • Code and data: Code associated with the project (e.g., Jupyter notebooks), a small sample of the data/inputs/outputs if needed, and all steps necessary to replicate your project should be provided along with/in the report. A link to your GitHub/GitLab/Bitbucket/other repository is acceptable here (provide it on the front page of the report).
  • A video presentation: You should provide a 10-minute video walk-through (discussing highlights) of your project and provide the link (for example, to an unlisted YouTube video) on the front page of the report.

Each team should upload the report (and code and video link) to Blackboard before the deadline.

Example Report Components

  • For example, here are some aspects to focus on in your project report:
    • what was the goal
    • what were the possible solutions
    • what were the specific pros and cons from a business point of view
    • a cost-benefit analysis
    • actual handling of the technology and demonstration in a dev environment
    • documenting the experience
    • lessons learned
    • code artifacts and/or Jupyter notebooks
  • Here is an example project idea: try out a technology (or a specific aspect of it) and its competitors by following their documentation thoroughly and thoughtfully (e.g., MLflow vs. BentoML vs. Cortex).

Grading Rubric

  • Projects will be graded based on the creativity shown in handling the technology and the insights drawn. The reports should be very clearly written and presented, and will be evaluated based on the correctness, content, creativity and clarity:
    • Correctness will be assessed based on the correct application of a technology, valid software setup and discussion of choices, technical correctness and the assumptions laid out.
    • Content will be assessed based on the contributions made in the project (given group size) and project depth (e.g., why this aspect of ML deployment, why this problem, what did you do, visualization and interesting conclusions, insights, discussion of methodology followed). You should try to demonstrate your understanding of the relevant topics and their use in your non-trivial project.
    • Creativity will be assessed based on how non-obvious your solution or contribution is and how thoughtfully the different choices were made in the execution of the project.
    • Clarity will be assessed based on the language quality, layout and structure of the report, the adequacy of the references cited, the capability of the team in explaining ideas in a clear and professional manner, and the clarity demonstrated in your discussions.
  • All external material/sources (code/idea/theory/insights) used should be cited prominently without fail. Use of pre-trained models, databases, web servers, front-end frameworks, visualization tools, etc. for your project is allowed and encouraged. This project cannot be used as part of any other course or requirement.

Additional Pointers

  • Keep track of costs, especially if you are using services that require a payment method on file. Also, try to use free resources as much as possible.
  • Do not train deep networks from scratch if it can be avoided. The project should not be centered around model accuracy.
  • It is important to make a project plan that allocates sufficient tasks to each team member. It would be great if you could also submit the project plan (for example, as a Gantt chart).
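
As one way to act on the cost-tracking pointer above, a team could keep a small shared log of charges and total it per service. The sketch below is only an illustration: the `CostEntry` fields, the service labels, the member labels, and the budget figure are all hypothetical and are not prescribed by the course.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CostEntry:
    """One charge incurred by the team (all field values are hypothetical)."""
    service: str       # label chosen by the team, e.g. "cloud-vm"
    member: str        # who incurred the charge
    amount_usd: float  # amount charged, in US dollars


def summarize_costs(entries, budget_usd):
    """Return (per-service totals, overall total, whether the budget is exceeded)."""
    per_service = defaultdict(float)
    for entry in entries:
        per_service[entry.service] += entry.amount_usd
    total = sum(per_service.values())
    return dict(per_service), total, total > budget_usd


if __name__ == "__main__":
    # Example log with made-up numbers.
    log = [
        CostEntry("cloud-vm", "member-1", 1.5),
        CostEntry("cloud-vm", "member-2", 2.25),
        CostEntry("storage", "member-1", 0.5),
    ]
    per_service, total, over_budget = summarize_costs(log, budget_usd=5.0)
    for service, amount in per_service.items():
        print(f"{service}: ${amount:.2f}")
    print(f"total: ${total:.2f} (over budget: {over_budget})")
```

A summary like this could also feed the cost-benefit analysis in the report, provided the recorded amounts come from the actual billing pages of the services used.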