Workshop: Bring Your Models to Production with Ray Serve

apply(conf) - May '22 - 60 minutes

In this workshop, we will walk through a step-by-step guide to deploying an ML application with Ray Serve. Compared to building your own model servers with Flask or FastAPI, Ray Serve makes it seamless to build, scale out to multiple models, and serve across model-serving nodes in a Ray cluster.
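As a rough preview of what a Ray Serve application looks like, here is a minimal sketch of a single deployment (assuming a recent Ray release; the exact API surface varies across versions, and the `SentimentModel` class with its toy scoring logic is purely illustrative, not part of the workshop materials):

```python
from ray import serve
from starlette.requests import Request


@serve.deployment(num_replicas=2)
class SentimentModel:
    def __init__(self):
        # Stand-in for loading a real model once per replica.
        self.positive_words = {"great", "good", "love"}

    async def __call__(self, request: Request) -> dict:
        text = (await request.json())["text"]
        hits = sum(word in self.positive_words for word in text.lower().split())
        return {"sentiment": "positive" if hits else "negative"}


# Serve handles HTTP routing, replica placement, and scaling for us.
serve.run(SentimentModel.bind())
```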

Ray Serve supports inference on CPUs, GPUs (even fractional GPUs!), and other accelerators, using just Python code. In addition to single-node serving, Ray Serve enables seamless multi-model inference pipelines (also known as model composition); autoscaling, both locally and on Kubernetes in the cloud; and integration of business logic with machine learning model code.
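For instance, two models sharing a single GPU and composed into one pipeline might be sketched as below (hedged: this assumes a recent Ray Serve release where handle calls are awaitable, and `Embedder`, `Classifier`, and `Pipeline` are placeholder models, not the workshop's actual code):

```python
from ray import serve
from ray.serve.handle import DeploymentHandle
from starlette.requests import Request


@serve.deployment(ray_actor_options={"num_gpus": 0.5})
class Embedder:
    def embed(self, text: str) -> float:
        return float(len(text))  # placeholder for a GPU-backed embedding model


@serve.deployment(ray_actor_options={"num_gpus": 0.5})
class Classifier:
    def classify(self, embedding: float) -> str:
        return "long" if embedding > 10 else "short"  # placeholder model


@serve.deployment
class Pipeline:
    def __init__(self, embedder: DeploymentHandle, classifier: DeploymentHandle):
        self.embedder = embedder
        self.classifier = classifier

    async def __call__(self, request: Request) -> dict:
        text = (await request.json())["text"]
        # Each handle call is an async request to a (possibly remote) replica.
        embedding = await self.embedder.embed.remote(text)
        label = await self.classifier.classify.remote(embedding)
        return {"label": label}


# Fractional num_gpus lets both models share one physical GPU.
serve.run(Pipeline.bind(Embedder.bind(), Classifier.bind()))
```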

We will also share how to integrate your model serving system with feature stores and operationalize your end-to-end ML application on Ray.
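To give a flavor of that integration, the sketch below shows online features being looked up inside a deployment; the in-memory `ONLINE_FEATURES` dict is only a stand-in for a real feature-store client call (for example, Tecton's or Feast's SDK), and `FraudScorer` is a hypothetical example rather than the workshop's own:

```python
from ray import serve
from starlette.requests import Request

# Stand-in for an online feature store; in practice, a feature-store client
# would be queried here for fresh features keyed by the request's entity id.
ONLINE_FEATURES = {
    "user_1": {"avg_txn_amount_7d": 42.0, "num_logins_30d": 12},
}


@serve.deployment(num_replicas=2)
class FraudScorer:
    async def __call__(self, request: Request) -> dict:
        user_id = (await request.json())["user_id"]
        # Fetch precomputed online features for this entity at request time.
        features = ONLINE_FEATURES.get(user_id, {})
        score = 0.01 * features.get("avg_txn_amount_7d", 0.0)
        return {"user_id": user_id, "fraud_score": score}


serve.run(FraudScorer.bind())
```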

Shreyas Krishnaswamy

Software Engineer

Anyscale

Shreyas Krishnaswamy is a software engineer focusing on Ray Serve and Ray infrastructure at Anyscale.

Simon Mo

Software Engineer

Anyscale

Simon Mo is a software engineer working on Ray Serve at Anyscale. He focuses on studying and building systems for machine learning, in particular, how to make ML model serving systems more efficient, ergonomic, and scalable.
