
Shreyas Krishnaswamy
Software Engineer
Anyscale
apply(conf) - May '22 - 60 minutes
In this workshop, we will walk through a step-by-step guide to deploying an ML application with Ray Serve. Compared to building your own model server with Flask or FastAPI, Ray Serve makes it straightforward to build and scale to multiple models and to serve them across the nodes of a Ray cluster.
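As a rough illustration of the deployment workflow, here is a minimal sketch of wrapping a model in a Ray Serve deployment. The model path, payload shape, and replica count are placeholders, and the exact Serve API may differ slightly across Ray versions.

```python
# Minimal sketch: serving a pickled model behind an HTTP endpoint with Ray Serve.
# "model.pkl" and the request payload format are illustrative assumptions.
import pickle

from ray import serve
from starlette.requests import Request


@serve.deployment(num_replicas=2)  # scale out by raising num_replicas
class ModelServer:
    def __init__(self, model_path: str):
        with open(model_path, "rb") as f:
            self.model = pickle.load(f)

    async def __call__(self, request: Request) -> dict:
        payload = await request.json()
        prediction = self.model.predict([payload["features"]])
        return {"prediction": prediction.tolist()}


# Deploy the application; requests are then served at http://127.0.0.1:8000/.
app = ModelServer.bind("model.pkl")
serve.run(app)
```

Because the deployment is plain Python, the same class can be tested locally and then scaled across a Ray cluster without changing the serving code.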
Ray Serve supports inference on CPUs, GPUs (even fractional GPUs!), and other accelerators, using just Python code. In addition to single-node serving, Serve enables seamless multi-model inference pipelines (also known as model composition); autoscaling on Kubernetes, both locally and in the cloud; and integration of business logic with machine learning model code. A sketch of composition with fractional GPUs follows below.
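The sketch below shows one way to compose two models into a pipeline while giving each half a GPU via `ray_actor_options={"num_gpus": 0.5}`. The `Detector`, `Classifier`, and `Pipeline` classes are illustrative stand-ins, and the handle-calling style follows the recent Ray Serve API, which varies somewhat between versions.

```python
# Sketch of model composition: two GPU deployments chained behind one entrypoint.
from ray import serve


@serve.deployment(ray_actor_options={"num_gpus": 0.5})  # half a GPU per replica
class Detector:
    def predict(self, image):
        # Placeholder for a real detection model.
        return {"boxes": [], "image": image}


@serve.deployment(ray_actor_options={"num_gpus": 0.5})
class Classifier:
    def predict(self, detection):
        # Placeholder for a real classification model.
        return {"label": "cat", "detection": detection}


@serve.deployment
class Pipeline:
    def __init__(self, detector, classifier):
        # Bound deployments are passed in as handles at runtime.
        self.detector = detector
        self.classifier = classifier

    async def __call__(self, request):
        image = await request.json()
        detection = await self.detector.predict.remote(image)
        return await self.classifier.predict.remote(detection)


app = Pipeline.bind(Detector.bind(), Classifier.bind())
serve.run(app)
```

Each deployment scales and is resourced independently, which is the main advantage of composition over packing every model into a single server process.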
We will also share how to integrate your model serving system with feature stores and operationalize your end-to-end ML application on Ray.
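As a hedged sketch of that integration, the deployment below looks up precomputed features at request time before scoring. `FeatureStoreClient` is a hypothetical stand-in for your feature store's online client (for example, a Tecton feature service lookup), and the model is a placeholder.

```python
# Sketch: enriching an inference request with online features before scoring.
from ray import serve
from starlette.requests import Request


class FeatureStoreClient:
    """Hypothetical online feature store client used for illustration."""

    def get_online_features(self, entity_id: str) -> list:
        # A real implementation would call the feature store's online API.
        return [0.0, 1.0, 2.0]


@serve.deployment
class EnrichedModelServer:
    def __init__(self):
        self.features = FeatureStoreClient()
        self.model = lambda x: sum(x)  # placeholder scoring function

    async def __call__(self, request: Request) -> dict:
        payload = await request.json()
        # Join request-time data with features served by the feature store.
        feature_vector = self.features.get_online_features(payload["user_id"])
        return {"score": self.model(feature_vector)}


app = EnrichedModelServer.bind()
serve.run(app)
```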