Applied ML consists of ML algorithms at its core and engineering systems around it. For over a decade as an applied ML practitioner, I have built a number of such engineering systems to help unlock the full potential of ML in a variety of problem domains. This talk is about my learnings in building those systems and patterns that I’ve found to work well across applications.
First proposed by Mitchell et al. in 2018, model cards are a form of transparent reporting of machine learning models, their uses, and performance for public audiences. As part of a broader effort to strengthen our ethical approaches to machine learning at Wikimedia, we started implementing model cards for every model hosted by the Foundation. This talk is a description of our process, motivation, and lessons learned along the way.
There’s a lot we can learn simply by observing the most successful ML teams in the world: how they operate, which technology stack they use, which skill sets they value, and which processes they implement. In this panel, MLOps thought leaders will come together to share their learnings from speaking with hundreds of leading MLOps teams. They’ll discuss their insights from identifying common patterns between these teams.
Data and machine learning shape Faire’s marketplace – and as a company that serves small business owners, our primary goal is to increase sales for both brands and retailers using our platform. During this session, we’ll discuss the machine learning and data-related lessons and challenges we’ve encountered over the last 5 years on Faire’s journey to empowering entrepreneurs to chase their dreams.
All ML teams need to be able to translate offline gains into online performance. Deploying ML models to production is hard. Making sure that those models stay fresh and performant can be even harder. In this talk, we will cover the value of regularly redeploying models and the failure modes of not doing so. We will discuss approaches that make ML deployment easier, faster, and safer, which allowed our team to spend more time improving models and less time shipping them.
“They’re handing us an engine, transmission, brakes, and chassis and asking us to build a fast, safe, and reliable car,” a data scientist at a recently IPO’ed tech company opined, describing the challenges he faces in delivering ML applications using existing tools and platforms. Although hundreds of new MLOps products have emerged in the past few years, data scientists and ML engineers are still struggling to develop, deploy, and maintain models and systems. In fact, iteration speeds for ML teams may be slowing! In this talk, Sarah Catanzaro, a General Partner at Amplify Partners, will discuss a dominant design for the ML stack, consider why this design inhibits effective model lifecycle management, and identify opportunities to resolve the key challenges that ML practitioners face.
Each of us has a different answer to “why is machine learning so hard?” And how long you have been working on ML will drastically influence your answer.
I’ll share what I learned over the past 20 years, implementing everything from scratch for 1 model in web search ranking, 100s of models for Sibyl, and 1000s of models for TFX. You’ll see why I’m convinced that data and software engineering are critical for successful data science – more so than models. Regardless of your experience, I’ll share some tips that will help you overcome the hard parts of machine learning.
ML is increasingly making its way into production to power customer-facing applications and business processes. This transition from batch to operational ML raises new organizational challenges. Data scientists and engineers now have to work collaboratively as a single team. This requires adaptation on both sides – combining data science and engineering processes into a well-integrated MLOps machine. Our panel of data scientists will provide their perspective on how data engineers can support this transition and more effectively work with data science teams.
In order to deliver and maintain ML systems efficiently, adopting MLOps practices is a must. In recent years, the ML community has embraced and adapted ideas originating from software engineering with reasonable success. Software 2.0 (AI/ML) poses additional challenges that we are still struggling with today: beyond code, data and models must also abide by the continuous principles (Continuous Integration, Delivery, and Training). At Volvo Cars, we are embracing a git-centric, declarative approach to ML experimentation and delivery. The adoption of MLOps principles requires cultural transformation alongside supportive infrastructure and tooling that enables efficient development throughout the ML lifecycle. Join us for this session to learn how Volvo Cars embraces MLOps.
We’ve all seen the dismal (and, at this point, annoying) charts and graphs of “>90.x% of ML projects fail” used as marketing ploys by various companies. What this oversimplified view of ML project success rates buries in misleading abstraction is the fact that some companies have a 100% success rate with long-running ML projects while others have a 0% success rate.
This talk walks through a simple set of ideas that are obvious to the 100%-success-rate companies but remain a mystery to those that fail time and again. First, a project is not an island: it has dependencies on other teams, both technical and non-technical. Second, the DS team doesn’t need to be heroic in pursuing the most complex solution. Finally, establishing solid engineering practices is what will separate the projects that succeed from those that fail.
The main points that will be covered:
- Can you really solve this with ML? Should you?
- Make sure you have the data consistently and that it’s not garbage (feature stores are great!)
- Start simple and only add complexity if you need to
- Involve the business (SMEs)
- Build code that your team can maintain and test
- Monitor your data and predictions so you know when things are about to break
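As a concrete illustration of the last point, drift in input data or prediction distributions is often tracked with a statistic such as the Population Stability Index (PSI). The sketch below is my own minimal illustration, not from the talk; the bin count, smoothing, and thresholds are conventional rules of thumb:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample (`expected`,
    e.g. training-time predictions) and a live sample (`actual`).

    Common rule of thumb: PSI < 0.1 stable, 0.1-0.25 drifting, > 0.25 act now.
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a degenerate range

    def histogram(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # Smooth empty bins so the log term below stays defined.
        return [max(c, 0.5) / len(sample) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Identical distributions give a PSI of zero; a shifted one does not.
baseline = [i / 100 for i in range(100)]
print(psi(baseline, baseline))                     # 0.0
print(psi(baseline, [x * 0.5 for x in baseline]))  # well above the 0.25 alarm level
```

A check like this, run on a schedule against live prediction logs, is one simple way to know things are about to break before users notice.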
There’s often a push for data engineers and data scientists to adopt every pattern that software engineers use. But adopting practices that are successful in one domain without understanding how they apply to another can lead to “cargo cult” behavior. There are fundamental reasons why working with data may require different workflows and systems, and that’s OK!
When displaying relevant first-party ads to buyers in the Etsy marketplace, ads are ranked using a combination of outputs from ML models. The relevance of ads displayed to buyers and the costs charged to sellers are highly sensitive to the output distributions of these models. Many factors shape model outputs, including the makeup of the training data, the model architecture, and the input features. To make the system more robust and resilient to modeling changes, we have calibrated all ML models that power ranking and bidding.
In this talk, we will first discuss the pain points and use cases that surfaced the need for calibration in our system. We will share the journey, learnings, and challenges of calibrating our machine learning models and the implications of calibrated outputs. Finally, we will explain how we are using the calibrated outputs in downstream applications and explore opportunities that calibration unlocks at Etsy.
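The abstract doesn’t say which calibration method Etsy uses; as one common approach, isotonic regression fitted with the Pool Adjacent Violators (PAV) algorithm maps raw model scores to calibrated probabilities. A minimal from-scratch sketch, with function names of my own choosing:

```python
def pav(values, weights=None):
    """Return the non-decreasing sequence that best fits `values` in
    weighted least squares (the isotonic regression fit)."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block: [weighted_sum, total_weight, count]
    for v, w in zip(values, weights):
        blocks.append([v * w, w, 1])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, w2, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += w2
            blocks[-1][2] += c
    fitted = []
    for s, w, c in blocks:
        fitted.extend([s / w] * c)
    return fitted


def calibrate(scores, labels):
    """Fit calibrated probabilities for (raw score, binary label) pairs.

    Sort by raw score, then isotonically regress the labels; the fitted
    values are empirical probabilities that respect the score ordering.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fitted = pav([labels[i] for i in order])
    calibrated = [0.0] * len(scores)
    for rank, i in enumerate(order):
        calibrated[i] = fitted[rank]
    return calibrated


# Example: raw ranking scores paired with observed click labels.
print(calibrate([0.1, 0.4, 0.35, 0.8, 0.9], [0, 0, 1, 1, 1]))
# → [0.0, 0.5, 0.5, 1.0, 1.0]
```

Because the calibrated outputs are on a probability scale, downstream consumers such as bidding can combine them without being sensitive to each model’s raw score distribution, which is the robustness property the abstract describes.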