An increasing number of companies across all industries are adopting real-time machine learning for use cases like dynamic pricing, fraud detection, recommender systems, and more. However, for many companies, making the move to real-time ML is a big challenge because it requires new processes and tooling. On top of that, they need to ensure their tech stack is ready for real-time ML before they start the transition.
In a recent webinar, Claypot AI CEO Chip Huyen joined Tecton CTO Kevin Stumpf to discuss the costs and challenges of deploying real-time ML. They also answered live questions from the audience about real-time ML—we’ve selected a few to highlight in this post.
Q: There may be errors or noise in real-time data. How can I solve this when making an online prediction with real-time ML?
A: You need to do more than just monitor your models at prediction time. You also need to monitor your features to prevent a “garbage in, garbage out” situation. However, it’s tough to detect problems with the data being served to your models. This is especially true for real-time production ML applications like recommender or fraud detection systems. (Read more about feature monitoring for real-time machine learning.)
Q: What is an example of a use case you’ve come across in which switching from batch to real-time ML was the only way forward for a company not to hemorrhage money?
A: Fraud. Real-time ML is critical in fraud use cases. For example, by switching to real-time ML, Instacart reduced the cost of fraud by a few million dollars a year. Batch for fraud only really makes sense in situations like wire transfers, where transactions take 24 hours or longer to be processed.
Q: The perception today is that moving to real-time ML requires significant infrastructure lift. Before committing to the infrastructure lift that’s necessary for real-time ML, how does an organization know that this investment will have a positive return?
A: Ideally, an organization should perform back testing as much as possible. If a model isn’t fully running in production yet, the organization can train it offline and assume that they’ve replicated offline the same outcomes that online real-time data would yield.
For instance, as part of the process of generating a training dataset and testing the features made available to the model, a data scientist or ML engineer could decide the level of freshness of the features they are serving to the model. For example, they could choose to hide or include features that are 1 minute, 1 hour, 12 hours, or 1 month old. If done rigorously, this process would allow them to map out how the model performs depending on these time-windowed features and help them assess whether introducing real-time data could create significant uplift.
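The freshness experiment above can be sketched in a few lines of plain Python. This is a hypothetical, simplified setup (the feature history, timestamps, and `feature_as_of` helper are all illustrative, not from the webinar): to simulate a staleness level, you only let the model "see" feature values that are at least that old.

```python
from datetime import datetime, timedelta

# Hypothetical feature history: (timestamp, value) pairs, newest last.
feature_history = [(datetime(2023, 1, 1, h), float(h)) for h in range(24)]

def feature_as_of(history, ts, staleness):
    """Return the latest feature value visible at `ts`, assuming the
    pipeline can only deliver values at least `staleness` old."""
    cutoff = ts - staleness
    visible = [v for t, v in history if t <= cutoff]
    return visible[-1] if visible else None

now = datetime(2023, 1, 1, 23)
fresh = feature_as_of(feature_history, now, timedelta(minutes=1))
stale = feature_as_of(feature_history, now, timedelta(hours=12))
print(fresh, stale)  # the 12-hour-old view lags far behind the fresh one
```

Training one model per staleness level on features generated this way, then comparing offline metrics across levels, gives a rough curve of model quality versus feature freshness before any real-time infrastructure is built.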
Of course, this wouldn’t account for the whole picture because ML applications are often subject to second-order effects where a prediction could actually impact a user’s behavior. For example, in the case of Lyft or Uber, testing dynamic pricing with a high level of confidence is nearly impossible due to the uncertainty of how users and drivers will react to price changes. In this situation, it makes more sense to run an experiment in production, perhaps as an MVP at a limited scale, to assess how users actually react and how the prediction impacts their behavior and to include those results in future predictions on additional users.
Q: What are some of the most common challenges of building features from raw data sources?
A: Data source: One challenge when it comes to building features from raw data sources is that depending on the data source—whether it’s a data warehouse, a transactional data source, a stream source, or in-memory data like prediction request data—the characteristics of that data vary enormously. For instance, data from a data warehouse typically contains the complete history and can enable large-scale batch aggregations. In contrast, data from a stream typically includes only information from the last two weeks and only allows time-window aggregations or very recent row-level transformations. This constrains the types of features that data can yield.
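The contrast between warehouse-style and stream-style features can be made concrete with a small sketch. The transaction data, retention window, and function names below are all hypothetical, chosen only to illustrate the constraint: the full-history aggregate is simply impossible to compute if the source only retains two weeks of events.

```python
from datetime import datetime, timedelta

# Hypothetical transactions: (timestamp, amount), one per day.
txns = [(datetime(2023, 1, d), 10.0 * d) for d in range(1, 31)]

def lifetime_total(events):
    # Warehouse-style feature: the complete history is available.
    return sum(a for _, a in events)

def windowed_total(events, now, window):
    # Stream-style feature: only events inside the retention window count.
    return sum(a for t, a in events if t > now - window)

now = datetime(2023, 1, 30, 12)
print(lifetime_total(txns))                           # full-history sum
print(windowed_total(txns, now, timedelta(days=14)))  # last-14-days sum
```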
Latency: The next challenge is serving those features in production. The application making the prediction can’t run every query against a data warehouse and get results back instantaneously, in a matter of milliseconds. That’s where an online feature store, or production store, between the model and the data warehouse comes into play. This setup decouples feature calculation from feature consumption so that the predictive application can fetch feature values at the serving latency that the model requires.
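The decoupling described above can be sketched with a toy in-memory key-value store standing in for the online store (in practice this would be something like Redis or DynamoDB; the `materialize`/`get_features` names and the feature schema here are invented for illustration):

```python
# Toy "online store": features are computed ahead of time by a
# batch/stream job and written to a low-latency key-value store,
# so the serving path never queries the warehouse directly.
online_store = {}

def materialize(user_rows):
    # Offline job: aggregate warehouse rows into per-user feature values.
    for user_id, amounts in user_rows.items():
        online_store[user_id] = {"txn_count": len(amounts),
                                 "txn_total": sum(amounts)}

def get_features(user_id):
    # Serving path: a single key lookup, milliseconds instead of a
    # full warehouse query.
    return online_store.get(user_id, {"txn_count": 0, "txn_total": 0.0})

materialize({"u1": [10.0, 25.0, 5.0]})
print(get_features("u1"))  # {'txn_count': 3, 'txn_total': 40.0}
```

The key design point is that the expensive aggregation runs on the write path, on the warehouse’s schedule, while the read path is a constant-time lookup that meets the model’s latency budget.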
Train-serve skew: Even when all the different infrastructure pieces are stitched together, one of the biggest challenges in real-time ML is identifying and debugging training-serving skew. Train-serve skew happens when models are trained on features that would never exist in production. For example, let’s take a system with two different feature implementations, offline and online. Suppose those implementations contain even the most minor differences, like imputing a null value slightly differently. In that case, the model will make predictions in the production system that don’t follow the same rules as the testing environment. In the best-case scenario, the predictions will be horribly wrong and, therefore, easily detected. In the worst case, the predictions are just not as good as they could be, and the problem can go undetected or misunderstood, resulting in significant losses over time.
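The null-imputation example above can be shown in a few lines. This is a deliberately minimal, hypothetical pair of implementations (not any particular system’s code): the offline pipeline imputes missing values with the column mean, while the online path defaults them to zero, and the same raw record yields different feature values in each.

```python
# Two feature implementations that differ only in null handling --
# enough to produce train-serve skew.
def offline_amounts(raw):
    # Training pipeline: impute missing amounts with the column mean.
    observed = [v for v in raw if v is not None]
    mean = sum(observed) / len(observed)
    return [v if v is not None else mean for v in raw]

def online_amount(v):
    # Serving pipeline: impute missing amounts with 0.0.
    return v if v is not None else 0.0

raw = [10.0, None, 30.0]
trained_on = offline_amounts(raw)          # [10.0, 20.0, 30.0]
served = [online_amount(v) for v in raw]   # [10.0, 0.0, 30.0]
print(trained_on, served)  # same raw data, different feature values
```

A model trained on the first vector but served the second is exactly the "features that would never exist in production" situation: the skew is silent unless the two pipelines are compared directly.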
Monitoring: Finally, monitoring feature values can also be very tricky. These monitoring challenges are, in a way, related to the risk of train-serve skew because a well-implemented monitoring system will make sure that the features the stores are serving are valid, conform to a specific schema and value range, and are consistent online and offline.
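A minimal version of that validation can be sketched as a pre-serving check on a batch of feature values. The thresholds, bounds, and function name here are illustrative assumptions, not a real monitoring product’s API:

```python
# Minimal feature validation: check null rate and value range
# before a batch of feature values reaches the model.
def validate(values, lo, hi, max_null_rate=0.01):
    nulls = sum(1 for v in values if v is None)
    if nulls / len(values) > max_null_rate:
        return False, "null rate too high"
    for v in values:
        if v is not None and not (lo <= v <= hi):
            return False, f"value {v} outside [{lo}, {hi}]"
    return True, "ok"

ok, _ = validate([0.2, 0.5, 0.9], 0.0, 1.0)
bad, reason = validate([0.2, 5.0], 0.0, 1.0)
print(ok, bad, reason)
```

Running the same checks against both the offline training data and the online store is one practical way to catch the online/offline inconsistencies mentioned above.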
Q: What is the percentage of enterprises already using real-time ML?
A: In one user study conducted by Chip, she found that only about 5% of the people she spoke to wanted to use real-time features. But when organizations did push to get the appropriate infrastructure in place, they found that moving to real-time ML significantly increased performance, user experience, and revenue for some of their use cases.
Real-time ML is also very use-case dependent. For instance, almost 100% of transaction-processing credit card companies already use real-time ML in one way or another. But getting the proper infrastructure in place is as crucial as it is challenging. So in the grand scheme of things, real-time ML is still in the super early innings of becoming mainstream for most companies.
Want to learn more? Check out the full Q&A or watch the full webinar recording, “How to Make the Jump From Batch to Real-Time Machine Learning,” for more information on how you can make the transition to real-time ML as smooth as possible.