Fraud Prevention: Best Practices In ML Observability

apply(conf) - May '23 - 10 minutes

Fraud takes many forms across industries and is constantly evolving. Data scientists and MLOps professionals must similarly evolve in real time, staying a step ahead of model performance degradation and new attack vectors. In this 10-minute talk, we will cover best practices in ML observability for detecting and preventing fraud across industries. We will also discuss a novel approach to anomaly and drift detection using embeddings, UMAP dimensionality reduction, non-parametric clustering, and data visualization.

Dat Ngo:

So, hello, everyone. My name is Dat Ngo, I am a solutions architect here at Arize. So, the first thing I want to talk with you about is really how does ML observability fit into the stack? And so the ML stack might include things like your data source, you might have a feature store like Tecton, for instance. You may have a model store, something open source like MLflow, and obviously something that’s doing some sort of model serving. And really that’s where Arize, or ML observability, fits in: when your models are actually serving in production. So as an example, I’m going to share my screen really quickly. And so let me know if you can see my screen.

Dat Ngo:

But really what I want to chat about today is what is ML observability? If I had to put it very simply, it’s two things. The first thing ML observability does is let me understand when something has gone wrong with my models in production.

Dat Ngo:

And really the second is, let me share my screen here. The second is really, once you know something has gone wrong, you really want to understand why did it happen, where did it happen and how can I prevent this in the future? So that’s what we think about when we think about observability. So maybe you’re in the fraud space and you have a fraud model that’s kind of helping you understand, hey, are these transactions fraudulent or not fraudulent? So what are some of the issues you might be seeing in production? Well, it might actually look something like this.

Dat Ngo:

These problems can range from performance issues to data drift, all the way to data quality issues and beyond. But the idea here is there’s a multitude of things that can happen in production, and you really want to understand when they happen and how you can pinpoint when these things happen. What’s the difference between these kind of two things? And so we’re going to take a quick look. I’m going to do a very fast kind of demo to show you and help you understand what are the things I can discover using ML observability.

Dat Ngo:

And it might all start off with something like this. It’s an alert that says, hey, maybe we have an issue with our model. Maybe there’s something that’s kind of gone wrong, and it will take us into the model and we can understand things like, hey, is the performance of my model up to par? Are there any hotspots in my features where it may not be performing super well? How is this data performing compared to maybe what I trained my model on? Is there a big difference between the two? Can we surface up what those issues are? And so that’s for things like performance.

Dat Ngo:

One thing I recommend for folks when they’re looking at performance is a lot of times us data scientists and machine learning engineers like to look at the statistically based kind of metrics. You can think false negative rate, AR, PR AUC, AUC, things like that. But it’s always important to tie this to the business, meaning maybe I want a metric that actually measures what is the total fraud caught in dollars by this model. Maybe I want to understand that and see, hey, are there any areas that maybe I’m missing fraud. When we think about data drifting or how things are moving, maybe at time of training, I knew the distributions of my model, but maybe those distributions have shifted since I’ve last deployed the model and maybe I don’t have production kind of [inaudible 00:03:46]. Maybe I don’t actually know if it’s fraud or not fraud. How can I tell if my model is healthy or not healthy?
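As an illustration of the business-facing metric described above, here is a minimal sketch of "total fraud caught in dollars." The `Transaction` fields and the `fraud_dollar_metrics` helper are hypothetical names chosen for this example, not part of any Arize API, and the sample amounts are made up. It also assumes ground-truth labels are available.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float          # transaction size in dollars (hypothetical field)
    predicted_fraud: bool  # the model's decision
    actual_fraud: bool     # ground-truth label, once it is known


def fraud_dollar_metrics(transactions):
    """Return dollars of fraud caught, dollars missed, and dollar-weighted recall."""
    caught = sum(t.amount for t in transactions
                 if t.actual_fraud and t.predicted_fraud)
    missed = sum(t.amount for t in transactions
                 if t.actual_fraud and not t.predicted_fraud)
    total = caught + missed
    # Guard against a window that contains no confirmed fraud at all.
    dollar_recall = caught / total if total else 0.0
    return {"caught": caught, "missed": missed, "dollar_recall": dollar_recall}


# Made-up sample data, for illustration only.
sample = [
    Transaction(1200.0, True, True),    # fraud, caught
    Transaction(300.0, False, True),    # fraud, missed (false negative)
    Transaction(50.0, True, False),     # legitimate, flagged
    Transaction(80.0, False, False),    # legitimate, passed
]
metrics = fraud_dollar_metrics(sample)
print(metrics)
```

Weighting by dollars rather than by count means a single large missed transaction shows up more strongly than many small ones, which is usually closer to how the business experiences fraud losses.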

Dat Ngo:

But we can use things like drift. Very conceptually, the way to understand drift is this: I’m looking at two distributions, and you can think the blue one here is production, the purple one here is my baseline, or maybe training time. When this metric is very low, the two distributions are more similar. And when this metric is high, they’re much more different. So you can think of drift as a kind of risk metric to the model. And of course we can look at all the input features, how those things are changing. And so those are the things that I want you to wrap your head around as far as what ML observability really helps with. It helps you understand when things go wrong and why things go wrong and what can you do to make those things better.
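The talk does not name the drift metric shown on screen. As one common choice, the sketch below computes the population stability index (PSI) between a baseline sample and a production sample of a single numeric feature. The function name, the bin count, and the sample values are assumptions made for illustration.

```python
import math


def population_stability_index(baseline, production, bins=10, eps=1e-6):
    """PSI between two samples of one numeric feature.

    Low values mean the distributions are similar; high values mean they differ.
    """
    lo = min(min(baseline), min(production))
    hi = max(max(baseline), max(production))
    # Equal-width bins over the combined range; fall back to 1.0 if all values match.
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # Floor each fraction at eps so empty bins do not cause log(0).
        return [max(c / len(values), eps) for c in counts]

    p = bin_fractions(baseline)
    q = bin_fractions(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Made-up samples: the production values sit in the upper half of the baseline range.
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
print(psi_same, psi_shifted)
```

Because PSI needs only the feature values and no fraud labels, it can be tracked even when ground truth arrives late, which is the situation described just before this point in the talk.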

Dat Ngo:

The next thing I want to cover is maybe some new techniques in the field that are kind of happening. So foundational to a lot of the newer models, a lot of the cutting edge of the future, are embeddings. So if you’re not familiar with embeddings, you can think of them as mathematical representations of data. Usually it’s unstructured data. You can think computer vision models or language models. I’m sure we’ve all heard of large language models as well. Inputs and outputs of those can all be represented as embeddings. You can think of these as a set of numbers that represent that unstructured medium, and one cool property of these embeddings is that embeddings that semantically mean the same, or are like each other, are close to each other in the embedding space. And obviously if they’re not, they’re going to be further away. So what are some of the things we can do with these embeddings?
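To make "close to each other in the embedding space" concrete, here is a minimal sketch using cosine similarity, which is one common way to compare embeddings (the talk does not say which measure is used). The three-dimensional vectors are toy values invented for this example; real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings, invented for illustration only.
grocery_purchase = [0.9, 0.1, 0.0]
supermarket_purchase = [0.8, 0.2, 0.1]
wire_transfer_abroad = [0.0, 0.2, 0.9]

sim_like = cosine_similarity(grocery_purchase, supermarket_purchase)
sim_unlike = cosine_similarity(grocery_purchase, wire_transfer_abroad)
print(sim_like, sim_unlike)
```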

Dat Ngo:

Well, we can actually take these embeddings and measure them and figure out, hey, is this thing drifting relative to this thing? Is the real world around my model changing? And so when we look at how this applies to the fraud world, we can do things like this: I can actually take my tabular fraud data, if I know my transaction size, the user, where they spent, what they spent their money on, how much, things like that, and I can actually encode this data into an embedding. And we can understand univariately how things are changing, but maybe we want to understand in a multivariate sense how many things are changing relative to each other.
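One simple way to put a number on "is this thing drifting relative to this thing" is the Euclidean distance between the average (centroid) embedding of the baseline set and that of the production set. This is a sketch of that idea under stated assumptions: the talk does not specify the metric, the helper names are hypothetical, and the two-dimensional vectors are toy values. How the tabular rows are encoded into embeddings is left out, since the talk does not describe it.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def embedding_drift(baseline_vectors, production_vectors):
    """Euclidean distance between the two centroids; 0.0 means no shift in the mean."""
    return math.dist(centroid(baseline_vectors), centroid(production_vectors))


# Toy embeddings, invented for illustration only.
baseline_vectors = [[0.0, 0.0], [2.0, 2.0]]     # centroid is [1.0, 1.0]
production_vectors = [[3.0, 4.0], [5.0, 6.0]]   # centroid is [4.0, 5.0]

drift = embedding_drift(baseline_vectors, production_vectors)
print(drift)
```

A centroid distance captures a shift in the bulk of the data across all encoded features at once, which is the multivariate view mentioned above. It will not detect changes that leave the mean in place, so it is usually paired with a visual check of the point cloud.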

Dat Ngo:

So what we can look at here is, in this visualization we can see, hey, when I’m comparing the data that I used for training versus the data that we’re getting in production, we can measure, hey, how different or similar are they? As I said, embeddings are very high-dimensional math representations. You can actually project those math representations into a more intuitive space for a human, meaning instead of asking you to picture a thousand dimensions, what we’re going to do is project these transactions into a three-dimensional space.

Dat Ngo:

So here what we can see is each of these points represents a transaction. And what you can do here is get a sense, because if you’ve ever dealt with fraud attacks, you understand that there’s certain signatures to fraud attacks, meaning fraud attacks don’t happen the same way every time. Actually, as one kind of fraud attack becomes known, right, you can think lenders and people who catch fraud kind of catch onto it. So fraud is constantly kind of a moving target.

Dat Ngo:

But what we can actually look at here is we can project each transaction into the 3D space. We can see, hey, what does my baseline data look like, and does it overlap really well with my production data? We can actually slice and color by tabular data on this kind of embedding space. So it’s like, hey, if I wanted to see, hey, based off loan amount, do I see any patterns in this embedding shape? Is there anything that’s happening?

Dat Ngo:

Let’s say we actually know which transactions are fraud. We can say, hey, my model guessed correctly that this was fraud or wasn’t fraud. Or if we want to color by our confusion matrix, for instance, and we want to optimize for our false negative rate, if we look down here, I can remove everything but those false negatives, meaning where is my model missing fraud? Is there something that these points kind of have in common? Is there something that they share from a fraud attack kind of perspective? Maybe it’s a certain transaction size with a certain location with a certain kind of user, things like that. But I really wanted to show the audience where the leading edge of fraud catchers’ thinking is: really getting into this embedding space would be super important for them.
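The false-negative filtering described above can be sketched without any visualization: label each record by its confusion-matrix cell, keep only the missed fraud, and count which attribute values those records share. The record fields (`location`, `size_bucket`) and helper names are hypothetical, and the data is made up for illustration.

```python
from collections import Counter


def confusion_label(predicted_fraud, actual_fraud):
    """Name the confusion-matrix cell for one prediction."""
    if actual_fraud:
        return "true_positive" if predicted_fraud else "false_negative"
    return "false_positive" if predicted_fraud else "true_negative"


def false_negative_profile(records, fields):
    """Count attribute values among records where the model missed fraud."""
    missed = [
        r for r in records
        if confusion_label(r["predicted_fraud"], r["actual_fraud"]) == "false_negative"
    ]
    return {field: Counter(r[field] for r in missed) for field in fields}


# Made-up records with hypothetical fields, for illustration only.
records = [
    {"location": "TX", "size_bucket": "large", "predicted_fraud": False, "actual_fraud": True},
    {"location": "TX", "size_bucket": "large", "predicted_fraud": False, "actual_fraud": True},
    {"location": "CA", "size_bucket": "small", "predicted_fraud": True, "actual_fraud": True},
    {"location": "NY", "size_bucket": "small", "predicted_fraud": False, "actual_fraud": False},
]
profile = false_negative_profile(records, ["location", "size_bucket"])
print(profile["location"].most_common(1))
print(profile["size_bucket"].most_common(1))
```

If one value dominates the counts for several fields at once, that combination is a candidate signature for the fraud the model is missing.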

Dat Ngo:

And so that’s what I have to show the audience today. Really excited to show, hey, what are the insights we can use with the embeddings that we have for tabular data? Happy to take on other questions as well. So thank you so much for attending this Lightning Talk with me. My name is Dat Ngo, I’m a solutions architect here at Arize. Thank you so much.

Dat Ngo

Data Scientist and ML Engineer

Arize

Dat Ngo is a data scientist and machine learning engineer who works directly with Arize AI users to monitor and improve their ML models. Before Arize, Ngo led strategic data science efforts at PointPredictive, alliantgroup, and Wood Mackenzie. Ngo has a Master of Science in Applied Statistics from Texas A&M University.
