Fighting Financial Crime With Machine Learning at Tide

apply(conf) - May '23 - 30 minutes

Tide offers business accounts to SMEs (small and medium enterprises) and is on a mission to save them time and money so they can get back to doing what they love. Our goal at Tide is to become the world’s leading business financial platform for business owners who are burdened by the numerous financial tasks required to run a successful business.

Tide uses data-driven decision-making to manage risk at different stages of the customer journey. We will be focusing on FinCrime risk management at Tide and the technical architecture associated with training, hosting, and running ML models to facilitate this.

Aravind:

This is Aravind. I am the lead DS at Tide in risk and compliance, and I have been working on FinCrime-based use cases with Tide for close to three years now. I’ve also worked with multiple FinTech startups over the years. I started out with enterprise application integration early on in my career and then moved into data science halfway through it, and I have close to two decades of experience in IT. But yeah, that’s about me. I’ll pass it on to Tocho Tochev, if he has not introduced himself yet, and then we’ll move on to the slides. Yeah.

Tocho Tochev:

Hi guys. My name is Tocho Tochev, I don’t know if you can see me because of the technical difficulties, but I’ve been a longtime Python developer. I have been with Tide for two years. I started in the ongoing monitoring domain and now I’m heading the ongoing infrastructure team at Tide. Over to you Aravind.

Aravind:

Thank you Tocho. If you can bring up the slides. Okay. I’m still seeing… Okay. Yeah. I think we have the slides up, so I think I will go ahead. The agenda for today’s discussion is going to be how we combat Financial Crime using machine learning at Tide. And this is the high-level agenda that we are going to go over. We’ll go over what business Tide is in and what services it provides to its customers. We’ll also go over the types of FinCrime that are addressed at Tide and how we tackle them at the moment in terms of ML models. And then we’ll do a deep dive into the technical architecture, and we’ll also go over the types of monitoring that we have set up to ensure that things are going smoothly.

Aravind:

We’ll also talk about model explainability with our models and we’ll conclude with some of the exciting things that we are doing in terms of innovation and as we’re looking forward to in the future. And we’ll also take on any questions that we might have towards the end.

Aravind:

So, to introduce Tide as a whole and the business that we are in, Tide actually offers a mobile based business account to its members or customers. And Tide is on a mission to actually save businesses time and money so that they can get back to doing what they love. And the vision for Tide is to become the world’s leading Business Financial Platform that is available to at least 25% of the world’s small and medium enterprises.

Aravind:

So, these are at a high level the mission and vision statements for Tide. If you move on to the next slide. This is a snapshot of all the services or some of the services that Tide provides to their consumers. And not just limiting to these, you have a reference to how the Tide app looks in the mobile app. And some of the services that Tide provides are the ability to have control over the spending or the expenses that are made by the company themselves.

Aravind:

Tide also provides cashflow insights to its members. There’s also a credit solution where business loans can be applied for and provided. There’s also invoice creation and tracking. These are some of the services, not an exhaustive list of course, that are provided as part of the Tide app to its customers. If we move on to the next slide.

Aravind:

So, at a high level, these are the strategic themes that we follow. As part of Financial Crime, we strive to come up with specialized solutions that help in detecting specific fraud typologies or FinCrime typologies in specific markets. We also strive to have one solution that can be applied across all regions. We also strive continuously to reduce friction for legitimate users while we handle financial crime. We also put a lot of focus on explaining the decisions that we make using our ML models, and another theme drives us to be thought leaders in terms of the anti-Financial Crime techniques and technologies that we use.

Aravind:

And we’ll actually delve deeper into each of these themes as we move along. So, if we move to the next slide. We will go over the FinCrime models that we have set up at Tide to begin with. And this is in line with the specialist theme that we have talked about. To define a Financial Crime.

Aravind:

Financial Crime could be defined as any criminal activity carried out by individuals or organizations so that they can reap economic benefits through illegal methods. There are different types of Financial Crime affecting the space that Tide is operating in, and some of them are money laundering, tax evasion and fraud. I think most of these Financial Crimes are self-explanatory, and they are faced by most FinTech players in this domain. We will go over how we are actually handling Financial Crime using different types of models in the subsequent slides.

Aravind:

When deciding on what makes a certain behavior suspicious, there are different factors that actually come into play. I mean, these are at a high level the different things that we usually check for. So for example, what is normal for a business? For this type of business? So for example, if you have a hair salon, is it usual for them to, let’s say, make purchases that are to do with let’s say luxury items or let’s say, what is normal for this particular user? So, this is a sort of segment of one view into the customer’s behavior. So for example, let’s say there’s a historical pattern of low transaction activity and overnight, let’s say they are starting to make really huge transactions, so that’s a view at a user level.

Aravind:

We also try to see the macro level view to see if the member is operating within the expected behavioral patterns or not. And there are also some additional checks, for example, the network effect, so how the member or the customer is interacting with their usual parties and so on. And we use ML for most of these complex decision making. And we also communicate with the actual user to clear out any suspicions that we have in terms of any markers or indicators that are to do with the FinCrime.
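As a rough illustration of the user-level and segment-level checks described above, the sketch below compares a new transaction amount against the member's own history and against a peer segment. The function names, the sample amounts, and the cut-off of three standard deviations are assumptions made for illustration; the talk does not describe how Tide computes these signals.

```python
from statistics import mean, pstdev


def z_score(value, history):
    """Standard score of `value` against `history`.

    Returns 0.0 when the history is too short or has no spread,
    which is a simplification for this sketch.
    """
    if len(history) < 2:
        return 0.0
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0
    return (value - mean(history)) / sigma


def suspicion_signals(amount, user_history, segment_history, threshold=3.0):
    """Flag whether `amount` is unusual for the user and for the peer segment."""
    return {
        "unusual_for_user": abs(z_score(amount, user_history)) > threshold,
        "unusual_for_segment": abs(z_score(amount, segment_history)) > threshold,
    }


# A member with a history of small payments suddenly sends a large one.
user_history = [20.0, 35.0, 25.0, 30.0, 40.0]
# Hypothetical amounts seen across businesses of the same type.
segment_history = [20.0, 50.0, 500.0, 1500.0, 3000.0, 80.0]

signals = suspicion_signals(2500.0, user_history, segment_history)
print(signals)
```

In this toy data the payment is far outside the member's own pattern but within the range seen for the segment, which is why both views are worth checking separately.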

Aravind:

Moving on to the next slide. This is the first ML model that we can talk through. This is when a new customer is trying to sign up for a Tide account; they use their mobile to sign up. As part of this onboarding process, they take a selfie, they’re also expected to upload some ID documents, and there is automatic matching of the selfie to the uploaded ID documents. There are also different types of information that we gather on these users from external parties. We have an ML model that uses all this data and generates a risk assessment for this member. And depending on the risk policy that is defined for KYC or onboarding risk, there’s a thresholding mechanism that is applied to place a new user into, let’s say, different risk bands.

Aravind:

And let’s say, if a user is found to be outside of the risk appetite, or high risk, then they are forwarded for manual review as part of the onboarding process. And this has been one of the key models, because while you want to handle Financial Crime, you also want a frictionless process, so that it’s easy for new members to sign up for Tide and have that good member experience.
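The thresholding step described above can be sketched as a mapping from a model score to a risk band, with only the highest band routed to manual review. The band boundaries, band names, and decision labels here are hypothetical; in practice they would come from the risk policy, which the talk does not spell out.

```python
from bisect import bisect_right

# Hypothetical band boundaries; real values would be set by the risk policy.
BAND_LOWER_BOUNDS = [0.3, 0.7]          # medium starts at 0.3, high at 0.7
BAND_NAMES = ["low", "medium", "high"]


def risk_band(score):
    """Map a model score in [0, 1] to a named risk band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    return BAND_NAMES[bisect_right(BAND_LOWER_BOUNDS, score)]


def onboarding_decision(score):
    """Send only high-risk applicants to manual review (assumed policy)."""
    return "manual_review" if risk_band(score) == "high" else "continue_onboarding"


for s in (0.1, 0.5, 0.95):
    print(s, risk_band(s), onboarding_decision(s))
```

Keeping the boundaries in data rather than in code means a policy change only touches configuration, and the low-risk majority never sees extra friction.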

Aravind:

Moving on, the next model that we are going to talk about is Ongoing risk. So, let’s say as part of the member journey or user journey, once they are onboarded, they can continue to make transactions. And there’s also this need of continuous assessment of FinCrime risk for the members once they get onboarded. And in contrast to the limited amount of data that we have on members when they get onboarded, we have much richer dataset available to us on an ongoing basis and it uses a lot of parameters and it’s not just restricted to let’s say transaction data or external data or behavioral data that we see.

Aravind:

All this data is actually considered to come up with a continuous risk assessment on members. And we also have a similar risk policy that is referred here, and periodically members are actually forwarded for additional review if we think that their risk is actually outside of the risk appetite as defined by the risk policy.

Aravind:

So moving on, we also have real-time transaction screening. This is a model that we use to screen transactions as they are made in real time, and it operates in a low latency environment: we have latency requirements of somewhere around 50 milliseconds. This model is responsible for screening for different types of Financial Crime. And this is a model type really; there are a lot of models that operate in the background, each screening for a different FinCrime type. Depending on the model prediction and the level of risk that is assessed, most transactions are actually allowed through, but the suspicious ones are blocked. And these also trigger investigations by specific teams.
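A minimal sketch of this screening flow, assuming one scorer per FinCrime type and a single blocking threshold, might look like the following. The two scoring functions are toy rules standing in for trained models, and the field names and threshold are invented for illustration; only the 50 millisecond budget comes from the talk.

```python
import time

LATENCY_BUDGET_MS = 50.0   # rough figure quoted in the talk
BLOCK_THRESHOLD = 0.8      # hypothetical risk cut-off


def app_scam_score(txn):
    """Toy rule standing in for a trained model (hypothetical)."""
    return 0.9 if txn["new_payee"] and txn["amount"] > 1000 else 0.1


def money_laundering_score(txn):
    """Toy rule standing in for a trained model (hypothetical)."""
    return 0.85 if txn["rapid_in_out"] else 0.05


SCORERS = {
    "app_scam": app_scam_score,
    "money_laundering": money_laundering_score,
}


def screen(txn):
    """Score a transaction with every model and block it if any score is high."""
    start = time.perf_counter()
    scores = {name: scorer(txn) for name, scorer in SCORERS.items()}
    top = max(scores, key=scores.get)
    blocked = scores[top] >= BLOCK_THRESHOLD
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {
        "action": "block" if blocked else "allow",
        "open_investigation": blocked,      # blocked transactions go to a team
        "top_typology": top,
        "scores": scores,
        "within_budget": elapsed_ms <= LATENCY_BUDGET_MS,
    }


routine = screen({"amount": 40.0, "new_payee": False, "rapid_in_out": False})
suspicious = screen({"amount": 5000.0, "new_payee": True, "rapid_in_out": False})
print(routine["action"], suspicious["action"], suspicious["top_typology"])
```

Recording which typology drove the block gives the triage team a starting point, and measuring elapsed time inside the call makes it easy to alert when the budget is exceeded.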

Aravind:

And the investigations also have a detailed process. For example, we have an initial triaging team that looks for false positives, and we have a complex investigations team that looks for, let’s say, confirmed fraudulent activity. Investigations can also lead to an account getting blocked or, for a confirmed fraud, the case being forwarded to the NCA and the member subsequently being off-boarded completely from the platform.

Aravind:

That’s a usual process flow for this model. I think this concludes the discussion about the FinCrime models that we have. I will pass it on now to Tocho Tochev to talk of the technical architecture. Yeah, thank you.

Tocho Tochev:

Thanks Aravind. So, we’re going to talk about the technical architecture, which is part of our global strategic theme. And as part of this theme, we strive to provide a single platform for all of our markets and approaches.

Tocho Tochev:

Why do we need this? Well, first of all, the process of providing data to facilitate machine learning models, providing the input and monitoring the quality of the output is time-consuming, labor-intensive and error-prone. And we have identified early on that this can be streamlined and sped up by a good machine learning pipeline.

Tocho Tochev:

On the right, you can see a diagram from mlops.org where this process is highlighted, and we pretty much follow this diagram. Everything starts with the data. So, what is our data pipeline for the training data at Tide? We start with upstream data sources, which are the backend databases, backend event streams, events from the Tide apps (Android, iOS, and web), and also some third parties.

Tocho Tochev:

And this data is then extracted and loaded. We use several different tools for extracting and loading this data. For databases, we usually use AWS DMS jobs or Fivetran; for the events, we use Fivetran or some custom integrations; for the third parties, we use custom integrations in the form of Python and Airflow. And for tracking we use Segment.

Tocho Tochev:

Once this data is extracted, it gets loaded into our data warehouse, first into the raw data portion of the warehouse, and then it is transformed using dbt into the final data that is used by the business for generating business reports, or by the data scientists and business analysts to get insights from the data.

Tocho Tochev:

The data from our data warehouse is fed into our Databricks environment, where the machine learning magic happens and where we have our Tecton feature store. And what is a feature store, you might ask? A feature store is a machine-learning-specific data system that runs pipelines that transform raw data into feature values.

Tocho Tochev:

It stores and manages the feature data itself and serves the feature data consistently for training and inference purposes, basically providing some kind of a point-in-time state of the system. We identified early on that we would greatly benefit from a feature store, and we have adopted Tecton. Here you can see sample code from Tecton (this is not actual Tide code), which creates a basic feature using some underlying data and applying some transformation.
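To make the point-in-time idea concrete, here is a small in-memory sketch of a store that answers "what was this feature's value as of time T". It is not Tecton's API; the class, method names, entity IDs, and integer timestamps are all invented for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class PointInTimeStore:
    """Toy feature store keeping a timestamped history per entity."""

    def __init__(self):
        self._timestamps = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, entity_id, timestamp, value):
        """Record a feature value, keeping each entity's history sorted by time."""
        ts = self._timestamps[entity_id]
        idx = bisect_right(ts, timestamp)
        ts.insert(idx, timestamp)
        self._values[entity_id].insert(idx, value)

    def read_as_of(self, entity_id, timestamp):
        """Return the latest value written at or before `timestamp`, else None."""
        ts = self._timestamps.get(entity_id, [])
        idx = bisect_right(ts, timestamp)
        return self._values[entity_id][idx - 1] if idx else None


store = PointInTimeStore()
store.write("member-1", 100, 3)   # e.g. 3 transactions in the last 7 days
store.write("member-1", 200, 7)

# Training: look up the value as it was when the labelled event happened.
print(store.read_as_of("member-1", 150))
# Inference: look up the value as of "now".
print(store.read_as_of("member-1", 250))
```

Using the same lookup for training and inference is what keeps the two consistent: a training row built for time 150 can never see the value written at time 200.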

Tocho Tochev:

Why are we using Tecton at Tide? Well, there are several reasons. There are many benefits from Tecton, but the ones that we are most interested in, I don’t know if you can see them behind the picture in picture, are the ingestion of data in real time and being able to serve this data immediately upon request. Tecton is also able to meet our low latency and high throughput requirements, which we need for our transaction screening use cases and for other machine learning use cases which rely on real-time data.

Tocho Tochev:

And of course, we like the support that we get from the Tecton team. That being said, as mentioned in the previous presentation, one will always hit some performance limitations where a custom solution might be needed, but it is pretty good to have a generic solution. It gets the job done.

Tocho Tochev:

Next, as you can see, Tecton is quite central to our training and inference architecture. On the top you can see our training architecture: we have Databricks notebook jobs which extract data from Tecton and from our data warehouse, and they use various model training libraries like XGBoost and scikit-learn to train our models as part of a CI/CD pipeline, which registers the models in our model registry, which is MLflow.

Tocho Tochev:

And for inference, we have two different types: one is the batch architecture and one is the online architecture. The batch architecture is for things that are done in bulk. For instance, it can be periodic review, it can be generates. The batch architecture is pretty simple. It consists of scheduled Databricks jobs which run in a dedicated Databricks environment. They fetch the models from MLflow and get data from Tecton and from our warehouse in order to make inferences. And the results are written to our data warehouse or to Kafka as events, where they’re consumed by various backend services.

Tocho Tochev:

Our online architecture is much more complex, and we have different triggers. One type of trigger is events, where we either feed the raw events to Tecton or augment them with external services and feed this data into Tecton streaming features. The other is HTTP requests, which call our service, which internally calls Tecton. Both decisioning service types fetch the models from MLflow and use Tecton for retrieving features.

Tocho Tochev:

We use it as a database, and the result is output to Kafka, where it is used by backend services to take further actions, and also to our data warehouse for audit and explainability purposes. And as you can see, this is quite complex and, as you saw, technical issues come up all the time, including in this presentation.

Tocho Tochev:

The next topic is monitoring, where we strive to provide low friction to our members. First, I’m going to talk about real-time monitoring. We use real-time monitoring to decide whether we should scale our services up or down automatically, whether our blue-green deployment is successful, and whether our shadow testing is progressing. Here, you can see screenshots from our Datadog monitoring service. And the things to monitor for are: decision outcome rate, decision volume, error rate, latency, resources used in terms of memory, CPU, et cetera, and exceptions that are generated by the models or by the services.

Tocho Tochev:

And optionally, depending on the use case and the requirements, keep a record of all the decisions. At Tide we need to keep a record of some decisions for regulatory purposes. A couple of important things when you do monitoring are to have somewhere to visualize the metrics, a good mechanism to get notified, and a handling procedure to know who should investigate and when.
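The metrics listed above (decision outcome rate, volume, error rate, latency) can be sketched as a rolling window with simple alert rules. This is a stand-in written with the standard library, not the Datadog setup shown on the slide; the class, window size, and alert limits are assumptions.

```python
from collections import deque


class DecisionMonitor:
    """Rolling window over recent decisions (hypothetical helper)."""

    def __init__(self, window=1000):
        self._events = deque(maxlen=window)

    def record(self, outcome, latency_ms, error=False):
        self._events.append((outcome, latency_ms, error))

    def snapshot(self):
        n = len(self._events)
        if n == 0:
            return {"volume": 0}
        latencies = sorted(e[1] for e in self._events)
        rank = (95 * n + 99) // 100          # nearest-rank 95th percentile
        return {
            "volume": n,
            "block_rate": sum(e[0] == "block" for e in self._events) / n,
            "error_rate": sum(e[2] for e in self._events) / n,
            "p95_latency_ms": latencies[rank - 1],
        }


def alerts(snapshot, max_error_rate=0.01, max_p95_ms=50.0):
    """Return the names of metrics that breach the (assumed) limits."""
    breached = []
    if snapshot.get("error_rate", 0.0) > max_error_rate:
        breached.append("error_rate")
    if snapshot.get("p95_latency_ms", 0.0) > max_p95_ms:
        breached.append("p95_latency_ms")
    return breached


monitor = DecisionMonitor()
for i in range(1, 21):
    monitor.record(
        outcome="block" if i <= 2 else "allow",
        latency_ms=float(i),
        error=(i == 20),
    )

snap = monitor.snapshot()
print(snap, alerts(snap))
```

The alert list is what would feed the notification mechanism, and each metric name can map to an owner in the handling procedure.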

Tocho Tochev:

And the other type of monitoring that should be done, and that we do at Tide, is long-term monitoring of KPIs. Why do we need long-term monitoring? For a lot of things related to FinCrime, we have a long feedback loop. Why? Because investigations take time. There are sometimes external entities involved, such as banks, the users, et cetera. We keep track of our false negatives, we receive external fraud reports, and all of this goes into our KPIs and into further training of the models.

Tocho Tochev:

The [inaudible 00:24:49] important thing to watch out for when doing monitoring is to keep track of any potential feature drift, because things change. And this brings us to the next section, which is going to be presented by Aravind, on Model Explainability.
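One common way to quantify feature drift is the Population Stability Index (PSI), which compares how a feature is distributed across bins at training time and in live traffic. The talk does not say which drift measure Tide uses, so the sketch below is only an example of the idea; the bin edges and sample values are made up.

```python
from math import log


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index of `actual` against `expected`.

    `edges` are ascending bin boundaries; a value v falls in the bin
    indexed by how many edges are <= v. Empty bins are floored at `eps`
    so the logarithm stays defined.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v >= e for e in edges)] += 1
        return [max(c / len(values), eps) for c in counts]

    e_shares, a_shares = shares(expected), shares(actual)
    return sum((a - e) * log(a / e) for e, a in zip(e_shares, a_shares))


edges = [10.0, 20.0]
training = [5.0] * 50 + [15.0] * 30 + [25.0] * 20   # mostly small amounts
live = [5.0] * 20 + [15.0] * 30 + [25.0] * 50       # mostly large amounts

print(psi(training, training, edges))
print(psi(training, live, edges))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though the right limit depends on the feature and the use case.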

Aravind:

Thank you, Tocho. We will talk about Model Explainability theme. We’ll go over why Model Explainability is important and how it fits into the overall Model observability. So, if you can move to the next slide.

Aravind:

Model observability is key for Tide because I think it encompasses multiple things. We have not just model monitoring, but we also have to be able to explain our predictions. I’m not sure if you’re able to see the picture at the bottom right due to the videos that we have there, but to walk through it. Most of the time the data that we actually use to train our model can be slightly different from the data or it could be very different from the data that we actually encounter when we use the model as part of inference pipeline. And this skew between training and inference data can grow over time and not just that.

Aravind:

I mean, there could be inherent issues with regard to the bias that we introduce in the data. Let’s say, if we have specific demographics underrepresented in the data or specific FinCrime types that are not properly represented. There could also be data integrity issues. Due to all these reasons, Model Observability is quite important. More so, the reason why we need it is to get a deep understanding of the data that we are using and of the model performance over its life cycle, and to also be able to explain model predictions. And in terms of why it is crucial for Tide as part of the FinTech space:

Aravind:

Tide is under legal obligation under GDPR to ensure, let’s say, we have transparency and explainability of decisions that we are making using personal data. And this is also to ensure fair access to services where there is no discrimination against marginalized users or who are underrepresented in the training dataset.

Aravind:

If you move to the next slide. We are actually handling this by partnering with Fiddler. Fiddler is the model observability solution that we are using at Tide, and it helps us to monitor model performance and to do drill-down and root cause analysis when we encounter issues. We can also set up dashboards to look at how the performance is changing over a period of time. So, I have attached a couple of pictures here, which are representative images of what we would see when we use Fiddler.

Aravind:

On the left side you would see how the performance or the distribution of data changes over time. And on the right side you would see the point explanations that we can generate out of Fiddler for specific instances. For example, let’s say the complex investigations team reaches out to us to figure out the reasons for either alerting on a legitimate user (a false positive) or missing an actual fraudulent case.
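To give a feel for what a point explanation contains, the sketch below attributes a single prediction of a linear scoring model to its input features, relative to a baseline member. This is not how Fiddler computes explanations, and the weights, feature names, and baseline are invented; it only shows the shape of the output an investigator might read.

```python
from math import isclose

# Hypothetical linear risk model (log-odds scale).
BIAS = -2.0
WEIGHTS = {
    "txn_amount_zscore": 0.8,
    "new_payee": 1.5,
    "account_age_years": -0.4,
}
# Hypothetical "typical member" used as the reference point.
BASELINE = {"txn_amount_zscore": 0.0, "new_payee": 0.0, "account_age_years": 2.0}


def linear_score(features):
    """Raw model score for one instance."""
    return BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def explain(features):
    """Per-feature contribution to (score - baseline score), largest first.

    For a linear model these contributions add up exactly to the
    difference between the instance score and the baseline score.
    """
    contributions = {
        name: WEIGHTS[name] * (features[name] - BASELINE[name]) for name in WEIGHTS
    }
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


alerted = {"txn_amount_zscore": 3.0, "new_payee": 1.0, "account_age_years": 0.5}
explanation = explain(alerted)
for name, value in explanation:
    print(f"{name}: {value:+.2f}")
```

The ranked list answers the investigator's question directly: which inputs pushed this member's score up, and by how much compared with a typical member.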

Aravind:

In those instances, and also to corroborate evidence for, let’s say, audits as well, we make use of Fiddler extensively, both for model monitoring and for model explanations. That’s about model observability.

Aravind:

Moving on to the next theme. This is the last thing that we are going to cover as part of today’s discussion. It’s about the innovation and the future work that we have set out to do in terms of FinCrime. So, it is business as usual for the models to be periodically retrained, but we also make a consistent effort to use more and more data, so that we have a 360-degree view of what the customer is doing. This can mean partnering with external vendors to get hold of a much richer data set on customers.

Aravind:

And in addition to that, we are also looking very closely at the usage of synthetic data at the moment, we have partnered with FinCrime Dynamics and with this partnership we are actually trying to see how we can reduce the reliance on personal data and how well can we use synthetic data to represent, let’s say, underrepresented or evolving FinCrime typologies. And the case in point is APP scams currently in the UK market.
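As a very rough illustration of how synthetic records can stand in for an underrepresented typology, the sketch below generates labelled, clearly flagged synthetic transactions from hand-set distributions. It does not reflect how FinCrime Dynamics produces data; every field name and distribution parameter is an assumption chosen for the example.

```python
import random


def synthetic_app_scam_transactions(n, seed=0):
    """Generate `n` synthetic transactions for a rare typology.

    Uses a seeded generator so the same seed reproduces the same rows.
    No real customer data is involved.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rows.append({
            "amount": round(rng.lognormvariate(7.0, 0.8), 2),   # skewed, large amounts
            "new_payee": True,
            "minutes_since_payee_added": rng.randint(0, 30),
            "label": "app_scam",
            "synthetic": True,                                  # keep provenance explicit
        })
    return rows


sample = synthetic_app_scam_transactions(5, seed=1)
for row in sample:
    print(row)
```

Flagging each row as synthetic keeps it separable from real data during evaluation, so model performance can still be measured on genuine cases only.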

Aravind:

And also, this helps us when we actually launch ML Models into let’s say new markets completely. These are some instances where synthetic data is helping us. And in addition to what we have listed out here, we are also having a big focus on how can we use generative AI in terms of servicing our members and also for internal consumption and so on. I mean, these are some things that we are currently looking into and in terms of an innovation team, and this actually concludes the slides that we have today. In case there are any questions, we can take those on at the moment. Yeah, thank you.

Speaker 3:

Excellent. There actually are some questions that I will start off right now with, and then I imagine as the questions keep rolling in, I’ll keep asking them. So, why were two architectures online and offline used for the decision making process when doing global monitoring?

Tocho Tochev:

The offline architecture is for things that are done in batch. It’s more efficient for things that are done in batch.

Tocho Tochev:

And the online is for things that need to be answered immediately. For instance, when there is an event that demands response right away, such as transaction or something else.

Speaker 3:

Awesome. All right, fellas. Well, I am going to say thank you very much for this excellent talk.

Aravind Maguluri

Lead Data Scientist

Tide

Tocho Tochev

Lead ML Engineer

Tide
