Drift-Aware ML Systems

Tecton and Arize Integration Guide
Predictive models are used to make some pretty important calls, like whether a credit card transaction is legitimate or not. But your models are only as good as the data you feed them, and that data can change over time or even break upstream of your model. So how do you make sure that the online feature values used for inference are still aligned with the features used to train your models?
Why feature drift detection is important

Model Performance Stability:
Models are typically trained on historical data. When the properties of the input features change significantly, the model might not generalize well to new data, leading to performance degradation.
Early Warning System:
Detecting drift in features can act as an early warning mechanism. Before noticeable deterioration in predictive performance occurs, teams can be alerted to the changes in data distributions and take proactive measures.
Data Integrity and Reliability:
Regularly monitoring for feature drift helps ensure that the incoming data remains consistent with the data used in training. This is critical in regulated industries or any high-stakes environments where decision-making relies on accurate predictions.
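To make "drift" concrete, here is a minimal sketch of one common drift metric, the Population Stability Index (PSI), computed between a training baseline and a production sample of a single feature. The choice of PSI, the function name and the synthetic data are assumptions for illustration only; this guide does not state which metric Arize applies.

```python
import numpy as np

def population_stability_index(baseline, production, bins=10, eps=1e-6):
    """PSI between two 1-D samples, binned on baseline quantiles."""
    baseline = np.asarray(baseline, dtype=float)
    production = np.asarray(production, dtype=float)
    # Interior quantile edges from the baseline: each bin holds about 1/bins of it.
    inner_edges = np.quantile(baseline, np.linspace(0.0, 1.0, bins + 1))[1:-1]
    # searchsorted maps every value to a bin index in [0, bins - 1];
    # values outside the baseline range fall into the first or last bin.
    base_counts = np.bincount(np.searchsorted(inner_edges, baseline, side="right"), minlength=bins)
    prod_counts = np.bincount(np.searchsorted(inner_edges, production, side="right"), minlength=bins)
    # Clip the fractions so empty bins do not produce log(0) or division by zero.
    base_frac = np.clip(base_counts / len(baseline), eps, None)
    prod_frac = np.clip(prod_counts / len(production), eps, None)
    return float(np.sum((prod_frac - base_frac) * np.log(prod_frac / base_frac)))

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
train_amounts = rng.normal(50.0, 10.0, 5000)    # training baseline
same_amounts = rng.normal(50.0, 10.0, 5000)     # production, same distribution
shifted_amounts = rng.normal(65.0, 10.0, 5000)  # production, mean has shifted

psi_same = population_stability_index(train_amounts, same_amounts)
psi_shifted = population_stability_index(train_amounts, shifted_amounts)
```

A sample drawn from the same distribution yields a PSI close to zero, while the shifted sample yields a much larger value. That gap is the signal a drift monitor watches for.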
An approach for implementing feature drift detection
A feature platform like Tecton uses consistent feature definitions for offline training and online serving, which removes the initial risk of training/serving misalignment. It manages the data pipelines and continuously keeps features fresh and ready to serve for inference.
While feature platforms usually include some basic data quality metrics and even data quality validation, the feature platform itself doesn’t usually offer extensive facilities for monitoring model performance or feature drift.
For ML use cases, Arize offers a robust solution for monitoring and managing machine learning model performance in production, specifically by tracking data quality, performance, and drift – both in model output, and in the features supplied as inputs. Arize provides a powerful early warning system that closes the loop in the ML lifecycle to drive ongoing model improvement.
Integration of a feature platform and Arize’s monitoring capabilities is fairly simple. Since the feature platform is used to create training datasets and serve features online for inference, we can serve those same feature values and predictions to Arize for logging and monitoring.
Tecton + Arize Integration Data Flow
When training data is used to build a model, it can be uploaded to Arize to create the training baseline that establishes the expected feature and prediction distributions. When a model makes predictions using features from Tecton, the event identifier, the resulting prediction and the input features are logged in Arize. Later, when the events are labeled, the ground truth associated with each event is uploaded to Arize. Arize can then track feature and model drift by comparing production values to the training baseline.
Here’s an example that uses Tecton training data and inference time logging with Arize.
Setting up a fraud detection feature service in Tecton
For model training, we start with some labeled event data. In this fraud detection example the label is “is_fraud”:
Tecton uses feature views to transform raw streaming, batch and real-time data into features used for both training and serving models online.
First, we’ll create a streaming feature view to describe recent account activity.
With Tecton, features are managed as Python code modules, typically written in interactive notebooks. When this code is applied to Tecton, data pipelines are created that keep feature values up to date for training and serving, so you are always ready to train, retrain or serve models in production.
A Streaming Feature
This fraud detection streaming feature example calculates recent spending totals, plus 30-day moving means, counts and standard deviations:
from datetime import timedelta

from tecton import Aggregate, stream_feature_view
from tecton.types import Field, Float64

# transactions_stream (stream source) and user (entity) are defined elsewhere in the repo
@stream_feature_view(
    source=transactions_stream,
    entities=[user],
    mode="pandas",
    timestamp_field="timestamp",
    features=[
        Aggregate(name="sum_amount_10min", function="sum",
                  input_column=Field("amount", Float64),
                  time_window=timedelta(minutes=10)),
        Aggregate(name="sum_amount_last_24h", function="sum",
                  input_column=Field("amount", Float64),
                  time_window=timedelta(hours=24)),
        Aggregate(name="amount_mean_30d", function="mean",
                  input_column=Field("amount", Float64),
                  time_window=timedelta(days=30)),
        Aggregate(name="amount_count_30d", function="count",
                  input_column=Field("amount", Float64),
                  time_window=timedelta(days=30)),
        Aggregate(name="amount_stddev_30d", function="stddev_samp",
                  input_column=Field("amount", Float64),
                  time_window=timedelta(days=30)),
    ],
)
def user_txn_recent_activity(transactions_stream):
    # pass through only the columns the aggregations need
    return transactions_stream[["user_id", "timestamp", "amount"]]
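To get a feel for what these windowed aggregations compute, here is a rough local sketch that uses pandas time-based rolling windows on a handful of made-up transactions. It only approximates the idea; Tecton's actual window boundaries and computation may differ, and the sample data and variable names are assumptions.

```python
import pandas as pd

# Made-up transactions for two users, for illustration only.
txns = pd.DataFrame({
    "user_id": ["u1", "u2", "u1", "u1", "u1"],
    "timestamp": pd.to_datetime([
        "2025-04-01 10:00:00", "2025-04-01 10:02:00", "2025-04-01 10:05:00",
        "2025-04-01 10:20:00", "2025-04-08 09:00:00",
    ]),
    "amount": [20.0, 500.0, 30.0, 50.0, 100.0],
})

# Time-based rolling windows need a sorted DatetimeIndex.
indexed = txns.sort_values("timestamp").set_index("timestamp")
by_user = indexed.groupby("user_id")["amount"]

# Each window covers (t - window, t] for the row at time t, per user.
features = pd.DataFrame({
    "sum_amount_10min": by_user.rolling("10min").sum(),
    "amount_mean_30d": by_user.rolling("30D").mean(),
    "amount_count_30d": by_user.rolling("30D").count(),
    "amount_stddev_30d": by_user.rolling("30D").std(),  # sample standard deviation
})

u1 = features.loc["u1"]  # rows for one user, indexed by timestamp
print(u1)
```

Note that the 10:20 transaction no longer sees the 10:00 and 10:05 amounts in its 10-minute sum, while the 30-day statistics keep accumulating across all four of the user's transactions.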
A Realtime Feature
We’ll also need a way to identify outliers when a new transaction comes in, so we’ll use a realtime feature view to calculate a z-score:
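from tecton import Calculation, RealtimeFeatureView, RequestSource
from tecton.types import Field, Float64

# the transaction amount arrives with the request at inference time
request_ds = RequestSource(
    name="request_ds",
    schema=[Field("amount", Float64)],
)

request_time_features = RealtimeFeatureView(
    name="request_time_features",
    sources=[request_ds, user_txn_recent_activity],
    features=[
        Calculation(
            name="amount_zscore_30d",
            # (amount - 30-day mean) / 30-day standard deviation
            expr="""
                ( request_ds.amount -
                  user_txn_recent_activity.amount_mean_30d
                )
                / user_txn_recent_activity.amount_stddev_30d
            """,
        ),
    ],
)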
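For reference, here is the same z-score arithmetic as a plain Python sketch. The guard for missing or zero standard deviation is an assumption about what a consumer may want for users with little history; this guide does not say how the Tecton expression behaves in those cases, and the function name is hypothetical.

```python
import math

def amount_zscore(amount, mean_30d, stddev_30d):
    """Z-score of a transaction amount against the user's 30-day history.

    Returns None when the history is missing or has no spread.
    """
    if mean_30d is None or stddev_30d is None:
        return None
    if math.isnan(stddev_30d) or stddev_30d == 0:
        return None
    return (amount - mean_30d) / stddev_30d

# A 500.00 transaction for a user who usually spends about 50 +/- 35
z = amount_zscore(500.0, 50.0, 35.0)
print(round(z, 2))
```

A large positive z-score means the transaction is far above the user's usual spending, which is exactly the kind of outlier signal a fraud model can use.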
Finally, we’ll roll both of these feature views into a single feature service that we’ll use to deliver point-in-time correct training data and serve live features in production to a fraud detection model:
from tecton import FeatureService

fraud_detection_feature_service = FeatureService(
    name="fraud_detection_feature_service",
    features=[user_txn_recent_activity, request_time_features],
)
tecton apply
The `tecton apply` command publishes the data sources, feature views and feature services defined in the Python modules into the platform and initiates data processing jobs to populate the offline and online stores:
> tecton apply
Using workspace "fraud_detection" on cluster https://fintech.tecton.ai
✅ Imported 1 Python module from the feature repository
✅ Imported 1 Python module from the feature repository
⚠️ Running Tests: No tests found.
✅ Collecting local feature declarations
✅ Performing server-side feature validation: Initializing.
↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓
+ Create Stream Data Source
name: transactions_stream
+ Create Entity
name: user
+ Create Transformation
name: user_txn_recent_activity
+ Create Stream Feature View
name: user_txn_recent_activity
materialization: 11 backfills, 1 recurring batch job
> backfill: 10 Backfill jobs from 2023-12-02 00:00:00 UTC to 2025-03-25 00:00:00 UTC writing to the Offline Store
1 Backfill job from 2025-03-25 00:00:00 UTC to 2025-04-24 00:00:00 UTC writing to both the Online and Offline Store
> incremental: 1 Recurring Batch job scheduled every 1 day writing to both the Online and Offline Store
+ Create Realtime (On-Demand) Feature View
name: request_time_features
+ Create Feature Service
name: fraud_detection_feature_service
Great! Now we can query our feature service to get training events.
Logging Training Data
Data scientists use the feature service method get_features_for_events to retrieve time-consistent training data from a notebook. Depending on the volume of the data, they can run training data generation using local compute or larger remote engines like Spark or EMR.
Given a dataframe of training events (labeled or not), get_features_for_events enhances each event with time-consistent feature values. Training data needs to be time-consistent to prevent initial feature drift in production. This means retrieving feature values exactly as they would have been calculated at the time of each training event.
training_data = fraud_detection_feature_service.get_features_for_events(training_events)
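The idea behind time-consistent retrieval can be illustrated locally with a pandas as-of join: each event picks up the most recent feature value known at or before its own timestamp, and never a later one. This is only a sketch of the concept on made-up data, not how Tecton implements get_features_for_events.

```python
import pandas as pd

# Made-up labeled events and a made-up history of one feature.
events = pd.DataFrame({
    "user_id": ["u1", "u1"],
    "timestamp": pd.to_datetime(["2025-04-01 12:00", "2025-04-03 12:00"]),
    "is_fraud": [0, 1],
})
feature_history = pd.DataFrame({
    "user_id": ["u1", "u1", "u1"],
    "timestamp": pd.to_datetime(["2025-04-01 00:00", "2025-04-02 00:00", "2025-04-04 00:00"]),
    "amount_mean_30d": [40.0, 45.0, 60.0],
})

# direction="backward" takes the latest feature row at or before each event,
# so the 2025-04-04 value can never leak into an earlier training event.
training = pd.merge_asof(
    events.sort_values("timestamp"),
    feature_history.sort_values("timestamp"),
    on="timestamp",
    by="user_id",
    direction="backward",
)
print(training)
```

The second event (April 3) gets the April 2 value of 45.0 rather than the April 4 value of 60.0. Using the later value would leak future information into training and create a mismatch with what the model sees online.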
Here’s a sample of the resulting training_data:
This training data is then used to train and validate an ML model in your favorite ML development platform, and this is the point where we integrate Arize logging into our process.
Registering and baselining our model in Arize
Next, we’ll want to log the model and training data with Arize to create the baseline for the model training data and its outputs.
Arize code integrates directly into your model training code and runs wherever and whenever you run your training. This might be in a notebook or a scheduled training script that is used when a model is ready to be deployed to production.
First we define the model schema:
from arize.pandas.logger import Schema

# define schema of the training and prediction data
arize_schema = Schema(
    actual_label_column_name="is_fraud",
    prediction_label_column_name="predicted_fraud",
    feature_column_names=input_columns,  # model inputs, derived from the training data
    prediction_id_column_name="transaction_id",
    timestamp_column_name="timestamp",
    tag_column_names=["user_id"],
)
We also need to let Arize know what our baseline data looks like for this model, so it has a reference to compare new feature values for drift.
The trained model is used to calculate predictions for the baseline training data. The model prediction (predicted_fraud) is added to each row, along with a readable version of the true outcome of the event (fraud_outcome):
# model inputs: drop identifiers, the raw amount and the label
input_data = training_data.drop(
    ['transaction_id', 'user_id', 'timestamp', 'amount', 'is_fraud'], axis=1)
input_columns = list(input_data.columns)

# calculate predictions
predictions = model.predict(input_data)

# add prediction to training set
training_data['predicted_fraud'] = predictions.astype(float)

# add outcome column
training_data.loc[training_data["is_fraud"] == 0, 'fraud_outcome'] = 'Not Fraud'
training_data.loc[training_data["is_fraud"] == 1, 'fraud_outcome'] = 'Fraud'

display(training_data.sample(5))
Here’s an excerpt of the baseline training data:
And here’s how we log the model and training data:
response = arize_client.log(
    dataframe=training_data,  # includes event columns, training features and outcomes
    schema=arize_schema,
    model_id='transaction_fraud_detection',
    model_version='v1.0',
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION],
    environment=Environments.TRAINING,
)
The environment parameter defines whether this is training, validation or production data. Validation data along with its predictions is also logged in the same fashion.
Defining a simple model application
Now that we have our feature service set up, we’ll build a simple application to run our model, feeding in live features from Tecton.
This sample application logic runs in a notebook cell to illustrate the steps used to retrieve online features and calculate an inference using a model. In production, getting features from Tecton and evaluating the model is part of the production application.
import pandas as pd

current_transaction = {
    "transaction_id": "57c9e62fb54b692e78377ab54e9d7387",
    "user_id": "user_1939957235",
    "timestamp": "2025-04-08 10:57:34+00:00",
    "amount": 500.00,
}

# feature retrieval for the transaction
feature_data = fraud_detection_feature_service.get_online_features(
    join_keys={"user_id": current_transaction["user_id"]},
    request_data={"amount": current_transaction["amount"]},
)

# feature vector prep: rename "view.feature" to "view__feature"
columns = [f["name"].replace(".", "__")
           for f in feature_data["metadata"]["features"]]
data = [feature_data["result"]["features"]]
# keep only the model inputs, in the column order used at training time
features = pd.DataFrame(data, columns=columns)[input_columns]

# inference
prediction = {"predicted_fraud": model.predict(features).astype(float)[0]}
Hooking up Arize in our model application:
When a model makes predictions using features from Tecton, the prediction result along with the input features and event identifiers are logged to Arize as an event. This enables Arize feature and model drift tracking by comparing values to the training and validation baselines.
So we add this code to each prediction:
# we put together the full event as one dict (dict union with | needs Python 3.9+)
publish_data = current_transaction | features.to_dict('records')[0] | prediction
# we add the yet unknown actuals as None
publish_data = publish_data | {"is_fraud": None, "fraud_outcome": None}
# and log it in Arize as production data (a one-row dataframe)
response = arize_client.log(
    dataframe=pd.DataFrame([publish_data]),
    schema=arize_schema,
    model_id='transaction_fraud_detection',
    model_version='v1.0',
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION],
    environment=Environments.PRODUCTION,
)
Since this is a fraud detection example, the ground truth is whether the transaction is fraudulent or not. In most cases this is unknown until some time later, for example when the cardholder reports a transaction as fraudulent. This is why the Arize schema includes prediction_id_column_name, which indicates which data column uniquely identifies an event. You can log the actual outcome asynchronously, using the prediction ID column to associate a logged prediction with its corresponding ground truth.
Example:
# log ground truth, only include actual label and prediction id columns
# (a separate schema object, so the prediction-time schema is not overwritten)
actuals_schema = Schema(
    actual_label_column_name="is_fraud",
    prediction_id_column_name="transaction_id",
)

event_update = [
    {
        'is_fraud': 1.0,
        'transaction_id': '57c9e62fb54b692e78377ab54e9d7387',
    },
]

response = arize_client.log(
    dataframe=pd.DataFrame(event_update),
    schema=actuals_schema,
    model_id='transaction_fraud_detection',
    model_version='v1.0',
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION],
    environment=Environments.PRODUCTION,
)
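Conceptually, matching delayed ground truth to earlier predictions is a join on the prediction ID. The sketch below shows that idea locally with pandas on made-up data, then computes precision and recall over the events labeled so far. It illustrates the concept only and does not describe how Arize performs the matching internally.

```python
import pandas as pd

# Made-up predictions logged at inference time.
predictions = pd.DataFrame({
    "transaction_id": ["t1", "t2", "t3", "t4"],
    "predicted_fraud": [1.0, 0.0, 1.0, 0.0],
})
# Made-up ground truth that arrived later; t4 has not been labeled yet.
actuals = pd.DataFrame({
    "transaction_id": ["t1", "t2", "t3"],
    "is_fraud": [1.0, 0.0, 0.0],
})

# Left join keeps every prediction; unlabeled events get NaN for is_fraud.
joined = predictions.merge(actuals, on="transaction_id", how="left")
labeled = joined.dropna(subset=["is_fraud"])

tp = int(((labeled["predicted_fraud"] == 1) & (labeled["is_fraud"] == 1)).sum())
fp = int(((labeled["predicted_fraud"] == 1) & (labeled["is_fraud"] == 0)).sum())
fn = int(((labeled["predicted_fraud"] == 0) & (labeled["is_fraud"] == 1)).sum())

# Metrics are only defined once at least one relevant event has been labeled.
precision = tp / (tp + fp) if (tp + fp) else None
recall = tp / (tp + fn) if (tp + fn) else None
print(precision, recall)
```

Performance metrics can only be computed on the labeled subset, and they lag behind production by however long labels take to arrive. That lag is why feature drift is useful as an earlier signal.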
Payoff
We did all this work so that we can leverage Arize to give us feedback on model performance, and so that we can get early alerts on our feature data pipeline. This feedback can then be used to adjust feature engineering in Tecton and/or retrain the model.
Here’s what Arize is now reporting to us where we can clearly see prediction drift:
Understand what features are drifting:
Examine individual feature drift:
Boom! We now have both a great monitor of our model performance – ensuring we don’t get worse at detecting fraud over time – and an early warning system for issues in our upstream data or skew in our features.
Add Monitors and Alerting
By setting up Arize monitors, you can get notified whenever any of the features or model distributions drift beyond a threshold.
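Monitors themselves are configured in Arize, but the underlying check is simple: compare each drift score to a threshold. The sketch below is a local stand-in for that logic. The function name, the scores and the 0.2 threshold are all made up for illustration.

```python
def drifted_features(drift_scores, threshold=0.2):
    """Return (feature, score) pairs above the threshold, highest score first."""
    over = [(name, score) for name, score in drift_scores.items() if score > threshold]
    return sorted(over, key=lambda item: item[1], reverse=True)

# Made-up drift scores per feature, for illustration only.
scores = {
    "sum_amount_10min": 0.05,
    "amount_mean_30d": 0.31,
    "amount_zscore_30d": 0.27,
}

alerts = drifted_features(scores)
for name, score in alerts:
    print(f"ALERT: {name} drift score {score:.2f} exceeds threshold")
```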