Product Update: More Data Flexibility, Control, and Quality

October 6, 2022

If you work with machine learning in any form, you know the goal is to get powerful models into production faster. But for an individual, a team, or an organization to develop and scale machine learning systematically, the people involved need to be able to tame and manage the data and underlying systems that fuel predictive applications and products in production, from input to output.

With each new release, Tecton strives to help its customers do just that: transform raw data into powerful predictive signals and use those signals on demand to power predictive models, cost-efficiently.

Our latest release, Tecton 0.5, has exciting new capabilities designed to give our customers more flexibility and control of their features and underlying systems—all the while accelerating their journey toward real-time machine learning.

Advanced Data Flexibility & Quality Capabilities

A model is only as good as the data that powers it. That’s why Tecton 0.5 improves how you can access and interact with data.

Serverless feature retrieval. No Spark required!

What this means: Tecton’s SDK can now use Amazon Athena, an interactive query service for analyzing data in Amazon S3 with standard SQL, to generate training data sets from materialized features.

Why this matters: This enables fast offline feature retrieval without the need for Spark. It’s particularly useful if you want to generate training data sets using Tecton as part of Airflow, Kubeflow, Dagster, etc.

Unlimited data source flexibility with Spark data source functions

What this means: You can now use functions to define data sources for both batch and streaming Spark features. Whatever you can do in an interactive Spark notebook, you can now do in Tecton. 

Why this matters: By simply writing any PySpark function that returns a DataFrame, you have unlimited flexibility in data source types, authentication mechanisms, schema registry integrations, partition filtering logic, and more.
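
As a rough illustration, a data source function is ordinary PySpark code that takes a Spark session and returns a DataFrame. The sketch below assumes a hypothetical Parquet path, hypothetical column names, and a hypothetical function name. How the function is registered with Tecton is deliberately left out; take that from the Tecton documentation.

```python
# Hypothetical values for illustration only; none of these come from Tecton.
CLICKSTREAM_PATH = "s3://example-bucket/clickstream/"
EARLIEST_PARTITION = "2022-01-01"


def clickstream_source(spark):
    """Return a DataFrame of click events, given a SparkSession.

    Any logic that works in an interactive notebook can live here:
    custom readers, partition filtering, column selection, and so on.
    """
    raw = spark.read.format("parquet").load(CLICKSTREAM_PATH)
    # Filter on event_date (assumed here to be the partition column)
    # using a SQL expression string.
    recent = raw.where(f"event_date >= '{EARLIEST_PARTITION}'")
    return recent.select("user_id", "title_id", "event_timestamp")
```

Because the function only depends on the `spark` argument, the same body can be tried in a notebook before it is wired into a feature pipeline.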

Improve models with batch feature view skew reduction

What this means: In order to select historically accurate feature values, Tecton’s time-travel queries now consider more information, such as scheduling details.

Why this matters: Reducing online/offline skew is critical to achieving good model quality. Tecton now further ensures that offline feature data reflects the values that would have been available in the online store at a given time.
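
To make the idea concrete, here is a minimal sketch of a point-in-time lookup that accounts for scheduling. It is not Tecton's implementation. It assumes a hypothetical one-hour gap between the end of a batch window and the moment that window's value becomes available online, and all names are illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical schedule: a window's value is served online about one hour
# after the window closes.
JOB_DELAY = timedelta(hours=1)


def value_as_of(feature_rows, event_time, job_delay=JOB_DELAY):
    """Return the feature value the online store would have served at event_time.

    feature_rows is a list of (window_end, value) tuples. A row only counts
    once window_end + job_delay has passed, which models the scheduling gap.
    """
    visible = [
        (window_end, value)
        for window_end, value in feature_rows
        if window_end + job_delay <= event_time
    ]
    if not visible:
        return None
    # The most recent visible window wins.
    return max(visible, key=lambda row: row[0])[1]


rows = [
    (datetime(2022, 10, 1), 3),  # window ending Oct 1 00:00
    (datetime(2022, 10, 2), 7),  # window ending Oct 2 00:00
]
# 00:30 on Oct 2: the Oct 2 job has not finished, so the older value applies.
early = value_as_of(rows, datetime(2022, 10, 2, 0, 30))
# 01:30 on Oct 2: the Oct 2 value is now visible.
late = value_as_of(rows, datetime(2022, 10, 2, 1, 30))
```

A lookup that ignored the delay would return the Oct 2 value for the 00:30 event, a value the online store could not yet have served. That mismatch is the skew this kind of logic avoids.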

Additional Transformation Functionalities

From a simple feature definition, Tecton compiles and orchestrates production-ready pipelines that transform batch, real-time, and streaming data into predictive features. Tecton 0.5 lets you further fine-tune how and when to materialize these features for consumption.

Program upstream job triggers with the Feature Materialization API

What this means: You can now trigger feature materialization jobs programmatically. In other words, Tecton now makes it easy to use upstream data pipelines that run outside of Tecton to kick off feature processing as soon as new raw data is ready. The API can also be used to monitor feature materialization job completion statuses in order to kick off training or inference when new feature data is ready. The Tecton Airflow provider makes leveraging this API in Airflow DAGs quick and easy!

Why this matters: You can manage your entire ML pipeline, from feature materialization through model training to serving predictions, in the pipeline orchestration tool of your choice (Airflow, Kubeflow, Dagster, Prefect, etc.).
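
The trigger-then-wait pattern described above can be sketched as follows. The client interface, method names, and state strings are all hypothetical stand-ins, not the actual Tecton API or Airflow provider; in an orchestrator, the polling loop would typically be replaced by that tool's own operators or sensors.

```python
import time
from typing import Callable, Protocol


class MaterializationClient(Protocol):
    """Hypothetical interface; not the actual Tecton SDK or API."""

    def trigger_job(self, feature_view: str) -> str: ...
    def job_state(self, job_id: str) -> str: ...


def materialize_then(
    client: MaterializationClient,
    feature_view: str,
    on_success: Callable[[], None],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> str:
    """Trigger a job, poll until it finishes, and run on_success if it succeeded."""
    job_id = client.trigger_job(feature_view)
    for _ in range(max_polls):
        state = client.job_state(job_id)
        if state == "SUCCESS":
            on_success()  # e.g. start training or batch inference
            return state
        if state == "ERROR":
            return state
        time.sleep(poll_seconds)
    return "TIMEOUT"
```

An upstream pipeline would call `materialize_then` once new raw data has landed, passing a callback that starts training or inference.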

Trigger event-driven applications with new feature updates using Feature View Output Streams

What this means: You can now enable event-driven applications that react to new feature updates in Tecton. Tecton 0.5 supports both Kafka and Kinesis.

Why this matters: Once you configure the output stream for a feature view, Tecton will write records to that stream for every new value processed. For example, if you’re building a movie recommendation system, you may want to refresh “watch next” recommendations in the background after a user clicks on a new title.
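
For the movie recommendation example, the consuming side might look roughly like the sketch below. The record layout, the feature view name, and the refresh function are assumptions made for illustration; the real output stream schema may differ, and in production the records would come from a Kafka or Kinesis consumer rather than a list.

```python
import json

# Hypothetical record layout; the real output stream schema may differ.
SAMPLE_RECORDS = [
    '{"feature_view": "user_recent_titles", "user_id": "u1", "last_title": "t42"}',
    '{"feature_view": "user_recent_titles", "user_id": "u2", "last_title": "t7"}',
    '{"feature_view": "other_view", "user_id": "u3", "last_title": "t9"}',
]


def refresh_watch_next(user_id, last_title, cache):
    """Stand-in for a real recommender call; stores a placeholder result."""
    cache[user_id] = f"titles similar to {last_title}"


def handle_records(raw_records, cache, feature_view="user_recent_titles"):
    """React to each update for one feature view; return how many were handled."""
    handled = 0
    for raw in raw_records:
        record = json.loads(raw)
        if record.get("feature_view") != feature_view:
            continue  # ignore updates from other feature views
        refresh_watch_next(record["user_id"], record["last_title"], cache)
        handled += 1
    return handled


recommendations = {}
handled_count = handle_records(SAMPLE_RECORDS, recommendations)
```

The handler runs in the background, so the user's "watch next" list is already refreshed by the time they return to the home screen.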

Optimized Cost Capabilities

Real-time predictions drive real revenue, but they can also incur costs. That’s why Tecton 0.5 gives you more options to better optimize and dynamically scale resource utilization based on pre-defined requirements.

Optimize costs with “Suppress Object Recreation” 

What this means: By default, Tecton automatically re-materializes feature data when changes are made to a feature’s transformation logic. This keeps historical feature data accurate. With the “Suppress Object Recreation” function, Tecton admins can now choose to suppress the recreation of objects and avoid unnecessary materialization costs.

Why this matters: Tecton’s Command Line Interface (CLI) now offers greater control over evolving feature pipelines and their underlying costs. With Tecton 0.5, admins can choose to avoid rematerialization costs if the changes do not affect feature semantics (e.g., commenting code, extending a data source schema, changing to a mirror data source).
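
To see why a comment-only edit does not change feature semantics, consider the small check below. It is an illustration only and not how Tecton detects changes: it compares the parsed syntax trees of two versions of a hypothetical transformation, and Python's parser discards comments before the comparison happens.

```python
import ast


def same_logic(old_source: str, new_source: str) -> bool:
    """True if two Python sources parse to the same syntax tree.

    Comments and whitespace are dropped by the parser, so comment-only
    edits compare equal. Docstrings are part of the tree and do count.
    """
    return ast.dump(ast.parse(old_source)) == ast.dump(ast.parse(new_source))


before = "def total(amounts):\n    return sum(amounts)\n"
commented = (
    "def total(amounts):\n"
    "    # sum of all purchase amounts\n"
    "    return sum(amounts)\n"
)
changed = "def total(amounts):\n    return sum(amounts) / len(amounts)\n"

comment_only = same_logic(before, commented)     # comment added, logic unchanged
logic_changed = not same_logic(before, changed)  # the computed value differs
```

In the first case, rematerializing would reproduce identical feature values, which is the kind of cost an admin may reasonably choose to skip. The second case changes the values and should not be suppressed.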

Going Further…

But that’s not all: Tecton 0.5 also optimizes feature retrieval on Spark with a more stable and performant implementation of the point-in-time join, supports structs as feature types, and makes it easy to programmatically access metadata via the Python SDK.

Interested in learning more about Tecton 0.5? Dig into our documentation and check out our What’s New page for the latest updates. Or if you’d like to try Tecton for yourself, sign up for a free trial.
