We’re excited to announce that version 0.7 of the SDK is now available! This release gives customers new capabilities to define more powerful features, ingest feature data into Tecton, and much more.
New Aggregations & Complex Feature Types:
Tecton added Percentile and Count Distinct to the robust set of built-in aggregations supported by Tecton’s feature engineering framework. As with Tecton’s other built-in aggregations, these are performant, simple to write, available for both batch and streaming features, and guaranteed to be consistent across online and offline environments. For more details, see the documentation.
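To make concrete what these two aggregations compute over a time window, here is a minimal plain-Python sketch. It is not Tecton's API or implementation: the helper names (`events_in_window`, `count_distinct`, `percentile`) are hypothetical, the calculation is exact, and Tecton's built-in versions may use different (for example approximate) algorithms.

```python
from datetime import datetime, timedelta
import math


def events_in_window(events, end, window):
    """Return the values of (timestamp, value) events with end - window <= timestamp < end."""
    start = end - window
    return [value for ts, value in events if start <= ts < end]


def count_distinct(values):
    """Exact number of distinct values."""
    return len(set(values))


def percentile(values, p):
    """Nearest-rank percentile for p in (0, 100]; returns None for an empty window."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Illustrative data: transaction amounts for one user (made up for this sketch).
now = datetime(2023, 6, 1, 12, 0)
events = [
    (now - timedelta(hours=30), 99.0),  # falls outside the 24-hour window
    (now - timedelta(hours=20), 10.0),
    (now - timedelta(hours=10), 20.0),
    (now - timedelta(hours=5), 20.0),
    (now - timedelta(hours=2), 30.0),
    (now - timedelta(hours=1), 40.0),
]

recent = events_in_window(events, end=now, window=timedelta(hours=24))
distinct_amounts_24h = count_distinct(recent)
median_amount_24h = percentile(recent, 50)
```

In a feature platform the same window logic has to produce identical results for batch, streaming, online, and offline paths, which is the consistency guarantee the built-in aggregations are meant to provide.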
This release also gives customers additional flexibility when defining features with new support for Map and Struct-type feature values, along with support for multi-dimensional Arrays. First-class support for these types will give users more ergonomic and performant feature definitions when working with complex data. For more details, see the documentation.
Enhanced Python Environments for On-Demand Feature Views (Public Preview):
With 0.7, developers can now choose different Python Environments for running their On-Demand transformations. These Python Environments enable developers to leverage common Data Science packages in their On-Demand transformation logic. For example, they can use the fuzzywuzzy package to calculate the similarity between a user’s search terms and a product’s name.
To get started with Python Environments, upgrade to Tecton 0.7 and see the documentation.
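As a rough illustration of the search-term similarity example, the sketch below scores two strings on a 0 to 100 scale. It deliberately uses the standard library's `difflib.SequenceMatcher` as a stand-in so that it runs without extra packages; in an environment that includes fuzzywuzzy, a call such as `fuzz.ratio` would typically take its place. The function name `name_similarity` is hypothetical, and how the logic is registered as an On-Demand transformation is not shown here.

```python
from difflib import SequenceMatcher


def name_similarity(search_terms: str, product_name: str) -> int:
    """Return a 0-100 similarity score between two strings (case-insensitive).

    Stand-in for a fuzzy-matching library call; uses difflib from the
    standard library rather than fuzzywuzzy.
    """
    a = search_terms.strip().lower()
    b = product_name.strip().lower()
    return round(100 * SequenceMatcher(None, a, b).ratio())


# Illustrative inputs (made up for this sketch).
close_match = name_similarity("red running shoes", "Red Running Shoe")
poor_match = name_similarity("red running shoes", "blue kettle")
```

A score like this could then be returned as a feature value computed at request time from the user's query and the candidate product.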
Stream Ingest API (Public Preview):
- Integrate Tecton with any existing streaming feature pipeline without migrating feature code: The Stream Ingest API gives teams the data management, serving, governance, and monitoring benefits of Tecton’s Feature Platform on top of their existing feature pipelines, without rewriting any feature code. Teams do not need to migrate pipelines that already work in order to adopt an enterprise feature platform, which makes it faster and easier for ML and DS teams to manage their features centrally for trusted, reliable training and serving.
- Easily build powerful streaming features on event data using Python and performant aggregations: Tecton’s Serverless Python and Aggregations Engines enable Data Scientists and ML Engineers to author and manage transformations in familiar Python, without the complicated code and heavy stream processing infrastructure required by other solutions.
- Bring read-after-write consistency to your feature infrastructure: The Stream Ingest API can block until input data has been fully processed and the corresponding features are updated. This makes it easy for your application to push event data to the feature platform and quickly retrieve up-to-date feature vectors, which is especially useful for event-driven decisioning applications such as loan approvals and fraud monitoring.
To get started, see the documentation. Current Tecton customers get their first 10M writes to the Stream Ingest API for free!
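The read-after-write behavior described in the list above can be pictured with a toy in-memory model. This is only a sketch of the concept: `ToyFeatureStore`, its methods, and the `wait` flag are hypothetical and do not correspond to the Stream Ingest API's actual endpoints or parameters.

```python
import threading
import time


class ToyFeatureStore:
    """In-memory stand-in for a feature platform with asynchronous processing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._txn_count = {}

    def _process(self, user_id, done):
        time.sleep(0.05)  # simulate processing delay before features update
        with self._lock:
            self._txn_count[user_id] = self._txn_count.get(user_id, 0) + 1
        done.set()  # signal that the feature value now reflects this event

    def ingest(self, user_id, wait=True):
        """Push one event; if wait is True, block until the feature is updated."""
        done = threading.Event()
        threading.Thread(target=self._process, args=(user_id, done)).start()
        if wait:
            done.wait()
        return done

    def get_features(self, user_id):
        with self._lock:
            return {"txn_count": self._txn_count.get(user_id, 0)}


store = ToyFeatureStore()
store.ingest("user_1")  # blocks until the event has been processed
features_after_write = store.get_features("user_1")  # reflects the write above
```

With a blocking write, the application can read features immediately afterwards and be sure they include the event it just sent; with `wait=False` in this toy model, a read issued right away might still return the old value.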
Support for Databricks Unity Catalog:
Tecton now supports data sources managed by Unity Catalog, Databricks’ new unified data governance solution, which provides a centralized interface for data assets, fine-grained access control, data lineage, improved data sharing, and other new capabilities. In 0.7, Tecton customers can use the new UnityConfig option to connect to Unity Catalog data sources. See the documentation to learn more.
Feature Views configured for manual materialization (instead of scheduled materialization) can now be automatically backfilled up to a specified timestamp using the new manual_trigger_backfill_end_time parameter. See the documentation for more details.
Tecton’s CLI now supports command autocompletion. Run tecton completion -h to get started, and see the documentation for more details.
Customers can now invite users and manage roles in bulk using the Tecton CLI. Run tecton user invite -h or tecton access-control assign-role -h for instructions.