Introducing Array Type Features

By November 8, 2021

We’re excited to announce that Tecton now natively supports Array type features. Our customers are now deploying array features in operational machine learning models. In this article, we’ll go through (1) how arrays are commonly used in operational machine learning systems and (2) an example of how a user can compute a similarity score between a product and a query using embeddings in real time with Tecton.

Array Features in Operational ML

Arrays are a feature data type that can be used across a number of applications. Consider a retailer that serves product recommendations to users based on their current search query and purchase history. Our retailer might build the following kinds of features:

  1. Lists of categorical variables.
    • product_categories: a list of categories a product belongs to, e.g. [shoes, women, outdoors] for a pair of women’s hiking boots.
    • user_last_10_purchased_products: a list of the last ten product ids purchased by a user. Using our streaming capabilities, Tecton can keep this feature extremely fresh.
  2. Dense embeddings.
    • product_embedding: a precomputed embedding based on each product’s description and metadata.
    • search_text_embedding: a query-time embedding computed from the user’s search text, e.g. "5-piece knife set". This embedding can be provided to the Tecton API to be combined with precomputed features.
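
To make the first kind of feature concrete, here is a minimal Python sketch of the value behind user_last_10_purchased_products: a bounded list that keeps only the ten most recent product ids. It only illustrates the shape of the feature value and is not how Tecton computes it; the helper name `record_purchase` and the sample product ids are hypothetical.

```python
from collections import deque

# Hypothetical helper: illustrates the feature's value, not Tecton's pipeline.
def record_purchase(history: deque, product_id: str) -> list:
    """Append a purchase and return the current last-N list, oldest first."""
    history.append(product_id)  # a deque with maxlen drops its oldest entry when full
    return list(history)

history = deque(maxlen=10)
for i in range(12):  # twelve purchases: p0 ... p11
    feature_value = record_purchase(history, f"p{i}")

print(feature_value)  # the ten most recent ids, p2 through p11
```

A bounded structure like this keeps the feature at a fixed maximum length no matter how many purchases a user makes, which is what lets it be stored as a single array value.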

Because embeddings have become such an important part of operational ML systems, we dive deeper into how to use them in Tecton in the following section (see this article for more background on embeddings).


Embeddings are a way to transform text, images, or even arbitrary entities, such as a product id, into a lower-dimensional vector representation that captures most of the meaning in the original data.

By natively supporting arrays (including 32-bit float arrays), our customers can now easily bring powerful embedding features into production with a compact online storage format. This matters to our users because it can significantly reduce the infrastructure cost of online storage and serving.
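
As a rough illustration of why 32-bit floats matter for storage, the sketch below compares the raw in-memory size of one embedding stored as 64-bit and as 32-bit floats with NumPy. The 128-dimensional size and the random values are assumptions chosen for the example, and the byte counts describe NumPy buffers, not Tecton's actual online storage encoding.

```python
import numpy as np

# Hypothetical 128-dimensional embedding; the dimension is chosen for illustration.
rng = np.random.default_rng(seed=0)
embedding_f64 = rng.standard_normal(128)          # NumPy defaults to 64-bit floats
embedding_f32 = embedding_f64.astype(np.float32)  # same vector stored as 32-bit floats

bytes_f64 = embedding_f64.nbytes  # 128 values * 8 bytes each
bytes_f32 = embedding_f32.nbytes  # 128 values * 4 bytes each

# Largest absolute difference introduced by the narrower type.
max_abs_error = float(np.max(np.abs(embedding_f64 - embedding_f32)))

print(bytes_f64, bytes_f32, max_abs_error)
```

Halving the bytes per value halves the raw payload for every stored embedding, while the rounding error stays small relative to the values themselves.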

A very common use for embeddings is found in language inputs, where outputs from pre-trained embedding models like Word2vec and GloVe can be used directly as features into models. Another use case we commonly see is employing embeddings to calculate a similarity score between two items and using that score as a feature.

Let’s go back to our example to show how you can compute a similarity score in real time using Tecton. Our customer, the retailer, wants to compare a user’s search to the descriptions of products in the catalogue. Precomputing a similarity score for every possible pairing of search query and product description is not feasible, because the set of possible queries is effectively unbounded. Instead, the similarity score must be computed on the fly between the query embedding and the precomputed product embedding. Tecton allows you to do this with sub-100ms latency. It’s also easy to code:


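# Assumption: the decorator line, `inputs={` and `mode=` were lost from the original listing and are reconstructed here.
@on_demand_feature_view(
    inputs={
        'product_embedding': Input(product_embedding),
        'search_text_embedding': Input(search_text_embedding)
    },
    mode='pandas',
    output_schema=StructType([StructField('cosine_similarity', DoubleType())]),
    description="Computes the cosine similarity between a search text embedding and a precomputed product embedding."
)
def search_product_similarity(product_embedding: pandas.DataFrame, search_text_embedding: pandas.DataFrame):
    import numpy as np
    import pandas as pd
    from numpy.linalg import norm

    def cosine_similarity(a: np.ndarray, b: np.ndarray):
        return np.dot(a, b) / (norm(a) * norm(b))

    # Score each row's search embedding against the product embedding in the same row.
    df = pd.DataFrame()
    df["cosine_similarity"] = [cosine_similarity(q, p) for q, p in zip(search_text_embedding["embedding"], product_embedding["embedding"])]
    return df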

The feature author only needs to declare the inputs and a simple pandas definition with the similarity score. Tecton then orchestrates the pipelines to compute and serve the feature on-demand. Tecton is uniquely built to simplify real time machine learning applications.
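
To sanity-check the pandas logic outside of Tecton, the following self-contained sketch applies the same row-wise cosine similarity to two small DataFrames. It uses only NumPy and pandas; the 3-dimensional embeddings are made-up values, and the function here is a plain Python function, not a registered Tecton feature view.

```python
import numpy as np
import pandas as pd

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Dot product of a and b divided by the product of their L2 norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search_product_similarity(product_embedding: pd.DataFrame,
                              search_text_embedding: pd.DataFrame) -> pd.DataFrame:
    """Score each row's search embedding against the product embedding in the same row."""
    scores = [
        cosine_similarity(np.asarray(q), np.asarray(p))
        for q, p in zip(search_text_embedding["embedding"], product_embedding["embedding"])
    ]
    return pd.DataFrame({"cosine_similarity": scores})

# Made-up embeddings, one row per (product, query) pair.
products = pd.DataFrame({"embedding": [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]})
queries = pd.DataFrame({"embedding": [[3.0, 0.0, 0.0], [1.0, 0.0, 0.0]]})

result = search_product_similarity(products, queries)
print(result)
```

The first pair points in the same direction and the second pair is orthogonal, so the two scores sit at opposite ends of the range for non-negative vectors. Note that a zero-length embedding would cause a division by zero; production code would need to decide how to handle that case.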


With the release of native support for array features, our customers can now deploy powerful features into production more cheaply and quickly. At Tecton, we continue to add capabilities that allow our customers to easily put complex features into production. If you are an organization building operational ML models and want to learn more, you can request a free trial here.
