Data engineering Archives | Page 3 of 4

Streaming Architecture with Kafka, Materialize, dbt, and Tecton

Posted on June 1, 2022 | Featured

Drizly is building out our Data Science stack and streaming infrastructure to match the success we’ve had with the modern data stack on BI. We are currently standing up our architecture using Kafka, Materialize, and dbt. We are planning on adding …

ML Design Patterns for Data Engineers

Posted on June 1, 2022 | Featured

As machine learning moves from being a research discipline to a software one, it is useful to catalog tried-and-proven methods to help engineers tackle frequently occurring problems that crop up during the ML process. In this talk, I will cover three …

Real-time Personalization of QuickBooks using Clickstream Data

Posted on June 1, 2022 | Featured

In this session, we will talk about Intuit’s real-time personalization ML pipeline. We will use a self-help use case to show how Intuit provides proactive self-help to millions of users by personalizing content based on user behavior to increase …

Building a Best-in-Class Customer Experience Platform – The Hux Journey – Deloitte Digital

Posted on June 1, 2022 | Featured

New technologies have been advancing rapidly across the areas of frictionless data ingestion, customer data management, identity resolution, feature stores, MLOps and customer interaction orchestration. Over the same period many large enterprises …

Exploiting the Data Code Duality: Applying Modern Software Development Practices to Data with Dali

Posted on June 1, 2022 | Featured

Most large software projects in existence today are the result of the collaborative efforts of hundreds or even thousands of developers. These projects consist of millions of lines of code and leverage a plethora of reusable libraries and services …

Towards a Unified Real-Time ML Data Pipeline, from Training to Serving

Posted on June 1, 2022 | Featured

On a global marketplace like Etsy, where buyers come to buy unique, varied items from sellers around the globe, the inventory of items is constantly changing. User preferences also change in real time as they discover the latest selection being …

The Only Truly Hard Problem in MLOps

Posted on June 1, 2022 | Featured

MLOps solutions are often presented as addressing particularly challenging problems. This is mostly untrue. The majority of the problems solved by MLOps solutions have their origins in pre-ML data processing systems and are well addressed by the …

Reusability in Machine Learning

Posted on June 1, 2022 | Featured

In this session we will explore modern techniques and tooling which empower reusability in data and analytics solutions. Creating and leveraging reusable machine-learning code has many similarities with traditional software engineering but is also …

Best Practices for Productionalizing Data & ML Projects

Posted on June 1, 2022 | Featured

This talk will briefly explore the development lifecycle for data engineering & ML projects before delving into some of the friction points most common when productionalizing those projects. We’ll provide an overview of how large companies like …

Redis as an Online Feature Store

Posted on June 1, 2022 | Featured

Feature stores are becoming an important component in any ML/AI architecture today. What is a feature store? In a nutshell, the feature store allows you to build and manage the features for your training phase (offline feature store) and inference …

A Point in Time: Mutable Data in Online Inference

Posted on June 1, 2022 | Featured

Most business applications mutate relational data. Online inference is often done on this mutable data, so training data should reflect the state at the prediction’s “point in time” for each object. There are a number of data architecture / domain …

Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning Without a Data Lake

Posted on June 1, 2022 | Featured

Machine Learning (ML) is separated into model training and model inference. ML frameworks typically use a data lake like HDFS or S3 to process historical data and train analytic models. But it’s possible to completely avoid such a data store, using …
