Engineering for Applied ML

apply(conf) - May '22 - 10 minutes

Applied ML consists of ML algorithms at its core and engineering systems around it. For over a decade as an applied ML practitioner, I have built a number of such engineering systems to help unlock the full potential of ML in a variety of problem domains. This talk is about my learnings in building those systems and patterns that I’ve found to work well across applications.

Hello, everyone. Pleasure to be here to discuss engineering for applied machine learning. Before you all start to look for the presentation screen, I actually don’t have slides. We’ll just have a chat. And I’ll also start by saying, it’s my first time joining this conference. It’s super cool to hear all the great work that was done by other presenters. You know, it’s humbling to hear everyone’s ideas. A lot to discuss. So my name is Yuchen. I work for Instacart and support our fulfillment marketplace organization. My teams are responsible for developing technologies to power a balanced, sustainable, and visible marketplace. And what that means is we want to efficiently meet consumer demand, and make sure that your groceries get delivered quickly and efficiently. We want to empower our shoppers and make sure they get meaningful work and earnings opportunities on the Instacart platform.

So machine learning, along with technology innovations across system engineering and operations research, is the mainstay of the work we do. We use machine learning to drive a wide range of marketplace opportunities. For example, predictive modeling is used extensively in our systems. It is super important that we know, for example, how long it’s going to take to fulfill an order by modeling how fast someone can drive to the store, pick up all the items in the order from the store, and get those delivered. That way, we can promise the fastest and most reliable ETAs to our consumers when they check out, and also dispatch orders to our shoppers at the right moment to make sure that they will arrive on time. Forecasting is another critical component of our system to ensure our users can order whenever they want to at the speed they desire. It is important to estimate how many orders we will receive, at what time, and also at which store, so we can plan ahead with shoppers on our platform.

So Instacart is a continuation of my long-time passion for applied machine learning, which actually started in a very, very different domain. It was in 2009, when I was a PhD student collaborating with Google Research in the area of speech recognition. As I mentioned, I don’t have slides, so I’ll just be chatting with my screen. So unlike today’s, speech recognition products in 2009 were a little frustrating to use because recognition accuracy was far from perfect at the time. So there were lots of efforts put into improving speech recognition at the time. One of those efforts was to evaluate the use of deep neural networks. I did not participate in that effort directly, but was very fortunate to witness the tremendous excitement when deep neural networks showed a huge improvement in accuracy compared to the state of the art at the time. It was acclaimed by the team as 10 years of speech recognition R&D launched at once.

It was fascinating to see how machine learning algorithms, data, and engineering really came together at that time to advance the boundaries of technology and enable transformative product improvements. Clearly, this improvement in speech recognition would not have been possible without the core innovations in different algorithms. The amount of speech data was also critical. As we all know, machine learning isn’t very helpful without data. But the part I was most fascinated by was the innovations in engineering systems that made it possible to train such large-scale neural networks. And in fact, an entire area of engineering tools and systems has been invented since then to make [inaudible] development across many applications like those [inaudible] today.

So personally, I have worked on ads, search, recommendation, and our marketplace at Instacart. In every single one of these applications, I’ve seen the success formula of algorithm, data, and engineering drive business impact. The algorithm and data often vary across these applications, but I’ve found that the engineering challenges are often very, very similar across these domains.

So today, I’d love to just share one aspect of engineering problems for applied machine learning, which I call idea-to-result. So what is idea-to-result? It is the time it takes between when you wake up in the morning with an idea to improve your machine learning model and the time when you know whether that idea is good. So clearly, we want idea-to-result to be as fast as possible. So what gets in the way of us getting results quickly? For example, say you work on recommendations, and wake up with an idea of a new user feature that you think will improve your recommendation results. However, that feature may not be available in your feature store. And perhaps, it’s not even logged in the production system. You end up spending time writing feature extraction pipelines, or even worse, sometimes having to wait for the production logs to gradually accumulate through logging, which could take months before you can even start to train your model and test the idea in experiments.

Or, imagine you read a paper about a cool new ML technique that claims to be effective in your type of ML application. However, the ability to train and evaluate this kind of ML model may not be supported by the current ML workflow you have. So you may have to file a request to have that implemented, and the ML infrastructure team may have to prioritize that in some way, which is for [inaudible]. Sometimes, the bottleneck is neither in [inaudible] engineering nor in the ML workflow, but in the fact that you need to run live experiments to gauge the success of your idea, which requires you to go through a full productionization cycle of a test model and wait for experiment results to come in. And this would take at least a couple of weeks before you can get the results. So this list can go on and on because every application and system is different. But in my experience, they all tend to be things we can improve by analyzing the engineering systems that sit between your idea and getting your results, and making them work faster.

So, why does idea-to-result matter? Well, all engineering problems become easier to solve if we can move faster. For most applied machine learning problems, we take an iterative approach by constantly testing new ideas. So the speed at which we can iterate on our ideas really governs how fast we can learn and how much impact the team can generate. Faster idea-to-result also means we can catch bugs earlier. Have you had the experience of spending weeks training an ML model, and then more weeks getting the experiment data, only to realize that there’s a bug in your feature extraction code that made all the results invalid? While we don’t have many MLOps tools to catch these issues automatically and minimize them, it’s just so much better if we can catch them sooner.

Also, depending on your ML applications and your business environment, sometimes there are cases where you have to make model changes very quickly. For example, there may be a need to assess a new data provider, where you need to make a business decision in a couple of days or weeks. And being able to get from idea to result within a reasonable timeframe really becomes critical in those scenarios. If you lead an applied machine learning team like I do, idea-to-result is also an important gauge of the productivity of the team, and productive teams are happy teams. And it is fun for machine learning engineers and data scientists because we get to test ideas faster, and then get feedback quicker. And it is also fun for backend and data engineers on the team because through the lens of idea-to-result, their work is truly the enabler and force multiplier for all the impact that the team is having through these ML ideas.

So we talked about what idea-to-result is, and why it matters. How do we make it useful? Well, the first step is to start measuring it because we can only improve what we measure. Every ML system probably has a different bottleneck, so go identify those. And very likely, you will be pleasantly surprised by how much faster you can iterate through ideas by investing in just a few new tools and infrastructure. If you are a machine learning engineer or data scientist, and you find yourself waiting weeks or even months to test an idea, don’t assume it has to be like that. Surface what takes time between your idea and results. Discuss with your fellow engineers to come up with solutions. For example, oftentimes, a live experiment takes too long to run. And yeah, this may be the time to think about developing offline simulations or offline metrics to estimate the production impact of your model without having to run a live experiment for every new idea you have.
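[Editor's illustration, not part of the talk.] As a minimal sketch of what "start measuring it" could look like, the Python below adds up how many days each idea spent in each stage and reports which stage is slowest on average. The stage names, idea names, dates, and function names are all hypothetical, made up for illustration; a real team would pull these timestamps from its own ticketing, training, and experimentation systems.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log: (idea, stage, start date, end date). All values are invented.
STAGE_LOG = [
    ("new-user-feature", "feature_pipeline", "2022-03-01", "2022-03-15"),
    ("new-user-feature", "model_training", "2022-03-15", "2022-03-18"),
    ("new-user-feature", "live_experiment", "2022-03-18", "2022-04-01"),
    ("new-loss-function", "feature_pipeline", "2022-03-02", "2022-03-03"),
    ("new-loss-function", "model_training", "2022-03-03", "2022-03-07"),
    ("new-loss-function", "live_experiment", "2022-03-07", "2022-03-21"),
]


def days_between(start, end):
    """Whole days between two ISO-format dates."""
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).days


def idea_to_result_days(log):
    """Total days from idea to result, per idea (sum over its stages)."""
    totals = defaultdict(int)
    for idea, _stage, start, end in log:
        totals[idea] += days_between(start, end)
    return dict(totals)


def average_days_per_stage(log):
    """Average days spent in each stage, across all ideas."""
    durations = defaultdict(list)
    for _idea, stage, start, end in log:
        durations[stage].append(days_between(start, end))
    return {stage: sum(d) / len(d) for stage, d in durations.items()}


def bottleneck(log):
    """The stage with the largest average duration."""
    per_stage = average_days_per_stage(log)
    return max(per_stage, key=per_stage.get)


if __name__ == "__main__":
    print(idea_to_result_days(STAGE_LOG))
    print(average_days_per_stage(STAGE_LOG))
    print("Slowest stage on average:", bottleneck(STAGE_LOG))
```

With this made-up data, the live experiment is the slowest stage on average, which is the kind of signal that might prompt a team to look at offline simulations or offline metrics.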

If you are a backend or data engineer, your work is probably already making idea-to-result faster. This is truly impactful and high-leverage work, and I’m sure there are a lot of opportunities that you can drive even [inaudible] through this lens.

With that, I think I’m getting pretty close to my 10-minute mark. Thank you all very much for attending this. I hope this is useful, and as you can tell, I’m very passionate about this area at the intersection of engineering and machine learning. Please don’t hesitate to reach out. I look forward to chatting more with you.

Yuchen Wu

VP Engineering

Instacart

Yuchen is VP of Engineering at Instacart for Marketplace. He leads routing, matching, pricing, incentives, earnings and broader technology innovations that power the growth of Instacart Marketplace. Before Instacart, Yuchen was Engineering Director at Google, where he led ML/AI/Product/Infra teams responsible for ads systems.
