Tecton

Help us build the future of enterprise ML

Senior DevOps Engineer (SF, NYC or Remote)

San Francisco, CA, Software Engineering, Full-time

About Tecton

At Tecton, we are on a mission to bring world-class Machine Learning to every product and customer experience. Tecton’s founders developed the first Feature Store when they created Uber’s Michelangelo ML platform. In pursuit of bringing ML to every production application, we have since brought the leading commercial feature store to market and built the most popular open-source feature store. 

We are funded by Sequoia Capital and Andreessen Horowitz and have a fast-growing team that works out of SF, NYC, and remotely. Our team has years of experience building and operating business-critical machine learning systems at leading tech companies like Uber, Google, Facebook, Airbnb, Twitter, and Quora, and we’re now bringing those same capabilities to every organization in the world.

The Platform Engineering team is focused on building the foundation for a reliable, high-performance, and scalable feature platform. We are a team of experienced software engineers with deep infrastructure knowledge and a knack for building quality systems that deliver on our customers' performance goals. We have a deep understanding of many different storage and compute technologies and are always searching for opportunities to drive higher performance, lower latency, and lower cost. Delivering a production platform to power each of our customers' machine learning environments is a deeply complex challenge that excites each and every one of us.

As a critical member of Tecton's fast-growing engineering team, you will help lay the foundation for scaling Tecton to the next generation of customers.

As an early member of Tecton’s Platform Engineering DevOps team, you will help lay the foundation for building, automating, and scaling Tecton. You will leverage your deep experience with cloud architectures, distributed systems, containerization technologies (Kubernetes), and Linux system internals to design, build, and maintain our multi-cloud deployments and to ensure our systems are secured in depth. You will also work closely with the rest of Tecton’s Platform Engineering team to scale and optimize our core online serving systems.

Prior experience with machine learning is not required. We are looking for exceptional DevOps, infrastructure, and software engineers who are driven to find simple solutions to complex challenges. You'll be at the intersection of design, engineering, and operational processes.

Responsibilities:

  • Own the complete lifecycle of Tecton’s cloud infrastructure development from design through automation, deployment, and operation
  • Engage with other engineering and solutions teams to build tools that improve engineering and deployment efficiency
  • Develop and maintain infrastructure and tooling for monitoring and observability of Tecton’s health, availability, and latency
  • Jointly own building and managing Tecton’s CI/CD system, which reliably deploys production components with a GitOps model, including the multi-language, multi-platform build system based on Bazel
  • Participate in an on-call rotation, triaging and resolving major Tecton platform incidents

Qualifications:

  • Experienced engineer with 5+ years of experience in DevOps, SRE, or Software Engineering
  • Passion for excellence and high developer productivity
  • Experience with infrastructure-as-code tools such as Terraform
  • Fluent in one or more programming languages such as Python or Golang
  • Expertise in cloud providers such as AWS, Google Cloud, and/or Microsoft Azure
  • Experience building and troubleshooting robust and secure networks
  • Experience with microservices, containers, and container orchestration (such as Docker and Kubernetes)
  • Expertise in monitoring and alerting (Prometheus, ELK, Chronosphere, Datadog, etc.)
  • Strong and effective verbal and written communication skills
  • In-depth experience with Linux systems administration and troubleshooting

Nice to have:

  • Experience building reliable CI/CD pipelines (GitHub, CircleCI, Buildkite, etc.)
  • Experience with Kubernetes configuration management tools (Helm, Kustomize, etc.)
  • Experience with on-call rotation and support of production environments
  • Experience working with complex build systems (Bazel in particular)
  • Experience working with large-scale data systems or MLOps