Tecton

Tecton, founded in 2018 by the team that built the Michelangelo ML platform at Uber, is building a feature platform for production ML applications. Tecton’s platform expands on feature stores, incorporating data pipelines and infrastructure traditionally built by data engineers and making features self-serve for data scientists. Tecton emphasizes collaboration and acts as a central repository for valuable business signals, empowering data teams to discover and re-use existing features, govern access, and manage versioning. By integrating with existing data infrastructure such as Snowflake, Databricks, and AWS, Tecton aims to provide a best-of-breed solution, rather than requiring users to move to an end-to-end ML platform like AWS SageMaker.

Founding Date

Jan 1, 2019

Headquarters

San Francisco, California

Total Funding

$160M

Stage

Series C

Employees

51-100

Memo

Updated

August 3, 2023

Thesis

Data scientists are facing the same challenges in 2023 around the processes necessary to rapidly deploy, maintain, and monitor ML models that developers had 20 years ago deploying applications. This is not for lack of investment – AI adoption has doubled from 2017 to 2022, but 87% of data science projects still fail to make it to production. Even successful ML projects take on average 7 months to deploy. Developers have an entire toolchain to deploy and re-use code, and today software engineering teams that embrace DevOps practices are able to deploy 208x more frequently and 106x faster. However, data scientists still aren’t empowered to own the deployment and integration of their models, often having to “throw projects over the wall” to Data Engineers, ML Engineers, Product Engineers, and SREs, leading to months-long iteration cycles and organizational conflicts. MLOps, or Machine Learning Operations, brings DevOps to ML. The market is nascent and highly fragmented, estimated at $612 million in 2021 and projected to reach $6 billion by 2028.

Managing the inputs, or features, to ML models has become a critical component of MLOps systems. Amazon announced the SageMaker Feature Store in 2020, Google announced Feature Store as part of its ML platform Vertex AI in 2021, and Databricks announced its Feature Store in 2022. Today, data scientists often work locally to transform raw data into features that can be used in ML models. Getting these transformations into an at-scale, production-ready feature store is often an ad-hoc and time-consuming process with problems around security, collaboration, and standardization. With companies developing “production ML” applications, building these pipelines becomes even more challenging. Production ML applications such as Uber’s dynamic pricing or TikTok’s recommendations demand the latest, real-time features. Tecton CEO Mike Del Balso, in an interview with Contrary Research, said:

“Consumer experiences across every product, every piece of software that you interact with is going to be personalized, updated in real-time. It's going to be interactive and responsive. Everything from when you go to the doctor, to signing into your bank, to making a purchase online. And that's the world we think we're going to. And we think the kind of infrastructure that we're building is a key stepping stone to make it easy for people to get there.”

Tecton, founded in 2018 by the team that built the Michelangelo ML platform at Uber, is building a feature platform for production ML applications. Tecton’s platform expands on feature stores, incorporating data pipelines and infrastructure traditionally built by data engineers and making features self-serve for data scientists. Tecton emphasizes collaboration and acts as a central repository for valuable business signals, empowering data teams to discover and re-use existing features, govern access, and manage versioning. By integrating with existing data infrastructure such as Snowflake, Databricks, and AWS, Tecton aims to provide a best-of-breed solution, rather than requiring users to move to an end-to-end ML platform like AWS SageMaker.

Founding Story

Tecton was founded in 2018 by Mike Del Balso (CEO) and Kevin Stumpf (CTO). Before Tecton, Del Balso was a Product Manager and Stumpf was an Engineering Manager for Machine Learning Platforms at Uber, where they developed the Michelangelo ML platform. Before Uber, Del Balso was a PM at Google, where he worked on the ML systems for Google’s ad auctions and saw firsthand the MLOps required to run Google-scale systems. Stumpf founded Dispatcher, an “Uber for long-haul trucking,” before joining Uber Freight and later Uber’s ML platform team.

At Uber, Tecton’s founding team was tasked with working on an ML platform to drive automated decision-making for user-facing products, what Tecton now terms “production ML”. Tecton CEO Mike Del Balso, in an interview with Contrary Research, noted:

“In 2015, the initial mission at Uber was democratized machine learning. There were no good ML platforms on the market. ML infrastructure was super nascent at that time.”

The team worked “with dozens of teams tackling hundreds of business problems with ML, from user experiences like ETA prediction to operational decisions like fraud detection.” They discovered that “production ML,” or operational ML, use cases carried high business value but introduced a host of new challenges, such as production-level SLAs, the need for fresh data, and high stakes.

Tecton’s founding team noticed a pattern while trying to get these production ML use cases up to speed and deliver real business value:

“Across these efforts, we kept seeing the same pattern: data science teams would build promising offline prototypes, but would end up taking months (or quarters) of back-and-forth with engineering to actually get things fully “operationalized” and launched in production.”

This led to the development of Michelangelo, released in 2017, an end-to-end ML platform with an integrated feature store and model management systems. The project led to an “explosion of ML deployments” and came to support “the training and serving of thousands of models in production across the company.”

Michelangelo was an end-to-end platform, a design that was in vogue given other verticalized ML solutions such as DataRobot. However, the Tecton founders had a key insight: engineering requirements for ML systems in production were varied and modularization would be critical. Tecton CEO Mike Del Balso, in an interview with Contrary Research, said:

“What we learned at Uber Michelangelo was we need to modularize this thing. If you're the Uber Eats team or the self-driving car team, maybe you have your own special way to serve your models… So having these components be very modular and effectively breaking up the ML platform into different subcomponents that are plug-and-play was a pretty useful insight for us. The Uber team has published a blog post on this since then. But it's also what we've seen the industry converge to now. There's components, there's model serving system companies, there's monitoring companies, there's feature platform companies.”

One key component Michelangelo provided was a feature store, which managed a catalog of production-ready features and allowed teams to easily discover and re-use features in their own models. The need for it came from what the team had observed at Uber:

“Modelers spend most of their time selecting and transforming features at training time and then building the pipelines to deliver those features to production models. Broken data is the most common cause of problems in production ML systems.”

Tecton’s founders saw other companies building production ML systems had the same struggles with feature data. They noted that “most of that complexity originates in the data layer for ML (feature engineering, sharing, serving, monitoring, and much more), an area underserved by today’s ML tools.” After experiencing how feature stores transformed ML at Uber, they set out to democratize production ML.

Product

Feature Store

At the core of Tecton’s Feature Platform is the Feature Store, a component that stores and serves features, or inputs to machine learning models. A centralized feature store enables ML systems to scale across online and offline environments. By storing features across online inference environments and offline training environments, Tecton helps eliminate tricky bugs around “training-serving skew,” or when subtle differences between the features in the online vs. offline environment cause model performance to degrade.

An online store typically only stores the freshest features and can be scaled to serve thousands of requests per second. Tecton offers SLAs on availability and latency. An offline store typically stores large amounts of historical features in scalable data warehouses or data lakes such as Snowflake or S3. Tecton integrates with these providers and provides data scientists a notebook-friendly SDK.
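The split between the two stores can be illustrated with a toy, in-memory sketch. This is not Tecton's SDK or storage design; the class and method names are hypothetical, and real systems would back the two stores with a low-latency key-value store and a data warehouse or data lake. The point it shows is that one write path feeds both stores, so the value served online and the historical value used for training come from the same data.

```python
from bisect import bisect_right


class ToyFeatureStore:
    """Toy in-memory feature store (hypothetical names, for illustration only)."""

    def __init__(self):
        # Offline store: full history per (entity, feature), sorted by timestamp.
        self._offline = {}
        # Online store: only the freshest value per (entity, feature).
        self._online = {}

    def write(self, entity_id, feature, timestamp, value):
        key = (entity_id, feature)
        history = self._offline.setdefault(key, [])
        history.append((timestamp, value))
        history.sort(key=lambda row: row[0])
        # The online value always mirrors the latest row in the history.
        self._online[key] = history[-1][1]

    def get_online(self, entity_id, feature):
        """Freshest value, as a model would read it at inference time."""
        return self._online.get((entity_id, feature))

    def get_offline_as_of(self, entity_id, feature, timestamp):
        """Value that was current at `timestamp`, for building training sets."""
        history = self._offline.get((entity_id, feature), [])
        timestamps = [ts for ts, _ in history]
        idx = bisect_right(timestamps, timestamp)
        return history[idx - 1][1] if idx else None


store = ToyFeatureStore()
store.write("user_1", "txn_count_7d", timestamp=100, value=3)
store.write("user_1", "txn_count_7d", timestamp=200, value=5)

latest = store.get_online("user_1", "txn_count_7d")
historical = store.get_offline_as_of("user_1", "txn_count_7d", timestamp=150)
```

Here `latest` is the value written at timestamp 200, while `historical` is the value that was current at timestamp 150. Reading training data "as of" the time of each label is one common way to keep offline features consistent with what the model would have seen online.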

In November 2020, Tecton announced that it had become a core contributor to Feast, an open-source feature store, and that Willem Pienaar, creator of Feast, had joined the Tecton team. Feast was created in 2019 and was developed jointly by Gojek and Google Cloud. Microsoft, Agoda, Farfetch, and Zulily have used the project.

With regard to Feast, Tecton committed to feature serving, definition, and store parity, ensuring a seamless migration from Feast to Tecton. Tecton positions Feast as a self-managed, open-source tool for teams just getting started or teams that need customization, whereas Tecton offers a managed enterprise offering.

Feature Repository

Tecton’s feature repository brings DevOps to ML features. By defining features as code, teams can now adopt DevOps best practices around code reviews, lineage tracking, and CI/CD integrations. This helps data scientists iterate faster and more reliably. Tecton’s feature repository not only stores transformation logic, but also metadata such as feature name and description, and configuration around how frequently the feature is re-computed. This makes features discoverable and helps control compute costs.
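As a rough illustration of "features as code," the sketch below keeps a feature's transformation logic, descriptive metadata, and refresh schedule together in one definition that can be checked into version control and reviewed like any other code. The `FeatureDefinition` and `FeatureRegistry` names, their fields, and the example feature are all hypothetical; this is not Tecton's actual declarative API.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FeatureDefinition:
    """One feature, defined as code (hypothetical schema)."""
    name: str
    description: str
    owner: str
    refresh_interval: timedelta                 # how often the feature is re-computed
    transform: Callable[[List[float]], float]   # the transformation logic itself


class FeatureRegistry:
    """In-memory catalog that makes registered features discoverable."""

    def __init__(self):
        self._features: Dict[str, FeatureDefinition] = {}

    def register(self, feature: FeatureDefinition) -> None:
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} is already registered")
        self._features[feature.name] = feature

    def search(self, keyword: str) -> List[str]:
        """Return names of features whose name or description mentions keyword."""
        keyword = keyword.lower()
        return sorted(
            name
            for name, feature in self._features.items()
            if keyword in name.lower() or keyword in feature.description.lower()
        )


def mean_order_value(amounts: List[float]) -> float:
    return sum(amounts) / len(amounts) if amounts else 0.0


registry = FeatureRegistry()
registry.register(
    FeatureDefinition(
        name="avg_order_value_30d",
        description="Mean order amount over the last 30 days",
        owner="growth-team",
        refresh_interval=timedelta(days=1),
        transform=mean_order_value,
    )
)
matches = registry.search("order")
```

Because the refresh interval lives next to the logic, a reviewer can see in a single diff both what a feature computes and how often it will consume compute.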

Feature Pipelines

Tecton abstracts away the complexity of batch and streaming transformations, providing users with a consistent interface to define features. Streaming data is fresher than batch data and can provide significant lift to model performance. While many data science teams have existing procedures for batch transformations, managing transformations on streaming data is often difficult. Tecton stores the feature transformation logic on its platform and integrates with data sources to provide the transformation logic 1:1 to the data source’s processing engine.
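To make the batch-versus-streaming point concrete, here is a minimal sketch of one feature (a rolling 60-second sum of transaction amounts) computed two ways: by scanning stored history, and incrementally as events arrive. The function and class names are invented for illustration, the streaming path assumes events arrive in timestamp order, and none of this reflects how Tecton actually executes transformations. It only shows why a single definition with two consistent execution paths is useful.

```python
from collections import deque

WINDOW_SECONDS = 60  # assumed window length for this example


def window_sum_batch(events, as_of):
    """Batch path: scan stored (timestamp, amount) rows for the window ending at as_of."""
    return sum(
        amount
        for ts, amount in events
        if as_of - WINDOW_SECONDS < ts <= as_of
    )


class WindowSumStream:
    """Streaming path: maintain the same aggregate incrementally per event."""

    def __init__(self):
        self._events = deque()
        self._total = 0.0

    def update(self, ts, amount):
        self._events.append((ts, amount))
        self._total += amount
        # Evict events that have fallen out of the window ending at ts.
        while self._events and self._events[0][0] <= ts - WINDOW_SECONDS:
            _, expired = self._events.popleft()
            self._total -= expired
        return self._total


events = [(0, 10.0), (30, 5.0), (70, 2.0)]

stream = WindowSumStream()
stream_values = [stream.update(ts, amount) for ts, amount in events]
batch_values = [window_sum_batch(events, as_of=ts) for ts, _ in events]
```

In this example the two lists agree at every event. When separate teams implement the batch and streaming versions by hand, small differences in window boundaries are a typical source of the training-serving skew described earlier.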

Monitoring

Tecton supports operational monitoring for feature storage (availability, capacity, utilization) and metrics related to feature serving (throughput, latency, queries per second, error rates). Tecton also supports data quality monitoring, allowing data scientists to detect any changes in distribution or the quality of incoming data, and proactively detect model performance issues.
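One common way to detect a change in a feature's distribution is the Population Stability Index (PSI), which compares how values fall into buckets in a baseline sample versus a recent sample. The sketch below is a generic, pure-Python version; the source does not say which drift metrics Tecton uses, and the bucket count, the small floor for empty buckets, and the 0.25 alert threshold (a commonly cited rule of thumb) are all assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one feature."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)

DRIFT_THRESHOLD = 0.25  # assumed alerting threshold
drift_detected = psi_shifted > DRIFT_THRESHOLD
```

A check like this could run on each batch of incoming feature values, flagging a shift before it shows up as degraded model performance.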

Market

Customer

Tecton targets companies with relatively mature ML teams. Tecton CEO Mike Del Balso, in an interview with Contrary Research, noted that “Generally it's people who have needs for building production ML and they have fast data and they're completely on the cloud. So if you're on-prem, we just don't do that.”

Tecton prioritizes high-value, real-time, production ML applications, such as fraud detection, recommendation systems, and dynamic pricing. Del Balso went on to describe these use cases in an interview with Contrary Research:

“If this is not an important use case for you, it still takes a lot of work to put these ML systems in production. And if it's not important it's probably not going to be successful. So we spend time with people for whom machine learning-driven product experiences are a priority.”

Tecton CEO Mike Del Balso has said that “The top use case for us is recommendation… Anyone who has browsed products on an ecommerce site or scrolled through movie options on a content streaming platform knows the micro-frustrations resulting from systems that don’t recognize their most recent moves.”

These applications present greater technical challenges than analytical ML applications and often require orchestration of batch, streaming, and real-time data sources, requiring data engineers to support complex, decentralized pipelines. As Tecton CTO Kevin Stumpf notes, “we’ve talked to many teams who needed 2 data engineers to support 1 data scientist without a feature platform. A feature platform allowed them to more than flip that ratio.” Organizations experiencing this pain are motivated to move to a centralized ML platform.

Further indicators of ideal customers are those who:

  • Have experienced the hand-off process between data scientists and data engineers and the pain associated with re-implementing data pipelines for production.

  • Are deploying operational machine learning applications, which need to meet strict SLAs, achieve scale, and can’t break in production.

  • Have multiple teams that want to have standardized feature definitions and want to reuse features across models.

Tecton targets companies that have adopted cloud-based solutions, eschewing companies that are exclusively on-prem. Tecton doesn’t target a specific size, having partnered “with customers ranging from small growth-stage startups to large Fortune 50 companies, across North America and Europe.” Notable customers that use Tecton at production scale include Atlassian, HelloFresh, Tide, Vital, and FanDuel.

For many organizations, the default provider for a feature store or feature platform is their cloud provider, which may offer integrated ML services; however, some organizations are looking for best-of-breed solutions. Larger tech companies may already have built in-house feature stores or ML platforms, reducing the need for a solution like Tecton. ML researcher Chip Huyen notes that LinkedIn, Airbnb, Instacart, DoorDash, Snap, Stripe, Meta, Spotify, Uber, Lyft, Pinterest, Twitter, Gojek, and Etsy all have in-house feature stores or ML platforms. Tecton CEO Mike Del Balso, in an interview with Contrary Research, noted:

“Facebook or Google literally have hundreds of people working on the data infrastructure for the real-time ML application. We're basically building that team for everybody. Your favorite insurance company may not be able to hire hundreds of Google engineers for this, but they'll be able to work with Tecton and have those same capabilities.”

Tecton sells mainly to the engineering or data science leaders and has multi-month enterprise sales cycles. For example, at HelloFresh the data science team first had to gather requirements: “From February to March 2022, the groups engaged in an agile process designed to not only understand what part of the puzzle a feature store could solve, but also to create buy-in of a solution across the organization.” After comparing vendors, HelloFresh took 6 months to incrementally roll out Tecton to an initial set of eight teams.

Market Size

The market for MLOps is nascent and highly fragmented, estimated at $612 million in 2021 and projected to reach $6 billion by 2028. Gartner indicated in 2022 that it is tracking over 300 MLOps companies. Enterprises often use both a cloud provider and specific MLOps solutions, as Gartner Research indicated:

“Most enterprises create a hybrid strategy. That is, they use Amazon, Azure or Google as a backplane, depending on their enterprise center of gravity, and plug in components where capabilities might be missing or where they need hyper-customization or specialization.”

ML researcher Chip Huyen also counted 284 MLOps tools in 2020, of which 180 were startups. Most notable AI startups are application or vertical AI companies; for example, among the Forbes 50 AI startups in 2019, seven were tooling companies. Given the increasing focus on data-centric AI, more tools are being created for data labeling, management, and more. Of the MLOps startups that raised money in 2020, 30% were in data pipelines.

Competition

Build vs. Buy

Tecton’s biggest competitors are in-house solutions, as feature stores have become an industry standard component. The Tecton-supported open-source feature store Feast is used at Gojek, Shopify, Salesforce, Twitter, Postmates, and Robinhood. Feast is often good enough for customers just getting started with their ML infrastructure or needing customization.

Established technology companies often have in-house solutions. For example, Stitch Fix has open-sourced Hamilton, its own framework for data pipelines. ML researcher Chip Huyen notes that LinkedIn, Airbnb, Instacart, DoorDash, Snap, Stripe, Meta, Spotify, Uber, Lyft, Pinterest, Twitter, Gojek, and Etsy all have in-house feature stores or ML platforms.

However, building and maintaining an internal feature store is costly and often not a core competency. Tecton CEO Mike Del Balso, in an interview with Contrary Research, noted:

“So many of the use cases that we work on, they're not new use cases… A top fintech company is our customer, and they have four different versions of this system that they've tried to build at different points in time over the past ten years. And they have this graveyard of internal things that support something, but they hate it, and it's a big maintenance burden. And so they're throwing their hands up and just want to buy this thing. They’ll wonder why they have 20 people working on this when it's not a strategic thing.”

The cost equation makes more sense the more complex the feature platform requirements are. Organizations experiencing this pain may be motivated to move to a centralized ML platform. However, cost could slow down adoption and drive users to open-source solutions. For example, a Principal Machine Learning Engineer at a large logistics company noted:

“Tecton was checking all the boxes, except price… something like $10K a month per model. It was something completely cost prohibitive for us to use Tecton at the time. Maybe at some point in time, we could jump from Feast to Tecton if we needed to, and the ROI on having the vendor in-house would make sense.”

Cloud & Data Platforms

By default, customers may turn to existing cloud providers such as AWS or data warehouse/data lake providers such as Snowflake or Databricks. Major providers recognize the demand for feature stores. Amazon announced the SageMaker Feature Store in 2020, Google announced Feature Store as part of its ML platform Vertex AI in 2021, and Databricks announced its Feature Store in 2022.

These solutions don’t provide many of the capabilities Tecton has. For example, as of 2022, SageMaker doesn’t provide tooling for transforming streaming or real-time data, unification of historical data, or feature governance. However, these providers are attempting to build out critical parts of Tecton’s platform, such as streaming data pipelines. Pipelines enable real-time data, such as credit-card swipes or user logins, to flow into features and then models. Snowflake has the streaming tools Snowpipe and Materialized Views, and Databricks CEO Ali Ghodsi has said:

“We're seeing huge demand for streaming. It's finally real, and it's going mainstream. We're going to see it grow even more because there are so many low-hanging fruits we've plucked… More and more people are doing machine learning in production, and most cases have to be streamed.”

Database & Streaming Providers

Database or streaming solution providers aim to solve the core problem of building real-time feature pipelines and feature stores. However, they do not typically provide the other solutions around feature governance and orchestration that Tecton provides as a feature platform.

DataStax, founded in 2010, offers a NoSQL database solution based on Apache Cassandra and was valued at $1.6 billion in June 2022. DataStax’s streaming database powers real-time applications for customers like Home Depot, Verizon, and Capital One. DataStax has rebranded itself as a real-time AI company, announcing on January 12, 2023 a new mandate: “Real-Time AI for Everyone.”

Materialize, founded in 2019, offers a streaming SQL database, and raised a $60 million Series C in September 2021. Materialize aims to make it easy for businesses to use streaming data in real-time. Materialize’s streaming database is actively marketed as a real-time feature store.

Business Model

Tecton does not publish pricing, though some user interviews indicate that it likely charges a platform and implementation fee plus some usage-based component. The company has indicated that its gross margins are above 80%, compared to roughly 65% for companies like Snowflake. That suggests Tecton is likely charging for compute usage; under purely subscription contracts, gross margins would shrink as customers scale up.

Traction

Tecton doesn’t target a specific customer size, having partnered “with customers ranging from small growth-stage startups to large Fortune 50 companies, across North America and Europe.” Notable customers that use Tecton at production scale include Atlassian, HelloFresh, Tide, Vital, and FanDuel.

Tecton has said it has “hundreds of active users” and announced, as part of its Series C in July 2022, that over the previous year its customer base had grown 5x and its ARR had grown 3x. At the same time, some reports indicated that Tecton had less than $6 million in ARR.

Tecton has also been recognized as a Gartner Cool Vendor and became the first sponsor of the MLOps Community, which had grown to 7.5K members in 2021. In addition, Tecton hosts its own apply() conference, which brought together 2.5K ML practitioners in 2021 to discuss production ML.

Valuation

Tecton raised a $100 million Series C in July 2022, bringing its total raised to $160 million. The round was led by Kleiner Perkins, with participation from previous investors Andreessen Horowitz, Sequoia Capital, Bain Capital Ventures and Tiger Global. Notable strategic investors included Snowflake and Databricks.

Some reports later indicated that the round was done at a $900 million valuation. Based on the reported ~$6 million in ARR, that would represent a roughly 150x ARR multiple at the time. For comparison, later-stage companies Databricks and Snowflake were valued at 27x revenue in late 2022 and 23x LTM revenue as of August 2023, respectively.
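The multiple above is simple arithmetic, sketched here using only the reported figures from this section. The function name is made up for illustration, and both inputs are third-party reports rather than company-confirmed numbers.

```python
def revenue_multiple(valuation_musd: float, revenue_musd: float) -> float:
    """Valuation divided by annual revenue, both in millions of USD."""
    return valuation_musd / revenue_musd


# Reported figures: ~$900M valuation on ~$6M ARR at the July 2022 Series C.
tecton_multiple = revenue_multiple(900, 6)

# Reported comparison multiples cited in this section.
databricks_multiple = 27  # revenue multiple, late 2022
snowflake_multiple = 23   # LTM revenue multiple, August 2023

premium_vs_databricks = tecton_multiple / databricks_multiple
premium_vs_snowflake = tecton_multiple / snowflake_multiple
```

On these inputs, Tecton's implied multiple is more than five times that of either comparable. Since the reports put ARR at less than $6 million, the true multiple would have been somewhat higher still.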

Key Opportunities

Rise In Real-Time ML

The pressure to personalize customer experiences and accelerate decision-making is pushing companies to invest in real-time ML applications. Tecton’s focused execution could help them become a go-to provider in this space. Tecton CEO Mike Del Balso, in an interview with Contrary Research, noted:

“This isn’t some offline analytical machine learning exercise. If it breaks, you can’t just hit retry. This domain of software that we're talking about, that's not acceptable. There's no retry, there's nobody in the loop. Running in that domain is substantially different from what everyone else is doing, but there's a growing demand for it and the market is moving to that. It's characterized by scale, speed, real-time, in production, and reliability. And there's fundamentally different technical needs to support that, different architectural patterns and different things that people are actually doing with machine learning for those use cases.”

DataStax’s “State of the Data Race 2022” report surveyed over 500 CIOs and technology leaders and found that 78% of respondents described real-time data as a “must-have.” As Forbes noted when DataStax acquired Kaskada, a real-time ML company, “The Real-Time AI Data Race Is On.”

Becoming A Broader Data & ML Platform

With the feature platform, Tecton now owns some of the highest value signals in a business, and the logic to create them. Tecton could extend further up the stack into model management and retraining, enabling powerful new ML paradigms such as continual learning, where models continuously adapt to incoming data. Tecton could also further extend into the experimentation and feature engineering phase, where companies like Weights and Biases now focus.

Tecton CEO Mike Del Balso, in an interview with Contrary Research, noted:

“It's kind of obvious that if you need a cloud data platform, you're pretty confident that if you use a Snowflake or a Databricks, that could probably solve most of your problems. We're going to get to the point where if you're building a data application with ML in production, it's just going to be obvious that you use Tecton. So that involves expanding beyond the feature platform and doing it very intentionally.”

Key Risks

Early Innings for Real-Time

Bain Capital Ventures Partner Aaref Hilaly, an investor in Tecton, notes “It’s just a matter of time until organizations, large and small, integrate real-time predictive applications into their everyday operations.” But that evolution may not happen imminently.

Tecton runs the risk that real-time AI adoption is still early in its lifecycle. Even the largest, most sophisticated technology companies have needed five or more years to migrate from batch to real-time solutions. As Kaskada’s co-founder and CEO noted in January 2023, “just recently, industry leaders in this space have concluded that real-time ML is the way to go, something that required 5-7 year journeys from batch to real-time ML for leaders like Netflix and Instacart.”

Vendor Consolidation

Given the economic downturn in 2022, customers started looking to consolidate vendors. As a result, some customers may stick with existing providers. The former CTO of one startup noted:

“Because [our company] was early in its journey, we tried to put everything into SageMaker… And one of the goals that I took on was to actually reduce the number of tools and consolidate the platforms we were using. So that was actually one of the drivers for not using any more new tools. We wanted to leverage fewer tools to get kind of the economies of scale benefit from a cost perspective.”

Summary

Tecton aims to make it easier to get business-critical, real-time ML systems into production with its unified feature platform. The pressure to personalize customer experiences and combat issues like fraud means more companies than ever are looking to develop real-time AI applications. Tecton aims to be a trusted partner and key provider in this space by centralizing, managing, and orchestrating the highest-value signals, the features used as inputs into ML models.

Tecton is not alone in recognizing this opportunity, and is competing in a nascent and fragmented MLOps market with over 300 providers. Cloud platforms such as AWS SageMaker and data platforms like Databricks are also building feature stores that may be “good enough” for customer needs. Tecton’s focused execution will be necessary to stand out in a crowded field.

Disclosure: Nothing presented within this article is intended to constitute legal, business, investment or tax advice, and under no circumstances should any information provided herein be used or considered as an offer to sell or a solicitation of an offer to buy an interest in any investment fund managed by Contrary LLC (“Contrary”) nor does such information constitute an offer to provide investment advisory services. Information provided reflects Contrary’s views as of a time, whereby such views are subject to change at any point and Contrary shall not be obligated to provide notice of any change. Companies mentioned in this article may be a representative sample of portfolio companies in which Contrary has invested in which the author believes such companies fit the objective criteria stated in commentary, which do not reflect all investments made by Contrary. No assumptions should be made that investments listed above were or will be profitable. Due to various risks and uncertainties, actual events, results or the actual experience may differ materially from those reflected or contemplated in these statements. Nothing contained in this article may be relied upon as a guarantee or assurance as to the future success of any particular company. Past performance is not indicative of future results. A list of investments made by Contrary (excluding investments for which the issuer has not provided permission for Contrary to disclose publicly, Fund of Fund investments and investments in which total invested capital is no more than $50,000) is available at www.contrary.com/investments.

Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by Contrary. While taken from sources believed to be reliable, Contrary has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Please see www.contrary.com/legal for additional important information.

Authors

Alex Liu

Senior Fellow

© 2024 Contrary Research · All rights reserved