Cockroach Labs

Cockroach Labs is a software company that develops a cloud-native, distributed SQL database for modern cloud applications. CockroachDB combines the horizontal scalability of NoSQL databases with the transaction support of traditional SQL databases into one platform. CockroachDB is used for online transactional processing (OLTP) and application development.

Founding Date

Feb 1, 2015

Headquarters

New York, New York

Total Funding

$633M

Stage

Series F

Employees

251-500

Memo

Updated

September 14, 2023

Reading Time

18 min

Thesis

In 1970, Edgar F. Codd, an English computer scientist at IBM, introduced a new way to model data, describing a ‘relational database’ as a series of cross-linked tables. In 1979, Oracle brought the first relational database to the commercial market and was soon followed by DB2, SAP, Informix, and others. Throughout the 1980s and 1990s, relational databases became commonplace, and data scientists relied on the Structured Query Language (SQL) to ask the database for what they wanted. However, these databases were designed before the internet reached wide adoption and thus were not architected for scale. As millions of internet users came online in the 2000s, databases encountered issues because the volume of data being produced was more than any single server could handle. As a result, by the late 2000s, developers shifted to NoSQL databases and moved from a single database server to a cluster of server nodes.

NoSQL databases scale horizontally, meaning developers add additional computers rather than adding additional resources to a single system (“vertical scaling”). The tradeoff is that horizontally-scaled NoSQL databases can compromise data integrity, have limited functionality, and limit indexes, potentially causing queries to return inaccurate data. For example, when Google switched to NoSQL for AdWords, the limitations of NoSQL databases frustrated many of its engineers. Google responded in 2012 by developing Spanner, the first horizontally scalable SQL database. While developments have been made in the architecture of databases, the issue of scalability is still relevant in 2023: the number of internet users has reached more than 5 billion, with the average user online for almost seven hours per day. Further, databases are transitioning to the cloud to eliminate the need for physical infrastructure, creating an opening for newer developments in database infrastructure.

Inspired by Spanner, Cockroach Labs developed CockroachDB, a horizontally scalable SQL cloud database. CockroachDB aims to have the functionality of vertically-scaled SQL databases, the scalability of horizontally-scaled NoSQL databases, and the benefits of cloud infrastructure. The database is suitable for online transactional processing (OLTP) rather than online analytical processing (OLAP). While OLAP is built for complex analytics, OLTP powers everyday business operations such as payments and reservations. With the goal of combining high data integrity and horizontal scalability, the company aims to compete against both traditional SQL databases and NoSQL databases. Overall, CockroachDB is aiming to solve the complex database scalability issues relevant in the next decade and assist customers with the transition to the cloud.

Founding Story

Cockroach Labs was founded in 2015 by Spencer Kimball (CEO), Peter Mattis (CTO), and Ben Darnell (Chief Architect). Kimball and Mattis met at UC Berkeley in the 1990s. They worked together as students to create an open-source graphics program, GIMP. The program was so successful that when Kimball and Mattis joined Google in 2002, Larry Page and Sergey Brin stopped by in person to tell the new hires that they had used GIMP to create the first Google logo.

Kimball and Mattis became rising stars at Google and the go-to members of the infrastructure team. During their tenure, they met Ben Darnell. While the three could have remained in lucrative careers at Google, they shared a mutual ambition to start their own company. Mattis noted that he was “too comfortable [at Google]. I didn’t feel like I was being challenged and it felt like I was stagnating.”

While still at Google, Kimball, Mattis, and Darnell realized that databases were struggling with the enormous amount of AdWords data. Google attempted to address its data overload issues by creating Spanner, a globally distributed relational database that inspired CockroachDB. During their time at Google, the founders of Cockroach Labs also regularly interacted with the team that was building Spanner.

In 2012, Kimball, Mattis, and Brian McGinnis founded Viewfinder, a mobile application for photo-sharing. During their tenure at Viewfinder, Kimball and Mattis ran into database issues similar to those they had previously seen at Google. At this time, the two had the initial idea for what would become CockroachDB. However, they were not convinced that there would be a market for databases, so they shelved the idea and used Amazon DynamoDB.

In 2013, Square (now Block) acquired Viewfinder. At Square, the team again ran into the same database issues. Kimball and Darnell began developing open-source database technologies. The two published an open-source document describing a global relational database with transaction support. After receiving VC interest in their open-source work, Kimball and Darnell incorporated Cockroach Labs in 2015 based on the ideas expressed in the document. A few months later, Mattis left Square to join the newly formed company as a co-founder.

Kimball named the company after cockroaches in 2012 while at Viewfinder, when the team had the initial idea for the company.

“I imagined [the right database] would be composed of symmetric nodes, require no external dependencies, spread itself naturally across availability zones for survival. Each node would autonomously replicate and repair data. These were the capabilities that led me to the name “cockroach”, because they’ll colonize the available resources and are nearly impossible to kill.”

Product

CockroachDB Architecture

CockroachDB’s architecture consists of a distributed cluster of nodes. Each node stores a subset of data and replicates data to other nodes to preserve it. Communication between nodes ensures resilience and allows any node to receive any request. The company has also documented the technical details of CockroachDB.

Geographic Data Partitioning

Geographic data partitioning reduces latency by placing users’ data near them. Consider a user based in Iowa with their data in either Dallas or Singapore. Latency will be much higher if their data is in Singapore because of the greater distance.

As part of a test on Google Cloud Platform’s network, round-trip latency from Iowa to Dallas averaged 17 ms compared to 245 ms from Iowa to Singapore. With CockroachDB’s geo-partitioning, companies can store the user’s data in Dallas instead of Singapore, resulting in lower latency and a better user experience.
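The placement decision in this example can be sketched in a few lines of Python. This is only an illustration of the idea of choosing the lowest-latency region, using the two round-trip figures quoted above; the region labels and the function name are hypothetical and do not reflect how CockroachDB itself is configured.

```python
# Round-trip latencies (ms) from Iowa, taken from the test cited above.
# The region labels are illustrative, not CockroachDB identifiers.
LATENCY_MS_FROM_IOWA = {"dallas": 17, "singapore": 245}


def nearest_region(latency_ms: dict[str, float]) -> str:
    """Return the region with the lowest round-trip latency."""
    if not latency_ms:
        raise ValueError("at least one region is required")
    return min(latency_ms, key=latency_ms.get)


home = nearest_region(LATENCY_MS_FROM_IOWA)
saved = LATENCY_MS_FROM_IOWA["singapore"] - LATENCY_MS_FROM_IOWA[home]
print(home, saved)  # dallas 228
```

Under these numbers, keeping the Iowa user's data in Dallas saves roughly 228 ms on every round trip compared with Singapore.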

Geo-partitioning also eases compliance with data privacy regulations, such as the EU’s GDPR. One provision of GDPR requires companies to store the data of EU citizens either in the EU or in an approved country with similar data privacy protections. CockroachDB’s geo-partitioning enables compliance by giving users control of where data is stored. Companies can avoid infringement fines of up to €20 million or 4% of global annual revenue by using CockroachDB.

SQL Interface

SQL has existed since the 1970s and features simple yet powerful syntax. Over 50% of all developers used SQL as of September 2022, which made it the second most used programming language after JavaScript. SQL is also the most popular database language as of February 2023. CockroachDB’s SQL interface reduces the friction for companies to switch to CockroachDB.

CockroachDB’s SQL dialect supports the PostgreSQL wire protocol and over half of its syntax, meaning applications built on PostgreSQL can often be migrated to CockroachDB without changing the application code. This compatibility is important because PostgreSQL is the second most popular database among developers. However, some CTOs report that CockroachDB’s compatibility is less than ideal compared to using PostgreSQL itself.

Transactions

CockroachDB supports consistent ACID (Atomicity, Consistency, Isolation, and Durability) transactions. Support for ACID transactions makes CockroachDB suitable for online transactional processing (OLTP). OLTP queries involve small amounts of data and power everyday business operations. Examples of OLTP tasks include storing a single customer’s purchase or retrieving their email address. CockroachDB handles OLTP queries by executing single-row reads in under 2ms and single-row writes in under 4ms.
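To make the atomicity guarantee concrete, here is a minimal sketch of an all-or-nothing money transfer. It uses Python's built-in sqlite3 module purely as a single-node stand-in so the example runs anywhere; it does not connect to CockroachDB, and the accounts table and transfer function are hypothetical.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above
        # was rolled back together with the failed debit.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

first = transfer(conn, "alice", "bob", 100)   # empties alice's account
second = transfer(conn, "alice", "bob", 50)   # would overdraw, so it is undone
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(first, second, balances)  # True False {'alice': 0, 'bob': 100}
```

The second transfer credits bob before the debit fails, yet bob's balance stays at 100: either both updates apply or neither does.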

In contrast, OLAP (online analytical processing) involves analyzing huge datasets to extract business insights on data warehouse platforms like Snowflake. For instance, a company may aggregate customer purchases over the past few quarters to analyze revenue changes. OLTP’s simpler queries may require as little as one row of data whereas OLAP’s complex queries involve many rows of data. Due to this, CockroachDB is not suitable for OLAP.

Availability

The number of failures a CockroachDB deployment can tolerate is equal to (replication factor - 1) / 2. The replication factor is the number of different nodes that hold a copy of each piece of data. For example, with a replication factor of three, one copy of a range of data can fail with no permanent data loss. When a node fails, the cluster automatically repairs itself by replicating the failed node’s data to other nodes. CockroachDB’s architecture also enables zero-downtime database updates and schema changes, two features lacking in traditional relational databases.
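The rule above is easy to check with a short calculation. The function name below is hypothetical, and rounding down for even replication factors is an assumption, since the text only gives an odd-numbered example.

```python
def tolerable_failures(replication_factor: int) -> int:
    """Apply the (replication factor - 1) / 2 rule, rounding down."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    # Integer division makes the rounding explicit for even factors.
    return (replication_factor - 1) // 2


for rf in (3, 5, 7):
    print(rf, tolerable_failures(rf))
# 3 1
# 5 2
# 7 3
```

Each additional failure to be tolerated costs two more replicas, which is why replication factors are usually odd.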

Scalability

CockroachDB scales horizontally, whereas traditional SQL databases scale vertically. Horizontal scaling refers to adding computers (nodes) to support heavier usage. Vertical scaling involves adding resources, such as processing power and memory, to a single node to run queries.

Vertical scaling faces limitations because only so many resources can be added to a single node. Horizontal scaling becomes necessary to overcome the limits of vertical scaling. Traditional SQL databases can scale horizontally through sharding, but doing so adds complexity. With CockroachDB, users can scale horizontally without adding that complexity.
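One source of that complexity can be sketched as follows. The example assumes a simple application-level scheme in which each key is hashed and routed to one of N database servers; the routing rule, key names, and function name are hypothetical and are not taken from any particular database.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard with a stable hash (hypothetical routing rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


keys = [f"customer-{i}" for i in range(1000)]
before = {k: shard_for(k, 3) for k in keys}  # three database servers
after = {k: shard_for(k, 4) for k in keys}   # a fourth server is added
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys must move to a different shard")
```

With this kind of modulo routing, adding a single server reassigns most keys, so the application team has to plan and run a data migration in addition to maintaining the routing logic itself.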

Vendor Flexibility

Users can deploy CockroachDB on any cloud platform, unlike databases offered directly by cloud vendors such as AWS Aurora and GCP Spanner. CockroachDB can also be deployed across multiple clouds at once, such as AWS, GCP, and Azure. Multi-cloud deployments prevent dependence on a single cloud provider (i.e. vendor lock-in).

With vendor lock-in, companies face high switching costs when moving to another cloud vendor. The vendor can raise prices, lower service quality, or change their offerings altogether without being concerned about locked-in customers leaving.

Deployment Options

CockroachDB Serverless: CockroachDB Serverless scales automatically to meet database demand. The product is fully managed and has automatic transactional database capacity scaling, high availability, and data protection. However, it does not offer geographic data partitioning since it lacks dedicated computers. The auto-scaling and free tier make it ideal for small-scale projects and individual developers.

CockroachDB Dedicated (Database-as-a-Service): CockroachDB Dedicated, the Database-as-a-Service (DBaaS) product, offers a dedicated, fully managed cloud cluster. Users can select the size and cloud provider of their choice (AWS or GCP). Scaling is node-based and self-service. Service availability is guaranteed at 99.99% uptime, with configurable data replication within or across regions, and is backed by Cockroach Labs’ SRE team. CockroachDB Dedicated could be considered by enterprises with varying workloads, applications that need to grow and scale over time, and applications that require real-time integration with other systems.

CockroachDB Advanced: CockroachDB Advanced has all of the same features as the Dedicated deployment, but in addition is PCI-compliant. The additional features to make the Advanced deployment PCI-compliant include an audit log, customer-managed encryption keys, egress perimeter controls, single sign-on authentication, and network security. The Advanced deployment option is ideal for applications that require PCI-readiness.

CockroachDB Self-Hosted: Customers can self-host the full-featured enterprise version of CockroachDB. These organizations can deploy CockroachDB on their own infrastructure or in the cloud. Large enterprises, like banks or government organizations facing regulatory requirements to store data on-premise, may consider using CockroachDB Self-Hosted.

Market

Customer

Cockroach Labs’ ideal customers are enterprises that process large transaction volumes and can’t afford to have downtime. Examples include multinational financial institutions and online retailers. Notable customers include Lush Cosmetics, Shipt, Comcast, and Hard Rock Digital.

CockroachDB’s ACID compliance and geographic data partitioning target businesses processing financial transactions. ACID compliance helps process financial transactions without mistakes. For instance, sequential transaction processing prevents transferring money from a recently emptied bank account.

Traditional SQL databases also offer ACID transactions but typically do not match CockroachDB’s geo-partitioning. Geo-partitioning is an essential feature because of global companies’ widespread user bases. It ensures prompt responses to user requests regardless of where they originate. It also eases compliance with data storage regulations.

Market Size

The global relational database management system market reached $68.6 billion in 2022. The market is forecast to grow at a CAGR of 11.7% to $133.4 billion by 2028. Rapid growth in the cloud segment equalized revenue from cloud and on-premise databases. The two largest drivers of industry growth are the growing importance of cloud databases and the increasing importance of data to enterprises.
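As a sanity check on the forecast, the compound annual growth rate implied by the two market figures can be recomputed directly. The helper below is a generic CAGR formula, not something taken from the cited forecast.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.117 means 11.7%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the paragraph above: $68.6B in 2022, $133.4B by 2028.
rate = cagr(68.6, 133.4, years=2028 - 2022)
print(f"{rate:.1%}")  # 11.7%
```

The result matches the stated 11.7%, so the start value, end value, and growth rate are internally consistent.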

NoSQL databases grew faster than relational databases from 2013 to 2023. NoSQL databases’ superior scalability and flexibility relative to traditional SQL databases contributed to driving NoSQL adoption. The chart below shows that among NoSQL databases, graph databases increased by over 12x, and document stores increased by over 5x over the last decade:

Source: DB Engines

As of July 2023, relational databases still dominate the market with nearly 72% market share. The rise of horizontally scalable distributed SQL databases may deter further NoSQL adoption, creating more opportunities for Cockroach Labs to grow.

Competition

Direct Competitors

PlanetScale: PlanetScale offers a MySQL-compatible serverless database. The company built the database on Vitess, an open-source technology that horizontally scales MySQL. MySQL is the most popular SQL dialect. A few prominent companies using Vitess include YouTube, Slack, and Square. In November 2021, PlanetScale announced that it had raised a $50 million Series C at an undisclosed valuation led by Kleiner Perkins. The company was founded in 2018 and has raised approximately $105 million in funding.

Yugabyte: Yugabyte’s YugabyteDB shares key features with CockroachDB, such as PostgreSQL compatibility, distributed ACID transactions, and geographic data partitioning. Cockroach Labs and Yugabyte have each written about why their own database performs better than the other’s. Each database outperforms the other on certain tasks. Yugabyte claims its database delivers 3.5x higher throughput and 3x lower latency compared to CockroachDB. Cockroach Labs claims its database produces fewer errors and can perform better across all possible workloads compared to YugabyteDB.

In October 2021, Yugabyte raised a $188 million Series C led by Sapphire Ventures. This funding round valued the company at over $1.3 billion. The company was founded in 2016 and has since raised $291 million in total funding.

PingCAP: PingCAP’s TiDB is a MySQL-compatible database. It supports both transactional and analytical use cases, making it a hybrid transactional and analytical processing (HTAP) database. Without HTAP, companies must maintain separate OLTP and OLAP systems; to perform complex analytics on data stored in the transactional database, the company must first copy it to an analytical database in a process called extract, transform, and load (ETL).

The time required for the ETL process delays analytics. HTAP databases eliminate the need to maintain a separate database and enable real-time analytics. Applications involving fraud detection, recommendation engines, and internet of things (IoT) devices often leverage real-time analytics and can benefit from HTAP databases. In November 2020, PingCAP announced the closing of a $270 million Series D at an undisclosed valuation led by GGV Capital, Access Technology Ventures, Anatole Investment, Jeneration Capital, and 5Y Capital. Since founding in 2015, the company has raised a total of approximately $342 million.

Traditional Relational Databases

Proprietary relational databases (Oracle Database, Microsoft SQL Server) and open-source relational databases (MySQL, PostgreSQL) run on single nodes. These platforms scale vertically by adding computing resources instead of horizontally by adding additional nodes.

Although they offer transactions and a SQL interface, they possess several drawbacks compared to CockroachDB. They require manual repair and offline upgrades/schema changes, leading to expensive downtime; Gartner estimates IT downtime to cost $5.6K per minute. Another drawback is the lack of geo-partitioning, resulting in high latency for users far away from the database.

Cloud Platforms

Cloud platforms like Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure offer various types of databases, including distributed SQL databases, traditional SQL databases, and NoSQL databases. If customers use one of the popular cloud platforms for other tech needs, they may be more likely to use it for their database. For example, GCP offers the proprietary Spanner that inspired CockroachDB. However, Spanner can only run on GCP computers, whereas CockroachDB can run on any computer, which provides companies with more flexibility.

Non-Relational Databases (NoSQL)

NoSQL databases handle unstructured data better than relational databases due to not enforcing a strict schema. Similarly to CockroachDB, they are distributed and horizontally scalable, and some, like MongoDB, offer geo-partitioning. However, ACID transactions are often not supported, and retrieved data is not always guaranteed to reflect the latest update. Not receiving the latest update causes relevant parties to get outdated information. Examples of NoSQL options include RavenDB, Redis (key-value database), Apache Cassandra (columnar database), MongoDB (document-oriented database), and Neo4j (graph database).

Business Model

Cockroach Labs operates on a freemium model and charges customers based on usage. The cloud business (Dedicated and Serverless) grew 500% in the last quarter of 2021 alone and accounts for over 50% of the company’s customers. Besides selling directly to companies, Cockroach also partners with resellers, including DataHub, Infofabrica, and Untwine AI.

The company offers the following pricing tiers:

CockroachDB Serverless: CockroachDB Serverless’s free tier includes 10 GiB of storage and 50 million request units per month. Beyond that, the company charges $0.50 per additional GiB of storage and $0.20 per additional million request units per month.
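The Serverless pricing above lends itself to a quick estimate. The sketch below assumes simple linear overage billing on top of the free tier, with no rounding rules, minimums, or discounts; the function and constant names are hypothetical, and actual invoices may be calculated differently.

```python
FREE_STORAGE_GIB = 10            # free tier storage, from the text
FREE_REQUEST_UNITS_M = 50        # free tier request units, in millions
PRICE_PER_EXTRA_GIB = 0.50       # USD per additional GiB
PRICE_PER_EXTRA_MILLION = 0.20   # USD per additional million request units


def serverless_monthly_cost(storage_gib: float, request_units_m: float) -> float:
    """Estimate the monthly overage charge in USD beyond the free tier."""
    extra_storage = max(0.0, storage_gib - FREE_STORAGE_GIB)
    extra_requests = max(0.0, request_units_m - FREE_REQUEST_UNITS_M)
    return (extra_storage * PRICE_PER_EXTRA_GIB
            + extra_requests * PRICE_PER_EXTRA_MILLION)


print(f"${serverless_monthly_cost(25, 120):.2f}")  # $21.50
print(f"${serverless_monthly_cost(8, 30):.2f}")    # $0.00
```

For instance, 25 GiB of storage and 120 million request units would cost 15 x $0.50 plus 70 x $0.20, or $21.50 per month under these assumptions, while usage inside the free tier costs nothing.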

CockroachDB Dedicated: CockroachDB Dedicated’s pricing begins at $295 per month. This includes 15 GiB to 10K storage per available node. As of September 2023, users can choose between AWS, GCP, and Azure. The AWS and GCP options charge more for computing, input/output operations per second, and storage, making Azure the cheapest option. In addition, cloud region can influence the starting price per hour.

CockroachDB Self-Hosted: Cockroach Labs does not publicly provide pricing for CockroachDB Self-Hosted on its website. Most customers choosing self-hosted deployment are likely large enterprises, where pricing is custom and negotiable for each client.

Traction

In 2021, Cockroach Labs tripled its annual recurring revenue, with over 200 customers and 10K deployed clusters. Some sources estimate the company’s database market share at approximately 0.1% and indicate that this share is becoming more globally dispersed. EMEA revenue grew 600% YoY in the first quarter of 2022 and was projected to comprise 20% of total company revenue by the end of 2022.

In April 2022, the company announced plans to open new offices in Europe, the Middle East, and Africa (EMEA). In June 2023 the company opened engineering offices in India and announced plans to expand its go-to-market to Asia Pacific and Latin America. Notable customers of Cockroach Labs are Netflix, Bose, Shipt, and Nubank. However, some companies use CockroachDB for particular use cases, not the core product. For example, Netflix uses CockroachDB to store users’ device information.

Valuation

As of September 2023, Cockroach Labs has raised a total of approximately $633 million. Most recently, the company raised a $278 million Series F at a $5 billion valuation in December 2021. The Series F was led by Greenoaks, and other notable investors include Benchmark, Index Ventures, and Sequoia. Cockroach Labs more than doubled its valuation in 2021 after starting the year with a $160 million Series E at a $2 billion valuation led by Altimeter Capital. At the time of the company’s Series F, Neil Mehta, the founder and managing partner of Greenoaks, noted:

“Cockroach Labs is at the forefront of this opportunity for transactional data. They are an innovator and leader, fundamentally reshaping the way enterprises manage their data by offering a remarkably easy—and dramatically cheaper—way to operate the database for critical applications. We look forward to seeing how they take charge of the market.”

Key Opportunities

Transition to Cloud Databases

Databases are often sticky products with high switching costs. However, many companies with non-cloud databases are looking to transition to the cloud. Cloud databases provide scale, resilience, and geographic data distribution. They also enable developers to focus on core products instead of database infrastructure.

The cloud database market is projected to grow at a 24.9% CAGR from $10.1 billion in 2022 to $47.9 billion by 2029. Cloud database vendors such as Cockroach Labs can acquire new customers during the transition to the cloud. Once the cloud transition is complete, the high switching costs will increase the lifetime value of Cockroach Labs’ customers.

Rising Geographic Dispersion

As more parts of the world come online, they will continue to generate greater volumes of data. That growth brings opportunities to address compliance and performance issues for online applications.

Source: Statista

CockroachDB capitalizes on the rising geographic distribution of internet users through its geographic data partitioning feature. It enables a low-latency experience for users regardless of location while also offering compliance with data privacy regulations.

Key Risks

Data Security Requirements

With cloud databases, businesses rely on cloud vendors for robust security policies and technology. Businesses may prefer on-premise databases for direct control over their data security. They may especially want to hold sensitive data, such as intellectual property or personal information, to higher security standards.

Data security regulations can also make the cloud transition more difficult. In 2020, the US Federal Financial Institutions Examination Council (FFIEC) issued guidelines for financial institutions to manage cloud data security risks. The guidelines could become formal regulations, making it more difficult for Cockroach Labs and other cloud database vendors to acquire customers.

Data Migration Challenges

Data migration requires significant time and money, which may deter companies from switching databases. As of 2021, 75% of cloud migrations go over budget and 38% finish behind schedule. Companies are expecting demand for 1 million more cloud developers by 2024; the expected labor shortage may further increase data migration costs and delay cloud transition in the database market.

Summary

Cockroach Labs has built CockroachDB, a cloud-based distributed SQL database. CockroachDB combines the horizontal scalability of NoSQL databases with the transaction support and data integrity of traditional SQL databases. The company provides low-latency user experiences, especially for international users, and assists with data privacy regulation compliance. With databases transitioning to the cloud, Cockroach Labs is positioned for growth. The cloud database market is rapidly growing and could reach $47.9 billion by 2029. If the company succeeds, the database market’s traditional stickiness (high switching costs) could prove a powerful moat.

Disclosure: Nothing presented within this article is intended to constitute legal, business, investment or tax advice, and under no circumstances should any information provided herein be used or considered as an offer to sell or a solicitation of an offer to buy an interest in any investment fund managed by Contrary LLC (“Contrary”) nor does such information constitute an offer to provide investment advisory services. Information provided reflects Contrary’s views as of a time, whereby such views are subject to change at any point and Contrary shall not be obligated to provide notice of any change. Companies mentioned in this article may be a representative sample of portfolio companies in which Contrary has invested in which the author believes such companies fit the objective criteria stated in commentary, which do not reflect all investments made by Contrary. No assumptions should be made that investments listed above were or will be profitable. Due to various risks and uncertainties, actual events, results or the actual experience may differ materially from those reflected or contemplated in these statements. Nothing contained in this article may be relied upon as a guarantee or assurance as to the future success of any particular company. Past performance is not indicative of future results. A list of investments made by Contrary (excluding investments for which the issuer has not provided permission for Contrary to disclose publicly, Fund of Fund investments and investments in which total invested capital is no more than $50,000) is available at www.contrary.com/investments.

Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by Contrary. While taken from sources believed to be reliable, Contrary has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Please see www.contrary.com/legal for additional important information.

Authors

Rohan Gupta

Fellow

Spencer Stewart

Contributor

© 2024 Contrary Research · All rights reserved
