Thesis
Developer productivity defines the speed at which products can be built. Increasingly, leading tech companies such as Uber, Amazon, Facebook, and GitLab are looking to track and improve developer productivity by understanding dependencies, identifying the root cause of outages, onboarding new engineers, and eliminating meeting time spent gathering context.
The market for software development tools is projected to be worth $733.5 billion by 2028, with key growth drivers including (1) demand for work efficiency, (2) a growing trend toward cloud-based solutions, and (3) increased use of Internet of Things (IoT) technology.
Enter Sourcegraph. Founded in 2013, Sourcegraph offers universal code search, batch changes, and queryable code insights to improve developer productivity and happiness. Its cloud and self-hosted products allow engineers to search across every repository and code host, improving the developer experience and velocity.
Founding Story
Sourcegraph co-founder and CEO Quinn Slack taught himself how to code at the age of nine. As an autodidact he was in the habit of figuring things out for himself instead of asking others for help, and learned how to code by reading code on the internet. He set up code search tools to browse code and see example use cases, and loved that he could learn that way.
CTO Beyang Liu, Sourcegraph’s other co-founder, started his career as an engineer at Google. Liu was accustomed to Google’s internal code search product, appropriately named Code Search, and assumed everyone used code search for learning and faster development.
Slack and Liu met while working together at Palantir, where every time they wanted to understand how a specific API worked, change something, or understand what could break, they had to set up a meeting. It was then that the founders realized access to code search wasn’t the standard. As a result, they decided to start Sourcegraph in 2013 with the goal of bringing code search to every company and developer.
Product
Sourcegraph has several key products:
Code Search: Search across multiple repositories and languages.
Batch Changes: Bulk changes across repositories.
Code Insights: Queryable insights directly in a codebase.
Cloud: A dedicated Sourcegraph Cloud instance with no manual deployment.
Code Search
Code Search is Sourcegraph’s core product. It gives users the ability to connect and search across thousands of repositories, finding code snippets in any host, language, or repository. This provides three main functionalities:
Find and fix code snippets in any host, language, or repository.
Understand code and its dependencies faster, enabling quicker code reviews and faster root-cause analysis through tracked dependencies.
Create and embed documentation that references live code.
Batch Changes
Batch Changes gives users the ability to make bulk code changes by using universal code search to find the affected code and then defining the changes programmatically in a declarative specification file. Sourcegraph also tracks each change from creation to merge and displays its status in the UI. Common use cases include refactors, migrations, and security modifications.
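To make the "search, then declare the change" idea concrete, here is a toy sketch in Python. It is not Sourcegraph's implementation or its spec format: the `BatchSpec` fields, the `plan_changes` helper, and the in-memory `REPOS` data are all hypothetical, chosen only to show how a single declarative description can be applied across every repository that matches a search.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for repositories: repo name -> {file path: contents}.
REPOS = {
    "org/payments": {"app.py": "import oldlib\noldlib.run()\n"},
    "org/search": {"main.py": "print('hello')\n"},
    "org/web": {"server.py": "import oldlib\n"},
}


@dataclass(frozen=True)
class BatchSpec:
    """Declarative description of a bulk change (illustrative fields only)."""
    name: str
    match: str         # regex that selects the code to change
    replace_with: str  # replacement text for every match


def plan_changes(spec, repos):
    """Return {repo: {path: new_contents}} for every file the spec matches."""
    pattern = re.compile(spec.match)
    plan = {}
    for repo, files in repos.items():
        changed = {
            path: pattern.sub(spec.replace_with, text)
            for path, text in files.items()
            if pattern.search(text)
        }
        if changed:  # repositories without a match are left untouched
            plan[repo] = changed
    return plan


spec = BatchSpec(name="migrate-oldlib", match=r"\boldlib\b", replace_with="newlib")
plan = plan_changes(spec, REPOS)
for repo, files in sorted(plan.items()):
    print(repo, sorted(files))
```

In this sketch the spec only describes what should change; the search step decides where it applies, which is the property that makes migrations and refactors repeatable across many repositories.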
Code Insights
Code Insights is an analytics product that enables engineering teams to transform their codebase into a queryable database and create custom, visual dashboards directly within Sourcegraph. The goal is to enable more data-driven communications and decisions so that developers can track projects such as migrations and adoptions without the need to separately track them with manual processes, get insights on an outage, and inform which key workflows should be automated to reduce repetitive work. This product is particularly useful for engineering leadership, because it lets leaders easily track ownership and trends, set goals, and celebrate progress.
Cloud
Early in Sourcegraph’s history, the company offered a cloud service in which developers granted Sourcegraph access to their code to enable code search. However, it wasn’t initially a successful offering. Developers didn’t trust Sourcegraph with their code, a problem that was especially acute for the large companies that would benefit most from code search. Sourcegraph therefore changed course to provide a self-hosted product instead.
Fast forward eight years to September 2022. Having onboarded major customers like Uber and Plaid, Sourcegraph decided it was time to re-launch its cloud offering with a clear emphasis on security (SOC 2 Type II Compliant) and scalability, this time with greater success. Sourcegraph's product features are now the same whether used via its cloud offering or self-hosted.
Market
Customer
Sourcegraph’s ideal customer profile includes developers at large-scale engineering organizations with multiple engineering teams, repositories, and languages. Companies such as Uber, Lyft, Plaid, and Dropbox (all Sourcegraph customers) not only have extensive codebases but also benefit from tools such as batch changes for managing large-scale migrations and refactors.
While the market for developers alone is sizable, Sourcegraph has the potential to be valuable for more than just developers. Product managers, data scientists, designers, and even tech-inclined operations teams can get value from understanding their products’ code and dependencies on their own, which saves engineers time otherwise spent explaining information that could be self-discoverable.
Market Size
The global DevOps market was valued at $6.7 billion in 2020 and is projected to reach $57.9 billion by 2030, representing a 24.2% CAGR for the decade. If Sourcegraph were to expand into the broader software development tools market, its addressable market could grow. The market for software development tools is projected to be worth $733.5 billion by 2028, with the small and medium enterprise segment growing the fastest at an expected CAGR of 25.0% from 2021 to 2028.
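As a rough sanity check on the DevOps growth figure, the compound annual growth rate can be recomputed from the two endpoints cited above. This is a minimal sketch assuming a ten-year window from 2020 to 2030; the `cagr` helper is illustrative, and the original forecast may use a different base year or unrounded values, which would explain a small gap from the cited rate.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints from the text, in billions of USD; the 10-year window is an assumption.
devops_cagr = cagr(start_value=6.7, end_value=57.9, years=10)
print(f"Implied DevOps market CAGR: {devops_cagr:.1%}")
```

The endpoint-based result comes out at roughly 24% per year, in line with the cited figure.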
Competition
Notable competing products in the source code search space include Google Code Search, OpenGrok, Hound and Koders.
Google Code Search was launched in 2006 as an internal beta product for Google developers and allowed web users to search for open-source code on the internet. The site allowed the use of regular expressions in queries. Notably, the use of Google’s Code Search inspired Sourcegraph's co-founder and CTO Beyang Liu to bring a similar product to all developers. Google discontinued Code Search in 2012.
Oracle’s open-source product OpenGrok offers a source code search and cross-reference engine. OpenGrok is written in Java, and developers can use it to search full text, definitions, symbols, paths, and revision history. OpenGrok incrementally updates its index and supports search queries with Google-like syntax, allowing users to search for files modified within a date range and more. It also has a read-only interface for version control systems that provides a history log of a file, diffs between any two revisions, and a cumulative history of a given directory. OpenGrok was originally conceived by Chandan B.N. at Sun Microsystems in 2005.
Hound was created in 2015 by Kelly Norton and Jonathan Klein, developers at Etsy. While Norton and Klein had used code search tools before, their chief complaint was that they were too slow, too hard to configure, or required too much software to install. Hound enables source code search with a React frontend and Go backend and prides itself on its minimal requirements: only Go 1.13+ is needed to run it. The Go backend keeps an up-to-date index for each repository and answers searches through a minimal API.
Finally, Koders launched in 2000 as a search engine for source code and was acquired by Black Duck Software (now Synopsys) in 2008.
At face value, most of these competitors offer a similar core feature to Sourcegraph: namely, code search. Given the open-source nature of Google’s Code Search, Oracle’s OpenGrok, and Etsy’s Hound, it appears that none have capitalized on the opportunity to make code search a standalone business.
Business Model
Sourcegraph has three pricing tiers:
Free: A limited version of Sourcegraph is available for free for small teams, supporting only a single code host integration that must be self-hosted.
Business: Sourcegraph’s business plan costs $99 per active user per month, with an active user defined as any user who accesses Sourcegraph in a given month. The plan includes code search, batch changes, code insights, and cloud.
Enterprise: Sourcegraph’s enterprise plan offers self-hosted code hosts, Git-based code hosts, private code hosts, over 75GB of cloud code storage and four executors.
Traction
In 2021, Sourcegraph reached $10 million in revenue and over 800K users. As of 2023, over 1.8 million engineers use Sourcegraph. Its customers include companies such as Uber, Redfin, Plaid, Lyft, Databricks, and Reddit.
Valuation
In less than 1.5 years, Sourcegraph went from raising a $20 million Series A to having raised a total of $250 million in funding at a multi-billion-dollar valuation. In July 2021, Sourcegraph raised a $125 million Series D at a $2.6 billion valuation, representing a ~260x multiple on its 2021 revenue of $10 million. The round was led by Andreessen Horowitz with participation from Insight Partners, Geodesic Capital, and existing investors, reflecting the belief that developer productivity and satisfaction are critical to a company’s success. In December 2020, Sourcegraph raised a $50 million Series C led by Sequoia Capital. In March 2020, Sourcegraph raised $23 million from Craft Ventures.
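The revenue multiple quoted for the Series D follows directly from the valuation and the 2021 revenue figure reported in the Traction section. A minimal sketch of that arithmetic, with variable names chosen for illustration:

```python
# Figures from the text: Series D valuation and reported 2021 revenue, in USD.
series_d_valuation_usd = 2_600_000_000
revenue_2021_usd = 10_000_000

# Revenue multiple = valuation divided by annual revenue.
revenue_multiple = series_d_valuation_usd / revenue_2021_usd
print(f"Revenue multiple: {revenue_multiple:.0f}x")
```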
Key Opportunities
Growing Beyond Developers
While the ambition to bring universal code search to every team and developer is no small aspiration, there’s an opportunity to expand Sourcegraph’s user base beyond developers. By positioning Sourcegraph as a valuable tool for other functions such as product managers, data scientists, designers, and operations managers, Sourcegraph could expand its use cases. Sourcegraph allows these other functions to understand code dependencies in a self-serve manner.
Targeting Startups and Individual Coders
The benefits of universal code search, batch changes and code insights are clear for large companies where multiple repositories and disparate engineering teams present code-base-wide challenges. To expand beyond this, Sourcegraph could highlight its access to millions of public repositories as a learning tool for small developer teams and individual coders. Taking it one step further, Sourcegraph could provide perspective on best practices, similar to a product like Stack Overflow.
Key Risks
Security
Linking a developer’s or company’s entire codebase with Sourcegraph could create security concerns for customers. This was a core concern with Sourcegraph Cloud in 2013. Ultimately, that concern led the company to build a self-hosted product instead for the first nine years of its life. Today, Sourcegraph has implemented security best practices such as Access Monitoring, Business Continuity and Disaster Recovery Plans, Endpoint Security, and more, as outlined on its security page. However, a single security incident could undermine developers’ trust in Sourcegraph and set the company back significantly.
Macroeconomic Headwinds
Given the macroeconomic headwinds in 2023, companies have been looking to consolidate vendors. Sourcegraph could be among the services that companies decide to cut, since it costs roughly $1.2K per user per year (Sourcegraph’s business plan charges $99 per active user per month). This could be particularly likely at small and mid-sized companies that view their codebase as manageable without code search and batch changes.
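The per-user annual cost follows from the $99 monthly price on the business plan. The sketch below works through that arithmetic and scales it to a team; the 50-person team size is a hypothetical example, and the calculation assumes every user is active in every month, which overstates the bill if some users skip a month.

```python
MONTHLY_PRICE_PER_ACTIVE_USER_USD = 99  # business plan price from the text


def annual_cost_usd(active_users, months=12):
    """Yearly spend, assuming every user counts as active in every month."""
    return MONTHLY_PRICE_PER_ACTIVE_USER_USD * months * active_users


per_seat = annual_cost_usd(active_users=1)
team_of_50 = annual_cost_usd(active_users=50)  # hypothetical team size
print(f"Per seat: ${per_seat:,} per year")
print(f"50-person team: ${team_of_50:,} per year")
```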
Feature vs. Company
GitHub recently announced that it had reached 100 million active users, with many engineers and companies using GitHub to house their repositories. With so many engineers and companies already on GitHub, a native GitHub code search feature is a realistic possibility, and it could endanger Sourcegraph’s growth.
Summary
Developer productivity is becoming a high priority for growing companies. Managing batch changes across multiple repositories, such as handling migrations, is a common need for developers at companies of all sizes. Sourcegraph has found an opportunity to increase developer productivity through universal code search and batch changes, reaching $10 million in revenue in 2021 and serving engineering teams at companies like Uber and HashiCorp. Sourcegraph can strengthen its position by expanding its user base beyond developers to other functions and by becoming a valuable resource for all developers, not just those with multiple repositories.