Hive offers a suite of APIs that help companies to understand, search, moderate, and generate content. The company offers a portfolio of deep learning models built with training data sourced and annotated by a distributed workforce of contributors. These models classify content such as violence, pornography, and hate speech, thus automating content screening and detection. Hive's APIs enable use cases like contextual advertising, advertising measurement, document parsing, and more.

Founding Date

Jan 1, 2013


San Francisco, California

Total Funding

$121M


Series D




June 22, 2023


The internet has evolved into an enormous interconnected sea of information, offering a seemingly infinite amount of content at the click of a button. Over 328.8 million terabytes of data are created on the internet each day. The internet has facilitated global conversation and offered new avenues for people to share ideas, news, opinions, and experiences. Every day, billions of people interact with each other through various platforms, sharing text, photos, videos, and more. The volume of content continues to grow as internet adoption and device usage increase.

However, as the amount of content on the internet grows, so does the perceived importance of moderation to keep users safe. Content moderation is intended to let people use online platforms without fear of being exposed to harmful content, abuse, or harassment. It plays a role in maintaining the quality and safety of online spaces: it is used to filter out material seen as inappropriate or harmful and to ensure that user-generated content complies with platform guidelines and legal requirements. However, the complexity and volume of content uploaded daily make content moderation a challenging problem. Manual moderation is time-consuming and subject to human error. Additionally, monitoring content in real time is a near-impossible task. The need for solutions is large and growing; in 2022, the global content moderation solution industry was valued at $10 billion.

Founding Story

Hive was founded by Kevin Guo (CEO) and Dmitriy Karpman (CTO). Before founding Hive, Guo was an associate at Mithril Capital Management. He holds a bachelor's degree in mathematics and computational sciences and a master's degree in computer science from Stanford. Karpman earned his undergraduate degree from the University of Missouri-Columbia and his Ph.D. in computer science from Stanford. Karpman was a graduate research assistant at Stanford.

At Stanford prior to Hive, Guo and Karpman founded Kiwi, a Q&A-based social platform focusing on mobile. Starting out as a class project, Kiwi grew to have over 100 million users and billions of posts. Along the way, Kiwi encountered issues around content moderation and could not find AI models to help them, and the team began to build their own internal AI solution as a result. At the end of 2017, the team pivoted to focus on enterprise AI and Hive was founded, inspired by the models built internally for Kiwi.


Hive builds cloud-based AI models to help companies understand their content. Hive offers APIs for three categories: understand, search, and generate. These APIs are built on data collected through its distributed micro-task platform, Hive Micro. Hive also offers content moderation tools as well as solutions for advertising and sponsorships.


Hive offers a suite of AI models through its APIs that help companies understand, classify, and moderate their content. Hive’s visual, text, and audio moderation tools provide real-time screening of images, videos, GIFs, text, audio, and live streams, flagging content that may breach the client’s predetermined guidelines. Its demographic API identifies key demographic attributes in visual content, and its logo & logo location API can recognize thousands of logos and their common placements within videos and images.

Hive also offers content APIs designed to understand the context of content. Hive’s visual context API identifies common objects, settings, and events within visual content. Meanwhile, Hive’s optical character recognition (OCR) tool can extract text and emojis in over 15 languages from videos and images. Its speech-to-text API transcribes audio content across several languages, and its translation API provides real-time text translation. Hive also offers an AI-generated media API that enables clients to understand, identify, and moderate AI-created artwork, photos, and memes.

Hive's content moderation APIs are used by companies to automatically screen and detect media content, such as images and videos, identifying potentially harmful content on their platforms. These APIs utilize machine learning models to classify content into categories such as violence, pornography, and hate speech. Once the content has been classified, it can be automatically removed or flagged for further human review. It's important to note that, while Hive provides the tools for content moderation, it does not take a stance on moderation. The parameters of moderation are usually determined by the client using Hive's services.
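The classify-then-act flow described above can be sketched as a small client-side policy. This is a minimal illustration, not Hive's API: the class names, the score format (a dictionary of class name to confidence between 0 and 1), and the threshold values are all assumptions chosen for the example. It shows how a platform, rather than the model provider, would decide what gets removed and what goes to human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical per-class thresholds set by the platform, not by the model vendor."""
    remove_at: float  # score at or above which content is removed automatically
    review_at: float  # score at or above which content is queued for human review


# Assumed policy for illustration only; real thresholds depend on each platform's guidelines.
POLICY = {
    "violence": Rule(remove_at=0.95, review_at=0.70),
    "pornography": Rule(remove_at=0.90, review_at=0.60),
    "hate_speech": Rule(remove_at=0.97, review_at=0.75),
}


def decide(scores, policy=POLICY):
    """Return 'remove', 'review', or 'allow' for one item's class scores."""
    decision = "allow"
    for label, score in scores.items():
        rule = policy.get(label)
        if rule is None:
            continue  # classes without a platform rule are ignored
        if score >= rule.remove_at:
            return "remove"  # any single class over its removal threshold wins
        if score >= rule.review_at:
            decision = "review"
    return decision


if __name__ == "__main__":
    print(decide({"violence": 0.99, "pornography": 0.02}))
    print(decide({"violence": 0.80, "pornography": 0.10}))
    print(decide({"violence": 0.10}))
```

Keeping the thresholds in a separate policy table mirrors the point made above: the model only produces classifications, and the client decides how strictly to act on them.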


Hive’s search APIs help customers find specific content or variants of that content. Using its AI models, Hive can sift through web content and custom data sets, analyze visual similarities, and perform large-scale searches across image and video databases. Companies can use Hive’s search APIs to identify duplicate images across a platform, flag copyright-protected material, and enable site-wide text-to-image search.

Search APIs include Hive’s web search API and custom search API, which identify duplicates and visually similar images within a platform or a customized media library. For copyright protection, Hive’s copyright search API can flag copyright-protected material to enforce intellectual property rights, and its NFT search API verifies the uniqueness of an NFT asset against millions of assets on major blockchains. Lastly, Hive’s contextual search API can perform natural language searches on large image sets, enabling site-wide text-to-image search capabilities.
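One common way to implement duplicate and visual-similarity search in general is to compare embedding vectors with cosine similarity. The sketch below illustrates that generic technique under stated assumptions; the source does not describe how Hive's search APIs work internally, and the item ids, toy three-dimensional vectors, and the 0.98 threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def near_duplicates(query, library, threshold=0.98):
    """Return ids of library items at least `threshold` similar to the query, most similar first."""
    scored = [(cosine(query, vector), item_id) for item_id, vector in library.items()]
    scored.sort(reverse=True)
    return [item_id for score, item_id in scored if score >= threshold]


# Toy embeddings standing in for the output of an image model (assumed, for illustration).
LIBRARY = {
    "img_a": [0.90, 0.10, 0.00],
    "img_b": [0.89, 0.11, 0.01],  # a lightly edited copy of img_a
    "img_c": [0.00, 0.20, 0.95],  # an unrelated image
}

if __name__ == "__main__":
    print(near_duplicates([0.90, 0.10, 0.00], LIBRARY))
```

The same comparison extends to text-to-image search if the text query and the images are embedded into a shared vector space, which is again an assumption about the general approach rather than a description of Hive's models.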



Through its generate APIs, Hive offers deep learning models that generate images, videos, and text from text prompts. Its image and video generation APIs produce images and videos from text prompts. These models are trained on a vast quantity of images and videos and produce photorealistic output, with moderation applied to prevent the creation of explicit or violent media. Hive also offers an image captioning API, which provides detailed image descriptions, and a text generation API, which can produce human-like text. These APIs can be used in a variety of ways, from marketing to customer service.


Data Labeling

Hive offers a fully managed data collection and annotation service for companies that need large volumes of labeled data. The service covers video and image annotation, for labeling and categorizing visual data; text and document annotation, for extracting structured text from documents and other textual data; audio annotation, for labeling audio clips; 3D point cloud annotation, for labeling lidar and other 3D data; and data sourcing, which provides on-demand datasets built for client needs.


Hive provides a suite of end-to-end software applications built to complement its APIs.

Moderation Dashboard: The moderation dashboard is built on top of Hive’s content moderation APIs. It helps companies manage and moderate content on their platforms, and lets them set up custom rules and actions that implement their content moderation policies without processing Hive’s model responses directly.


AutoML: Hive’s AutoML platform allows customers to train, evaluate, and deeply customize machine learning models. By uploading a dataset of images, videos, or other media, customers are guided through creating a fully functioning model that is accessible through an API and served by Hive. Models built with AutoML can have up to 20 classification categories.


Sponsorship Intelligence: Hive offers Mensio, which delivers comprehensive, near real-time analytics on sponsorships and branded content. Mensio allows users to access and analyze data from more than 8K brands across programming on national TV, regional sports networks, as well as digital and social channels. It allows customers to explore brand and team-centric modules, customize dashboards and reports, and extract data with one-click exports. Additionally, it provides access to brand exposure data across all TV programming. Near-real-time processing is available for priority reports.


Ad Intelligence: Hive’s ad intelligence offering provides next-day, cross-platform data for a comprehensive view and measurement of ad spend. It delivers insights into a company's advertising activity and expenditure across multiple platforms, including social media, TV, and other digital channels.


Context-Based Ad Targeting: Hive allows publishers and platforms to monetize their ad inventory. By leveraging its AI models, Hive can generate contextually-relevant labels from video and audio content. These labels aim to assist in delivering advertisements that align with the surrounding content while enhancing the user experience and improving ad effectiveness.
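To make the idea of matching ads to content labels concrete, here is a minimal sketch that ranks campaigns by how much their target labels overlap with the labels generated for a piece of content. The campaign names, label vocabulary, and the use of Jaccard overlap as the score are assumptions for illustration; the source does not say how Hive or its customers perform this matching.

```python
def rank_ads(content_labels, campaigns):
    """Rank campaign ids by Jaccard overlap with the content's labels, dropping non-matches."""
    content = set(content_labels)
    ranked = []
    for campaign_id, targets in campaigns.items():
        target_set = set(targets)
        union = content | target_set
        score = len(content & target_set) / len(union) if union else 0.0
        if score > 0:
            ranked.append((score, campaign_id))
    # Highest overlap first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [campaign_id for _, campaign_id in ranked]


# Hypothetical campaigns and the content labels each one wants to appear next to.
CAMPAIGNS = {
    "running_shoes": ["sports", "outdoors", "fitness"],
    "cookware": ["cooking", "kitchen"],
    "energy_drink": ["sports", "gaming"],
}

if __name__ == "__main__":
    # Labels a model might generate for a video of an outdoor football match (assumed).
    print(rank_ads(["sports", "stadium", "outdoors"], CAMPAIGNS))
```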




Hive serves customers across diverse use cases. It helps ecommerce platforms and marketplaces screen and verify listings and keep user interactions positive. It helps dating platforms flag inappropriate profile content, maintain age requirements, and identify bots. It helps social and gaming platforms moderate user content and interactions, helps brands measure cross-platform advertising, and helps publishers monetize ad and sponsorship inventory. Notable Hive customers include Giphy, BeReal, Reddit, Walmart, and Comscore.

Market Size

Hive operates in multiple markets. While its primary focus is on content moderation tools, it also offers solutions for advertisers in the media industry.

In 2022, the global content moderation solution industry was worth $10 billion. The industry is projected to reach $26 billion over roughly the following decade, growing at a CAGR of 10.25%. Demand for content moderation is increasing as the internet becomes more integral to people’s daily lives and content volume continues to rise as a result. In addition, there are growing societal and regulatory pressures to combat harmful content, such as hate speech, violence, and explicit material, driving platforms to invest more heavily in effective content moderation solutions.
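As a rough consistency check on these market figures, the sketch below works out how long a $10 billion base takes to reach $26 billion at a 10.25% compound annual growth rate. It assumes simple annual compounding from the 2022 figure; the function names are illustrative and the result is arithmetic on the quoted numbers, not an independent forecast.

```python
import math


def years_to_reach(start, target, cagr):
    """Years of annual compounding at `cagr` needed to grow `start` into `target`."""
    return math.log(target / start) / math.log(1 + cagr)


def project(start, cagr, years):
    """Value of `start` after `years` of annual compounding at `cagr`."""
    return start * (1 + cagr) ** years


if __name__ == "__main__":
    # Figures in billions of dollars, taken from the paragraph above.
    print(round(years_to_reach(10, 26, 0.1025), 1))  # a little under ten years
    print(round(project(10, 0.1025, 10), 1))         # value after ten full years
```

Under these assumptions, the $26 billion level corresponds to a horizon of about ten years from the 2022 base.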

In 2021, AI in the media & entertainment market was valued at $10.8 billion. Fueled by technological development in AI and increasing adoption amongst advertising companies, this industry is projected to grow at a CAGR of 26.9% from 2022-2030.


Clarifai: Founded in 2013 in New York City, Clarifai is an AI company specializing in computer vision, natural language processing, and automatic speech recognition. It offers APIs for visual recognition, language understanding, and audio transcription. These APIs provide machine-learning capabilities to apps, websites, and devices. Like Hive, Clarifai’s AI models can be used for content moderation, brand safety, visual search, and more. Its solutions are used in a range of industries, including retail, media, advertising, and defense. As of May 2023, Clarifai had raised a total of $100 million and was at the Series C stage.

Google Cloud: Google Cloud, a Google division launched in 2008, offers a wide range of AI products that compete with Hive. Google’s suite of AutoML products offers solutions in image analysis and text analysis, which are areas that Hive also targets. Furthermore, Google Cloud's set of APIs for AI services offers solutions for classifying images, understanding and analyzing text, converting audio to text, and providing real-time translation between languages, all areas where Hive also has offerings. In 2022, Google reported total revenue of $26.3 billion for its Google Cloud division, and in Q1 of 2023, it announced that Google Cloud turned a profit for the first time.

Amazon Web Services: Launched in 2006, Amazon Web Services (AWS), Amazon's cloud computing platform, provides a comprehensive suite of machine learning and AI services to individuals, companies, and governments. Its offerings overlap with Hive's in the areas of natural language understanding, speech-to-text transcription, and image and video analysis. AWS is a generalist cloud provider that offers a broad range of services beyond AI and machine learning, whereas Hive takes a much more focused approach within AI. In 2022, AWS had revenue of over $80 billion and operating income of $22.8 billion.

Microsoft Azure: Announced in 2008, Microsoft Azure, Microsoft's cloud computing platform, also offers a suite of AI and machine learning services. Azure's offerings overlap with Hive's in areas such as image and video analysis, natural language processing, and speech recognition. Like AWS and Google Cloud, Azure also provides a more general machine learning service: a cloud-based environment that allows developers to train, deploy, automate, manage, and track ML models. While Hive focuses primarily on image and video recognition and data labeling services, Azure's AI capabilities are more diverse, extending into areas like predictive analytics, anomaly detection, and machine learning model management. In 2022, Azure was the fastest-growing segment of Microsoft.

Business Model

Hive generates revenue by selling access to its APIs to companies that want to understand, analyze, moderate, and generate content. Hive has little publicly available information on pricing and likely has customized contracts for each client.

Companies that buy access to Hive's APIs can integrate Hive’s pre-trained models into their existing products to automate tasks that were previously done manually or could not scale, such as interpreting image, video, text, and audio data.

Central to Hive’s business is its data labeling process. Through its Hive Micro (formerly Hive Data) platform, Hive has built a community of over 5 million registered individuals around the globe that source, label, and annotate data, classify images, and transcribe audio for small amounts of income. Once accurately evaluated and labeled, the data become resources to train and refine Hive’s AI models. Hive’s portfolio of models has been trained on over 1 billion pieces of human-annotated data, more than any publicly available data set, which has led to what Hive claims is unparalleled accuracy.
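A standard way to turn labels from many independent contributors into training data is to collect several annotations per item and keep only those with sufficient agreement. The sketch below illustrates that generic majority-vote approach; it is an assumption for illustration, since the source does not describe Hive Micro's actual quality-control process, and the 60% agreement threshold is arbitrary.

```python
from collections import Counter


def consensus(labels, min_agreement=0.6):
    """Return the most common label if enough annotators agree on it, otherwise None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_agreement else None


def build_training_set(annotations, min_agreement=0.6):
    """Keep only items whose annotations reach consensus; map item id to the agreed label."""
    accepted = {}
    for item_id, labels in annotations.items():
        agreed = consensus(labels, min_agreement)
        if agreed is not None:
            accepted[item_id] = agreed
    return accepted


if __name__ == "__main__":
    # Hypothetical annotations from three contributors per item.
    annotations = {
        "item_1": ["safe", "safe", "safe"],
        "item_2": ["violence", "violence", "safe"],
        "item_3": ["safe", "violence", "hate_speech"],  # no agreement, so it is dropped
    }
    print(build_training_set(annotations))
```

Items without consensus would typically be sent for additional annotations rather than discarded, which is one reason data quality depends on keeping a large and active contributor base.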


In April 2021, Hive announced that it had 100 enterprise customers, including NBCUniversal, Interpublic Group, Walmart, Visa, Anheuser-Busch InBev, and more. It also said it had grown its customer base and revenue by over 300% in the prior year. As of June 2023, Hive had more than 5 million registered workers on its labeling platform. This distributed workforce labels over 10 million items daily. Hive’s notable customers as of 2023 include Reddit, Walmart, BeReal, Visa, Vevo, Quora, Disney, Zynga, and more.



In April 2021, Hive announced a $50 million Series D that valued the company at $2 billion. Since its inception, Hive has raised $120 million in total funding. Its investors range from venture capital firms like General Catalyst to Bain & Company and Visa Ventures. Other notable investors include 8VC, Glynn Capital Management, Founders Fund, Tomales Bay Capital, and Jericho Capital.


Key Opportunities

Increasing Demand for AI Tools

As platforms like social media, video sharing, e-commerce, and others continue to grow and become more prevalent, the volume of content to be moderated will increase, and so will the demand for Hive's content APIs and moderation tools. On the advertising front, AI is changing the industry by making ads more targeted and personalized. Hive's contextual advertising models, which understand the context of a page to serve relevant ads, and its advertising measurement models, which assess the performance of ad campaigns, can offer advertisers an edge in the highly competitive digital advertising space. As more businesses shift their advertising budgets to digital platforms and seek to maximize their ROI, Hive's AI solutions for advertising could see increased adoption.


As of 2021, most of the language learning in Hive’s AI models was based on English and other widely spoken languages, such as Spanish and French. However, there is a need for AI services that can process and understand a wider array of languages. Developing machine learning models that can accurately understand, analyze, and generate content in additional languages would make Hive's services more accessible and useful to a wider customer base. This could also lead to a broader set of use cases for the data and technology that Hive has built up.

Key Risks

Data Risk

Hive’s AI models are built using training data sourced and annotated by a distributed workforce. Data quality can significantly influence the accuracy and effectiveness of the AI models. Any issue that threatens this data acquisition and labeling process, such as difficulty maintaining a large and active community of contributors or problems with the quality of the data those contributors provide, could threaten Hive’s ability to maintain and improve its AI solutions.


Hive is an AI company that builds cloud-based AI models to help companies understand their content. Hive offers APIs across three categories (understand, search, and generate), built on data collected through its distributed micro-task platform. It also offers content moderation tools as well as solutions for advertising and sponsorships.


Carter Wang


© 2024 Contrary Research · All rights reserved
