Report: AI21 Labs Business Breakdown & Founding Story

Thesis

Enterprises are accelerating their adoption of AI. According to McKinsey’s 2025 global survey, 92% of companies plan to increase their AI investments over the next three years. Yet, only 1% of company leaders consider their organizations “mature” in AI integration.

Despite growing demand, many enterprises remain cautious due to persistent pain points: large language models frequently hallucinate, and low-accuracy outputs pose significant risks in high-stakes industries such as finance, healthcare, and law. According to a Stanford 2024 study, in the legal field, hallucination rates range from 69% to 88% and model performance further deteriorates when handling complex tasks. Another 2024 study showed that GPT-3.5 and GPT-4 still hallucinate at rates of 40% and 29% respectively. Achieving hallucination rates below 5%, critical for high-stakes industries, remains challenging and often comes with trade-offs with generalization and efficiency. These concerns have made reliability, transparency, and control top priorities for enterprise buyers.

While general-purpose players like OpenAI and Anthropic have taken steps to address these issues, AI21 Labs is pursuing a more focused approach. The Israeli company primarily targets enterprise customers, offering open-weight foundational models optimized for long-context processing and low latency. Maestro, its agentic orchestration system launched in March 2025, emphasizes reliability, traceability, and workflow observability. Rather than competing for broad adoption, AI21 aims to serve a narrow, high-reliability segment by positioning itself as an accurate and trustworthy provider.

Founding Story

AI21 Labs was founded in 2017 in Israel by three entrepreneurs and academics: Amnon Shashua (Chairman & Co-Founder), Yoav Shoham (Co-CEO & Co-Founder), and Ori Goshen (Co-CEO & Co-Founder).

As of August 2025, Shashua is a computer science professor at the Hebrew University of Jerusalem, having published over 160 papers and filed more than 94 patents in the field of machine learning. Before founding AI21, he co-founded Mobileye, an autonomous driving company whose technology is deployed in over 125 million vehicles worldwide. In 2014, Mobileye held the largest-ever US IPO by an Israeli company, a record that still stood as of August 2025. Then, in 2017, it broke another record for the largest acquisition of an Israeli company when Intel acquired it for $15.3 billion. A serial founder, Shashua also created the assistive technology company OrCam and the digital bank OneZero, which is also a customer of AI21 Labs. As of August 2025, he is working on both AI21 Labs and the AI robotics company Mentee Robotics.

Similarly, Shoham is a professor emeritus of computer science at Stanford as of August 2025 and is a former principal scientist at Google. He became interested in AI after attending Yale, where he graduated with a PhD in computer science. As a professor at Stanford, he authored papers and textbooks on agent-oriented programming, game theory, and multi-agent systems. Like Shashua, Shoham has also founded several AI companies, including TradingDynamics (acquired by Ariba), Timeful (acquired by Google), and Katango (acquired by Google).

Goshen, much like Shashua and Shoham, is also a seasoned entrepreneur. Growing up as the youngest in a family of electrical engineers, he started programming at the age of four. At 13 years old, he spent a summer fixing computers, and by 16, he started his own business building websites. When he was drafted into the Israel Defense Forces, he joined its 8200 tech unit and began programming professionally. In 2007, he joined Fring, a mobile communication startup that later amassed 40 million users. In 2009, he founded the telecommunications analytics firm Crowdx, which served customers like AT&T and Verizon before eventually being acquired.

After Crowdx’s acquisition, Goshen began considering his next move. He was introduced to Shoham through his wife, who led a programming bootcamp with Shoham. Shortly after, Shashua joined them, and the three created AI21 Labs. Goshen said,

“We each have specific focus areas. Yoav and Amnon are scientists and I’m not, so they are more deeply engaged in the scientific part of the business. But we are all equally involved in the company. It’s amazing to learn from their experience and their understanding of how to build great technology companies.”

When AI21 was first founded in 2017, many AI scholars were focused on computer vision and deep learning. But for AI21’s three cofounders, large language models were the future. Goshen still believes in 2025 that “deep learning is necessary but not sufficient.” AI21 became one of the first companies to train large language models, and it emerged out of stealth in October 2020 to release its first product, Wordtune, a writing companion tool.

Product

AI21 Labs is an applied research lab and generative AI company with two main product lines: Wordtune for consumers (B2C) and foundational large language models for enterprise clients (B2B).

Wordtune

Created in 2020, Wordtune is a freemium AI writing editor. Its algorithm produces context-based rewrites and expansions according to the user’s style. The application, which supports 10 different languages, has provided its 10 million users worldwide with more than 782 million rewrite suggestions as of August 2025. Its companion tool, Wordtune Read, uses AI to help summarize documents.

According to Shoham, however, Wordtune was never meant to be AI21 Labs’ core business. Though Wordtune still exists on the market as of August 2025, the company transitioned its focus to enterprise applications in 2021.

Source: Wordtune

Foundational Models for Enterprise

AI21’s second product line features foundational AI models designed for enterprises rather than individual consumers. The company builds, trains, and delivers customized large language models (LLMs) that can accurately answer questions, chat with users, generate text, and summarize lengthy documents. Its models are known for their low cost, long context windows, fast inference speed, and open weights.

While most pure transformer models, such as GPT-4, Claude 4, and Gemini, are closed-weight, AI21 hopes that its open-weight models will build customer trust and enable users to experiment and push the technology’s boundaries.

As of August 2025, the company has four major AI model families and products: the outdated Jurassic family, the new Jamba Series, AI21 Studio, and the Maestro agentic planning system.

Jamba Series: Mamba Transformer MoE Model

In 2021, AI21 Labs introduced its first foundational model, Jurassic-1. The model was one of the largest LLMs in the market at the time, with capabilities that surpassed OpenAI’s GPT-3 DaVinci. In March 2024, AI21 improved upon Jurassic’s designs and released its Jamba family of models.

The Jamba models can process large amounts of information due to their unique Transformer-Mamba Mixture of Experts (MoE) architecture. While the transformer, the standard LLM architecture used at most companies, including OpenAI, Anthropic, and Meta, maximizes parallel processing and expressivity, it is computationally expensive to scale to long inputs. The Mamba model (introduced in a December 2023 paper) addresses the transformer’s limitations by optimizing for larger inputs. By pairing the two, AI21’s Jamba series is one of the few open-weight models besides Mistral to combine both the Mamba and Transformer architectures.

Source: Silicon Angle, AI21 Labs
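The scaling trade-off described above can be illustrated with a rough count. The sketch below assumes the standard characterization that full self-attention compares every pair of tokens (quadratic in input length), while a Mamba-style state-space layer makes a single pass over the sequence (linear). The function names are illustrative only, and the counts ignore constants, layer counts, and the actual mix of layers inside Jamba.

```python
def attention_pairs(num_tokens: int) -> int:
    """Token pairs a full self-attention layer compares: n * n."""
    return num_tokens * num_tokens


def scan_steps(num_tokens: int) -> int:
    """State updates in a Mamba-style sequential scan: one per token."""
    return num_tokens


def growth(cost_fn, short_len: int, long_len: int) -> float:
    """How many times more work the longer input needs under cost_fn."""
    return cost_fn(long_len) / cost_fn(short_len)


if __name__ == "__main__":
    short_len, long_len = 128_000, 256_000  # doubling the input length
    print("attention work grows by:", growth(attention_pairs, short_len, long_len))
    print("scan work grows by:", growth(scan_steps, short_len, long_len))
```

Under these assumptions, doubling the input quadruples the attention work but only doubles the scan work, which is the intuition behind mixing the two layer types for long documents.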

The Jamba 1.5 and 1.6 models are 10 times more memory efficient than Llama-3 or Mistral. They also boast 30% lower compute requirements, a lower memory footprint, and the longest context length among open-weight models (256K tokens, compared to 32K-128K for Llama-3 and Mistral).

Jamba also outperforms Mistral, Llama, and Cohere on quality standards, Retrieval Augmented Generation (RAG), and long-context question answering. On standard LLM benchmarks, it matches or exceeds pure transformer models and is faster than popular open and closed-weight models.

Source: AI21

Jamba Large and Mini outperform Mistral Large 2 and Llama 3.3 on general benchmarks (Arena Hard), RAG benchmarks (CRAG, Helmet CRAG), and long-context QA (FinanceBench, HELMET LongQA, LongBench).

Source: AI21

Jamba 1.6 models also outperform Mistral and Llama on quality (based on Arena Hard scores) and speed (output tokens per second).

Source: Silicon Angle, AI21

In practice, Jamba’s long-context window means that it can process up to 800 pages of text in one inference call, which is similar in length to an individual company’s 10-K filings, 32 earnings call transcripts, or 25 one-hour clinical interviews.
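As a back-of-the-envelope check on these equivalences, the sketch below divides the 256K-token window quoted earlier by each figure. It assumes "256K" means 256,000 tokens and that the whole window is available for documents (in practice some tokens are needed for the prompt and the output). The helper names are hypothetical.

```python
CONTEXT_TOKENS = 256_000  # assumption: "256K" read as 256,000 tokens


def tokens_per_unit(units: int, context_tokens: int = CONTEXT_TOKENS) -> float:
    """Average token budget per page, transcript, or interview."""
    return context_tokens / units


def fits_in_one_call(doc_token_counts, context_tokens: int = CONTEXT_TOKENS) -> bool:
    """True if the documents together fit in a single context window."""
    return sum(doc_token_counts) <= context_tokens


if __name__ == "__main__":
    print("tokens per page (800 pages):", tokens_per_unit(800))
    print("tokens per transcript (32 transcripts):", tokens_per_unit(32))
    print("tokens per interview (25 interviews):", tokens_per_unit(25))
    print("33 transcripts of 8,000 tokens fit:", fits_in_one_call([8_000] * 33))
```

Read this way, the comparison implies roughly 320 tokens per page, 8,000 tokens per earnings call transcript, and about 10,000 tokens per one-hour interview.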

This capability allows companies to analyze large documents like contracts, financial reports, customer histories, and internal wikis. Jamba can also help legal teams review M&A documents and support financial analysts with risk reports, regulations, and earnings calls. As of August 2025, the newest model, Jamba 1.7, is available on HuggingFace.

Maestro: Agentic Planning and Orchestration

In addition to its Jamba series, AI21 Labs also recently debuted Maestro, its AI planning and orchestration system, which Goshen described as an “AI for AI Systems” tool. The goal of Maestro is to reduce unreliable outputs and stop “prompting and praying” that models produce trustworthy results.

In a July 2025 article in the MIT Technology Review, Shoham wrote,

“In enterprise settings, this kind of mistake could create immense damage. We need to stop treating LLMs as standalone products and start building complete systems around them — systems that account for uncertainty, monitor outputs, manage costs, and layer in guardrails for safety and accuracy.”

Maestro wraps LLMs in a structured architecture that uses company data, public information, and external tools to boost accuracy. The architecture is designed to have a human in the loop who reviews plans and provides feedback. Users specify a set of instructions and requirements (e.g., content, style, format, genre, computing power budget, tools) in natural language, and Maestro breaks the task into steps, validates each step against the requirements, and adjusts its actions based on available compute. The final output includes the result, a score measuring how well the result matches each requirement, and information on any requirements that are not met.
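The validate-and-retry pattern described above can be sketched in a few lines. This is not Maestro's API or implementation: every name below (`Requirement`, `Report`, `run_with_budget`, `draft`) is hypothetical, the "model" is a stub that returns canned drafts, and the attempt limit stands in for a compute budget. It only illustrates the shape of an output that carries a result, per-requirement scores, and a list of unmet requirements.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Requirement:
    name: str
    score: Callable[[str], float]  # returns a value in [0, 1]


@dataclass
class Report:
    result: str
    scores: Dict[str, float]
    unmet: List[str]
    attempts: int


def validate(result: str, requirements: List[Requirement],
             attempts: int, threshold: float = 1.0) -> Report:
    """Score a result against every requirement and list the ones it misses."""
    scores = {req.name: req.score(result) for req in requirements}
    unmet = [name for name, value in scores.items() if value < threshold]
    return Report(result, scores, unmet, attempts)


def run_with_budget(generate: Callable[[int], str],
                    requirements: List[Requirement],
                    max_attempts: int = 3) -> Report:
    """Regenerate until all requirements pass or the attempt budget runs out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        report = validate(generate(attempt), requirements, attempt)
        if not report.unmet:
            break
    return report


def draft(attempt: int) -> str:
    """Hypothetical stand-in for a model call; the second draft is better."""
    if attempt == 1:
        return "The quarter went well across most regions and product lines overall."
    return "Revenue grew 12% year over year."


REQUIREMENTS = [
    Requirement("mentions revenue",
                lambda text: 1.0 if "revenue" in text.lower() else 0.0),
    Requirement("at most 8 words",
                lambda text: 1.0 if len(text.split()) <= 8 else 0.0),
]

if __name__ == "__main__":
    report = run_with_budget(draft, REQUIREMENTS)
    print(report.result, report.scores, report.unmet, report.attempts)
```

With a budget of one attempt, the same call returns the first draft along with both requirements listed as unmet, which mirrors the idea of reporting failures instead of hiding them.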

The Maestro agent provides high visibility and control and can handle workflow automation such as user onboarding, enterprise deep research, and complex document analysis.

Financial firms can use Maestro to automate compliance monitoring, loan application processing, and earnings call summaries.
Tech companies can use it to improve RFI response automation, customer support assistants, documentation, and question answering.
Hospitals can manage their intake workflows, medical history data, consent forms, and insurance data.
Legal firms can deploy it for M&A due diligence, vendor risk assessment, and eDiscovery.
Manufacturers can use it for equipment troubleshooting and legacy system data migration.

While AI21 also builds its own foundational models, Maestro is model-agnostic, meaning users can choose which model they want to deploy. Based on its internal evaluations, AI21 claims that Maestro improves the accuracy of GPT-4o and Claude Sonnet 3.5 by up to 50% and enables reasoning models such as o3-mini to reach 95% accuracy.

As of August 2025, all enterprise AI21 customers have access to Maestro. In March 2025, the company also hinted at the launch of a new expressive language (a subset of Python) for describing plans and increasing agent controllability.

Source: AI21

Maestro improves upon GPT-4o, Claude Sonnet 3.5, and o3 mini on benchmarks including IFEval and FRAMES.

Source: AI21

Market

Customer

While AI21’s initial product, Wordtune, targeted individual consumers, the company’s core foundational models now cater to enterprise customers. Specifically, AI21’s ideal customers are companies in industries with low tolerance for errors that have some prior experience with AI and are hoping to optimize performance. These customers usually run average-sized production workloads and tend to prioritize latency and accuracy over basic adoption. Shoham said,

“We’re talking about almost zero errors. At the organizations using Jamba, we’ve come across almost nothing bizarre… For example, we work with One Zero, a digital bank. They have an AI-based customer service system and want to provide reliable automated responses to customers. Think of questions like, ‘What is my checking account balance?’ or ‘What is the best investment option for me?’”

The company says it serves many Fortune 100 companies and its customers span multiple countries and industries including software (Capgemini, Lightricks, Jenni, Tweet Hunter), entertainment (Ubisoft, Latitude), data analytics (Clarivate), fintech (One Zero, Rapyd, Tipalti), nonprofit (Harambee Youth Employment Accelerator), edtech (Educa Edtech), hospitality (Easyway), HR (Glassdoor), and retail (Fnac).

The French retailer Auchan deployed AI21’s systems to generate product descriptions from images. The multinational retail chain Fnac used Jamba 1.6 for inventory data classification and saw a 26% improvement in output quality and a 40% decrease in latency. A digital banking company found that the Jamba Mini 1.6 model scored 21% higher on precision than its predecessors, matching OpenAI’s GPT-4o.

Many customers also mix and match models. An executive of Latitude, an AI-generated game company, said that they were originally among OpenAI’s first customers but weren’t thrilled about the product due to misaligned content. Now, they use AI21 Labs for most of their premium models and have been largely satisfied with its performance, citing the ability to fine-tune specialized models and responsive customer service. He said,

“I think AI21’s models are pretty good... They’ve made improvements over time that helped them reduce the cost and increase performance and lots of things like that… AI21 has been a great partner to us.”

However, he isn’t convinced that AI21 has a defensible moat. He says that its models are still “lagging behind OpenAI” and that “ChatGPT is winning by a wide margin in the preference data.” For them, OpenAI is also “significantly cheaper.”

A director at Samsung said that they were considering AI21, Cohere, and OpenAI and found AI21 to be relatively easy to implement and flexible for customization. After testing AI21’s models for e-commerce search, automated metadata extraction, and classification for TV and media content, the director said they were impressed by its adaptability. He said,

“Besides Hugging Face, we have been looking at AI21. We have certain models that are of interest to us because they provide the ability in the form of accessing those models in the form of APIs. So we can just extend our ecosystem to basically have some of these API calls to obtain the results that we really require.”

Market Size

The TAM for generative AI foundational models and platforms was estimated to be around $11 billion in 2024 by IoT Analytics. By 2030, the TAM is forecasted to reach $53.4 billion, reflecting a CAGR of 40.7% from 2023 to 2030. The enterprise AI market was valued at $97.2 billion in 2025 and is forecast to reach between $155.2 billion and $229.3 billion in 2030, representing an 18.9% CAGR for the most optimistic projections.
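The growth rates quoted above can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below applies it to the dollar figures in this section; the function name is illustrative. Note that the 40.7% figure is stated from a 2023 base that is not given here, so the 2024 to 2030 rate computed from $11 billion will differ from it.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Enterprise AI market, 2025 -> 2030 (values in $ billions)
    print(f"low projection:  {cagr(97.2, 155.2, 5):.1%}")
    print(f"high projection: {cagr(97.2, 229.3, 5):.1%}")
    # Generative AI foundational models and platforms, 2024 -> 2030
    print(f"genai 2024-2030: {cagr(11.0, 53.4, 6):.1%}")
```

Run on these inputs, the high enterprise projection comes out just under 19% per year, close to the 18.9% cited, while the low projection implies roughly half that rate.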

As of August 2025, AI is a core part of enterprise strategy, with 41% of companies actively researching AI, and 15% deploying it. For enterprises utilizing AI tools, 65% of them will rely on external vendors such as AI21 for collaboration. These companies are increasingly budgeting more for AI models and apps.

According to a McKinsey 2025 report, over the next three years, 92% of companies plan to increase their AI investments. However, only 1% of company leaders say that their companies are “mature” in integrating and deploying AI. Ramp’s AI Index, which builds its estimates from credit card bills, reports that 42% of US businesses currently have a paid subscription to AI models, platforms, and tools.

Source: Ramp

AI21 sits at the intersection of generative AI and enterprise AI use cases. While the market share for AI21 is limited as of August 2025, estimated to be around 0.01% by 6sense, AI21 Labs cofounder Goshen said that in 2025, many companies haven’t even scratched the surface of AI adoption. Goshen said,

“You go to a typical enterprise, and you see that they have tens of use cases in production — a good enterprise. But there are thousands in the backlog. The main reason is reliability.”

Competition

Competitive Landscape

While AI21 sells directly to businesses, many of its biggest competitors, like OpenAI, Anthropic, Meta, and Google, serve both individual consumers and enterprises. As of August 2025, AI21 lags these companies in general adoption.

According to a16z’s 2025 AI enterprise report, OpenAI has maintained overall market share leadership, while Google and Anthropic made considerable strides over the last year. GPT-4o is the model most deployed to production, and Gemini models have the best-in-class context windows. Anthropic has seen the most adoption in tech-forward companies, such as software firms and startups. At the same time, open source models like Llama and Mistral are adopted at higher rates at larger enterprises due to their on-premise solutions, data security and compliance emphasis, and fine-tuning ability.

Source: a16z

A Reddit user tested out Claude Console, OpenAI Playground, and AI21 Studio in 2025 and found that Jamba Mini 1.6 was especially stable across long inputs and offered built-in support. Another user said they used both Jamba and Claude, with Jamba handling most of the structured tasks while Claude provided softer and more nuanced reasoning.

A director at Samsung also said that AI21 was on par with Cohere in terms of customization and sometimes outperformed Cohere in Samsung’s NLP tests. However, Cohere was better for non-text and mixed data.

As of August 2025, AI21’s Jamba models rank #107 and #141 on LM Arena, an external benchmark website, while Google’s Gemini 2.5 Pro and OpenAI’s o3 rank first and second. Even for longer queries, which AI21 claims to excel at, its models did not make the top ten on the leaderboard. In response to Jamba lagging competitors in AI model rankings, AI21 Labs co-founder Shoham said in March 2025,

“We work with enterprises, and they don’t pay much attention to these rankings. We didn’t invest in an AI chatbot for the general public, which would have helped strengthen our branding, the way those companies did.”

Competitors

Foundational Models for Enterprises and Individual Consumers:

OpenAI: Founded in 2015 and based in San Francisco, OpenAI has raised $58 billion and is valued at $300 billion as of August 2025. It is best known for its AI models, including o1 and GPT-4 for text. OpenAI has attracted investments from Microsoft, a16z, Sequoia, and Tiger Global. It operates under a capped-profit model under its partnership with Microsoft and other investors. OpenAI announced the release of a new open-weight LLM in July 2025, marking its first open model release since GPT-2 in 2019. On July 17, 2025, it rolled out its new ChatGPT agent, an AI assistant capable of executing multi-step tasks through a virtual machine. The company serves both individual customers and enterprises, including Boston Consulting Group, Bain & Company, Estee Lauder, Moderna, Amgen, Gates Foundation, JetBlue, Lowe’s, Notion, and Booking.com. OpenAI Enterprise integrates into internal sources like Google Drive, SharePoint, Dropbox, and GitHub for more personalized answers. It also includes data analysis tools, collaborative canvases, project management, and advanced reasoning agents.

Anthropic: Founded in 2021 and headquartered in San Francisco, Anthropic has raised $18.2 billion in total funding with a valuation of $61.5 billion as of August 2025. Major investors include Amazon, Google, SAP, and Salesforce. The company’s primary model, Claude 4, launched in May 2025, includes Opus 4 and Sonnet 4 and supports hybrid reasoning, tool use, memory, and long workflows. Its Sonnet models are particularly effective for coding. The company brands itself as a leader in AI alignment and safety. While OpenAI has a meaningful consumer business with its ChatGPT app, Anthropic drives the majority of its revenue through APIs. The company has also introduced a slew of enterprise ecosystems, including Claude for Enterprise and Claude for Financial Services. In July 2025, Anthropic won a $200 million Department of Defense contract, and it has also deployed its models in federal U.S. agencies through Palantir and AWS.

Google DeepMind: Founded in 2010 and based in London, UK, DeepMind is a subsidiary of Alphabet. Its mission is to build the next generation of responsible AI that pushes toward Artificial General Intelligence. The company is known for its innovations such as AlphaFold (protein structure prediction), WaveNet (text-to-speech), GATO (multimodal LLM), Gemini Series, Veo-3 (video generation), and Lyria (music). As of August 2025, DeepMind’s efforts have primarily centered around its Gemini series of models. Many of Google’s services, such as data center cooling, Google Maps predictions, and Play Store recommendations, depend on DeepMind innovations. Beyond its AI offerings, DeepMind competes directly with Microsoft through its AI-enabled solutions across Google Workspace (Docs, Sheets, Chat, Meet). In July 2025, the company acquired the IP of Windsurf, an AI coding tool.

Mistral AI: Founded in 2023 and based in Paris, Mistral AI has distinguished itself by taking an open-source approach to AI development, focusing on smaller but highly efficient pure transformer models like Mistral 8B. The company’s models are most comparable to AI21’s Jamba Series, with mixture-of-experts setups, high performance on reasoning, and fast inference times. With backing from a16z, Index Ventures, Lightspeed, and Redpoint, Mistral focuses on creating efficient, open-source models for developers, academics, and enterprises looking for alternatives to proprietary AI systems. As of August 2025, Mistral has raised $1 billion total at a $6 billion valuation. Its Le Chat Enterprise, launched in July 2025, integrates with enterprise systems like Google Drive, SharePoint, OneDrive, Gmail, and Calendar. Notable enterprise clients include BNP Paribas, Cisco, CloudFlare, Harvey, IBM, and HuggingFace.

Meta: Based in Menlo Park, CA, Meta has taken a deliberate position in open-source AI development with models like Llama 4 (text), Emu (image), and AudioCraft (audio). Meta has aligned itself with various cloud providers, including AWS, Google Cloud, and Microsoft Azure, to ensure broad accessibility to its models. In June 2025, Meta announced the creation of its Meta Superintelligence Lab with the acquisition of Scale AI. The Lab oversees its most ambitious projects and is in discussions to abandon its open-sourced Llama models in favor of closed ones.

Foundational Models for Mainly Enterprises:

Cohere: Founded in 2019 in Toronto, Canada, Cohere focuses on building enterprise-level LLMs tailored to specific customer requirements. Cohere emphasizes a cloud-agnostic approach, working across Google Cloud, Oracle, and AWS, enabling flexibility for businesses looking to integrate AI solutions. Its primary models, such as Command, focus on providing enterprise customers with customizable text generation capabilities for tasks like customer support, automation, and content generation. Its strategic partners include Oracle, Accenture, McKinsey, and RBC, among others. It is backed by Oracle, SAP, Tiger Global, and Salesforce. As of August 2025, Cohere had raised $1.6 billion with a valuation of $6.3 billion.

Aleph Alpha: Founded in 2019 in Germany, Aleph Alpha develops sovereign, explainable AI solutions for enterprises and government agencies. It emphasizes data security and compliance with European regulations. Though it was originally focused on foundational models, it pivoted to delivering an AI operating system and B2B consulting services across sensitive business and public-sector domains. While many US companies focus on scale and benchmark performance, Aleph Alpha centers itself around compliance, explainability, and control. In November 2023, it raised $500 million in Series B funding at an undisclosed valuation from investors including Innovation Park Artificial Intelligence, Bosch Ventures, and Schwarz Group.

Business Model

AI21 operates on a usage-based pricing model, where enterprise customers can either pay as they go or design a customized plan. To use its foundational models, customers can choose between Jamba Mini (optimized for lightweight texts) or Jamba Large (optimized for long-context processing). Pricing is determined by the number of input and output tokens used.

In natural language processing, a token typically corresponds to part of a word, usually three to four characters in English. AI21 claims that in its models, the average token corresponds to one word (~six characters), allowing for 30% more text per token and 30% lower computing costs.

Source: AI21

For Jamba Mini, users pay $0.20 per 1 million input tokens and $0.40 per 1 million output tokens. For Jamba Large, users pay $2 per 1 million input tokens and $8 per 1 million output tokens.

Source: AI21
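To make the per-token pricing concrete, the sketch below estimates the cost of a single request from the rates quoted above. The dictionary keys are labels chosen for this example and are not official API model identifiers; the function name is also illustrative. Customized enterprise plans may price differently.

```python
# Rates in US dollars per 1 million tokens, as quoted in this section.
# The keys are illustrative labels, not official model identifiers.
PRICES_PER_MILLION = {
    "jamba-mini": {"input": 0.20, "output": 0.40},
    "jamba-large": {"input": 2.00, "output": 8.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated pay-as-you-go cost in dollars for one request."""
    rates = PRICES_PER_MILLION[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # A request that fills a 256,000-token input and returns a 1,000-token answer.
    for model in PRICES_PER_MILLION:
        print(model, round(estimate_cost(model, 256_000, 1_000), 4))
```

At these rates, a request that fills the full context window costs about half a dollar on Jamba Large and about five cents on Jamba Mini, with input tokens dominating the bill for long-document workloads.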

Traction

AI21 models are currently used by several Fortune 500 companies across multiple industries. The company has also announced partnerships with Snowflake, Microsoft Azure, Amazon Bedrock, Google Cloud, and NVIDIA to increase its inference capabilities. Unverified estimates claim that AI21 generated between $57.8 million and $111.6 million in revenue in 2024.

AI21’s adoption still trails behind OpenAI, Google, Anthropic, Meta, Mistral, and Cohere. AI21 was absent from lists of foundational models featured by a16z, McKinsey, Lightspeed Ventures, and Menlo Ventures.

Valuation

AI21 Labs raised a $300 million Series D in May 2025, led by NVIDIA and Alphabet, bringing the total funding to $636 million as of August 2025. The company's valuation was last disclosed at $1.4 billion after its $155 million Series C round in 2023. Existing investors include Samsung Next Ventures, Google Israel, Coatue, Intel Capital, and 8VC.

Key Opportunities

Enterprise Adoption

As more enterprises adopt AI, many are looking towards on-premise and open-source solutions due to compliance concerns and the ability to fine-tune use cases, according to a16z. AI21 Labs satisfies these requirements for mature companies seeking more sophisticated AI software.

In addition to its on-prem and open-weight models, AI21 Labs offers customization and longer context windows (which also allow users to rely on prompt engineering at a lower cost without much fine-tuning). According to Lightspeed Venture Partners, enterprise customers use foundational models mainly for text summarization, internal employee-facing chatbots, and external-facing chatbots, areas in which Maestro and Jamba provide extensive capabilities.

Source: Lightspeed Venture Partners

While OpenAI and Anthropic still dominate the market, the enterprise AI landscape remains fragmented and multi-model. For example, relative newcomer xAI’s Grok 4, launched in July 2025, has attracted significant attention, and the DeepSeek-R1 model announced in January 2025 is now being tested at some enterprises. Increasingly, companies are mixing and matching models, with 37% of respondents saying that they use five or more models, compared to 29% of companies last year.

Source: a16z

AI21’s strategy focuses on depth over breadth. Rather than targeting all enterprise customers, cofounder Goshen said that they want to form deep partnerships with anchor customers to help them scale over time.

Emphasis on Accuracy

Enterprises are embracing the integration of AI into operations, but accuracy remains a major hurdle. Lightspeed Venture Partners found that across all use cases, the main challenges with adoption are data privacy and performance quality. A 2024 Stanford study found that in the legal field, hallucination rates range from 69% to 88%. Performance also deteriorates when dealing with complex tasks that require a nuanced understanding of legal functions. Raw LLMs like ChatGPT often lack the reliability and explainability that enterprises require; Maestro seeks to solve these problems.

Source: Lightspeed Venture Partners

To address hallucination, companies and products like Perplexity, OpenAI Deep Research, and Exa have developed systems prioritizing accuracy and verifiability. AI21’s Maestro goes a step further by letting users specify explicit output requirements and track how well those requirements are met through observability logs. Each step is evaluated against strict benchmarks, and models are designed to draw only upon a company’s data to answer questions (and to respond with “I don’t know” rather than fabricate answers).

Rise of Agents

Deloitte predicts that 25% of enterprises using generative AI will deploy AI agents in 2025, with that number doubling to 50% by 2027. Existing agentic systems, including OpenAI’s new agent mode released in July 2025, are relatively new, experimental, and unpredictable.

AI21’s Maestro system seeks to address these uncertainties through a comprehensive, reliable agent orchestrator system for enterprises. Slated to release a new Python-based programming language, Maestro balances expressiveness with predictability, a key feature for multi-step agents. Importantly, Maestro is not confined to only AI21’s models. It can also be layered on well-performing models by OpenAI and Anthropic, improving GPT-4o’s accuracy from 85% to 91.9% and Claude Sonnet 3.5’s from 88% to 95.2%. On the FRAMES benchmark, Maestro achieved a 75% accuracy rate (OpenAI’s Assistant API scored 69%). Maestro allows enterprises to use their favorite models while adding a layer of oversight.
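The accuracy figures above can be read in two ways: as a gain in percentage points, or as a cut in the share of wrong answers. The sketch below computes both from the quoted numbers. Which measure underlies AI21's "up to 50%" improvement claim is not stated in the source, so treating it as an error-rate reduction is an assumption; the function names are illustrative.

```python
def point_gain(acc_before_pct: float, acc_after_pct: float) -> float:
    """Accuracy gain in percentage points."""
    return acc_after_pct - acc_before_pct


def error_reduction(acc_before_pct: float, acc_after_pct: float) -> float:
    """Fraction of the original errors that were eliminated."""
    err_before = 100.0 - acc_before_pct
    err_after = 100.0 - acc_after_pct
    if err_before <= 0:
        raise ValueError("accuracy before must be below 100%")
    return (err_before - err_after) / err_before


if __name__ == "__main__":
    for name, before, after in [("GPT-4o", 85.0, 91.9),
                                ("Claude Sonnet 3.5", 88.0, 95.2)]:
        print(f"{name}: +{point_gain(before, after):.1f} points, "
              f"{error_reduction(before, after):.0%} fewer errors")
```

On these inputs, a gain of about seven percentage points corresponds to removing roughly half of the original errors, which is why single-digit accuracy gains can matter in low-error-tolerance settings.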

Markets with Heightened Transparency Concerns

AI21’s major customers, including Fnac, Ubisoft, Lightricks, and Clarivate, are based primarily in Europe and Israel, regions where the regulatory landscape (e.g., the EU AI Act) emphasizes compliance, security, and accuracy. This aligns with AI21’s enterprise positioning around customizability, data control, and transparency.

The company is exploring growth opportunities in other regions with rising generative AI adoption. In Australia, where the legislature recently passed significant privacy reforms and legislation on mandatory AI guardrails is expected to be introduced, AI21 recently met with 18 prospective enterprise clients, reflecting increasing international demand in places that emphasize data security.

Key Risks

Low Adoption

As of August 2025, AI21 faces competition from generalized foundational model companies like OpenAI and Anthropic, whose adoption levels within B2B outstrip AI21’s in most respects.

According to Ramp’s April 2025 AI Index and OpenAI’s own report, roughly 32.4% of U.S. businesses, and over 2 million businesses in total, were paying for OpenAI subscriptions. In comparison, only 8% of businesses used Anthropic, 3% used DeepSeek, and a mere 0.1% subscribed to Google AI. Though there is no public data available on the percentage of businesses that are AI21 customers, it is likely even smaller than Google AI’s share.

Source: Ramp

Despite being an early mover (AI21 began researching LLMs in 2017), the company has yet to make a breakthrough in market share. While newer entrants like xAI, DeepSeek, and Kimi have captured user excitement, AI21 has yet to penetrate deeply into the mainstream enterprise layer.

Generalization vs. Specialization

AI21 is caught in a two-front war: general-purpose models by OpenAI and Anthropic are rapidly expanding into enterprise, while specialized tools are capturing niches that AI21 hopes to target.

In healthcare, companies like Abridge, Ambience, Heidi, and Eleos Health are becoming widely adopted. In legal, Harvey, Everlaw, and Spellbook support legal research and trial preparation. In finance, Numeric, Klarity, Arkifi, and Rogo are transforming accounting and investment workflows. Each of these specialized vertical players targets industries where precision and accuracy matter and where AI21 hopes to win.

Even within horizontal enterprises, teams are turning to department-specific tools over generalized models. For product engineering, Cursor and Cognition lead the way. For customer support, agents from Decagon, Aisera, and Sierra interact directly with customers. Glean and Sana connect email, messages, and documents to provide comprehensive enterprise search. Otter.ai, Fireflies.ai, and Granola create meeting summary notes.

In this fragmented landscape, AI21 needs to compete on two fronts: against the models and distribution power of general-purpose LLM providers like OpenAI, and against specialized AI tools like Sana.

Talent Attrition and Lack of Funding

AI21 significantly lags behind competitors in funding, making it harder to attract and retain talent in a market where researchers are scarce. With Meta paying over $300 million to poach top researchers, compensation and access to compute have become key differentiators.

AI21 Labs raised $300 million in its latest round as of August 2025, while Anthropic, OpenAI, and Cohere raised $3.5 billion, $40 billion, and $500 million in their most recent rounds, respectively. This limits AI21’s ability to compete on scale.

Co-founder Shoham acknowledges the gap in funding, saying:

“Money is important, but it’s not everything. We’re not terribly interested in creating a model that can draw a donkey on the moon. We don’t play the game of throwing every piece of data we can find onto a mass of processors and seeing what happens, and launching a chatbot that’s just a crowd-pleaser along the way. Our technology, which is highly intelligent, doesn’t require that kind of wastefulness. The amount we’ve raised so far, $336 million, is still money. But it’s not billions.”

Limited capital may still restrict AI21’s ability to hire aggressively and keep pace with better-funded rivals.

Shifting Buyer Behavior and Headwinds

As competition intensifies, buyer trends are also becoming unfavorable to AI21. In June 2025, an a16z report found that enterprises are increasingly favoring closed-source models as they become cheaper. As one Google customer said, “The pricing has gotten appealing and we’re already embedded with Google: we use everything from G Suite to databases, and their enterprise expertise is attractive.”

In addition, while many companies have adopted multiple models, the rise of agentic workflows has increased switching costs. This has pushed enterprises towards single-vendor ecosystems, making it harder for AI21 to compete against deeply entrenched and highly adopted incumbents.

Source: Lightspeed

At the same time, model evaluation has become more transparent and externalized, powered by platforms such as LM Arena. Yet, as of August 2025, AI21’s models rank at #91 on LM Arena, far below leaders like OpenAI and Google, which makes it harder for enterprises to trust its results.

Summary

AI21 Labs was founded by three entrepreneurs and academics to build reliable AI systems for enterprises. Its core offerings include open-weight foundational models optimized for long-context tasks, alongside Maestro, an agentic orchestration system that emphasizes accuracy and traceability. The company targets industries with low tolerance for error, like finance, law, and healthcare, with its current customer base largely concentrated in Europe and Israel.

While the enterprise AI market is expanding quickly, adoption is still early-stage due to concerns over privacy and accuracy. AI21 positions itself as a specialized, reliability-focused alternative to general-purpose incumbents like OpenAI and Anthropic.

As the battle over foundational models intensifies, it remains to be seen whether players like AI21 can carve out sizable market share in a space increasingly shaped by scale, distribution, and switching costs. While its early focus on long-context inference and orchestration gave it a head start, competitors like Perplexity, Exa, and OpenAI’s Deep Research unit are already moving in. The company’s future will depend on whether it can turn early partnerships into deeper, scaled deployments before orchestration becomes a commodity.
