Thinking Machines Lab

Thinking Machines Lab intends to build an AI infrastructure layer between model developers and raw GPUs. Tinker, its first product, is a developer platform meant to make training and adapting large models easier and more efficient. It manages distributed training behind the scenes, optimizing scheduling, synchronization, and recovery so that GPUs stay active rather than idle and resources are shared across runs. By abstracting these low-level systems, Tinker minimizes duplicated compute and makes experiments reproducible, allowing researchers to fine-tune instead of retraining from scratch. The result is higher utilization, lower energy use, and faster iteration at lower cost.

Founding Date

Feb 1, 2025

Headquarters

San Francisco, CA

Total Funding

$2B

Status

Private

Stage

Seed Round

Employees

135

Careers at Thinking Machines Lab

Memo

Updated

January 29, 2026

Reading Time

20 min

Thesis

Between 2020 and 2025, AI labs increased the compute used to improve their frontier models by roughly 500% per year and increased training spend in USD by roughly 350%. However, the speed at which GPUs process standard AI training workloads is improving by only about 35% per year, and up to 30% of training power is simply wasted. Demand for compute is growing roughly 14 times faster than hardware is improving, making efficient compute a growing imperative.
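
The 14x figure follows directly from the gap between those two growth rates. A minimal sanity check, using only the 500% and 35% annual figures cited above:

```python
# Rough sanity check of the gap between compute demand growth and hardware
# improvement cited above. Only the 500% and 35% annual figures are taken
# from the text; the rest is arithmetic.
demand_growth = 5.00     # compute usage growing ~500% per year
hardware_growth = 0.35   # GPU training throughput improving ~35% per year

gap = demand_growth / hardware_growth
print(f"Demand is growing ~{gap:.0f}x faster than hardware improves")  # ~14x
```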

Open-source models show promise for efficiency gains. Because their weights are reusable and transparent, open models allow developers to improve on prior work rather than retrain from scratch. For example, in September 2023, Mistral-7B outperformed Meta’s Llama-2 13B on all benchmarks tested despite having roughly half the parameters. As a result, some regulators are pushing for more open-source models; notably, the EU’s AI Act exempts certain open-source AI models and developers from many new compliance requirements.

Though accelerated innovation may be a benefit of open source, realizing it requires infrastructure that can train, fine-tune, and reproduce models efficiently. Open-weight systems invite thousands of researchers and developers to experiment, but without scalable, reliable training software, teams end up rebuilding ad hoc pipelines, repeating compute-intensive runs, and producing results that others cannot verify. The consequences are higher costs, slower iteration, and greater compliance risk. Open-source AI progress therefore depends on solving two compounding challenges: making each GPU go further and coordinating the explosion of open-weight experimentation.

Thinking Machines Lab intends to build an AI infrastructure layer between model developers and raw GPUs. Tinker, its first product, is a developer platform meant to make training and adapting large models easier and more efficient. It manages distributed training behind the scenes, optimizing scheduling, synchronization, and recovery so that GPUs stay active rather than idle and resources are shared across runs. By abstracting these low-level systems, Tinker minimizes duplicated compute and makes experiments reproducible, allowing researchers to fine-tune instead of retraining from scratch. The result is higher utilization, lower energy use, and faster iteration at lower cost.

Founding Story

Thinking Machines Lab (TML) was founded in February 2025 by Mira Murati (CEO), Barret Zoph (former CTO), John Schulman, Lilian Weng, Andrew Tulloch, and Luke Metz.

In September 2024, Murati, then CTO of OpenAI, announced her departure from the company on Twitter. Her resignation letter stated that she wanted to create “time and space for her own exploration.” Although she began fundraising with significant investor interest less than a month after leaving the company, sources close to Murati said there was no clear product vision at that early stage. It was evident, however, that Murati was focused on recruiting top research talent, especially from OpenAI. “People who like to do research are being forced to do product,” said one former employee of OpenAI at the time, explaining why Murati’s new venture was attractive.

By January 2025, Murati had recruited the group who would become TML’s co-founders, all of whom had previously been at OpenAI: Schulman, Zoph, Weng, Tulloch, and Metz. Schulman co-founded OpenAI and helped architect ChatGPT; Zoph served as a VP of Research focused on post-training; Weng was a VP of Research focused on safety; and Tulloch and Metz were senior technical staff working on large-scale training infrastructure and evaluation systems. By February 2025, TML had recruited roughly 30 researchers from top labs, 20 of whom came from OpenAI.

Also in February 2025, Murati publicly announced Thinking Machines Lab on Twitter and outlined three priorities for the company: helping people adapt AI systems to their specific needs, developing strong foundations for more capable models, and fostering a culture of open science that benefits the field.

In August 2025, TML became the target of an aggressive recruiting push by Meta, which approached more than a dozen TML researchers with outsized offers, including a reported multi-year $1.5 billion package for Tulloch. Notably, unlike Meta’s recruiting efforts at other labs, TML employees swiftly rejected the offers. Tulloch, however, later reversed course and accepted Meta’s offer, making him the first co-founder to leave the company within a year of its founding.

In October 2025, TML released its first product, Tinker, a managed fine-tuning platform that lets developers efficiently adapt open-weight models to their own data without managing distributed training, giving the public its first insight into what the lab was building.

In January 2026, two additional co-founders of TML, Barret Zoph and Luke Metz, departed the company simultaneously and rejoined OpenAI along with researcher Sam Schoenholz. Zoph, then CTO, was fired on January 14; at 4:13 PM that day, Murati announced that Soumith Chintala would be the company’s new CTO. At 5:11 PM, less than an hour later, OpenAI CEO of Applications Fidji Simo announced that Zoph had joined OpenAI and would report to her, and that Metz and Schoenholz had also joined and would report to Zoph.

Later reporting indicated that the rift between Murati and Zoph had arisen the previous summer, after it was revealed that Zoph had entered into a relationship with a colleague in June 2025; when confronted by Murati about the matter, he admitted it after initially denying it. Zoph reportedly told Murati he had been “manipulated into the relationship.” He then took a period of leave from work and returned in July; according to unnamed executives at TML, his performance dropped off after his return.

According to Zoph himself, he reached out to Sam Altman about returning to OpenAI as early as October 2025, soon after Tulloch had left TML to return to Meta. The week of their departure, Zoph, Metz, and Schoenholz reportedly met with Murati, expressed unhappiness with the direction of the company, and proposed that Murati give Zoph final say over technical decisions. Murati responded that she was unhappy with Zoph’s productivity in recent months. Prior to this conversation, the three had already been holding talks with both OpenAI and Meta about joining one of the two companies together, which they ultimately did by joining OpenAI.

All told, in the four-month period between October 2025 and January 2026, TML lost three of its original six co-founders, including its CTO, along with one senior researcher, to competing AI companies. All of this occurred less than a year after the company had been founded, despite the fact that it had raised money at a $12 billion valuation in July 2025 and was in talks by November 2025 to raise at a valuation of as much as $50 billion.

Product

Product Philosophy

TML designs its AI training architecture to be flexible. The system is built from small, modular training building blocks that each perform one simple, well-defined function. Examples include data loaders, trainers, schedulers, reward modules, and fine-tuning loops. Because each block has a narrow job, teams can mix and match them to support different workflows such as reinforcement learning, custom training loops, or domain-specific adaptations.
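
A minimal sketch of what this composition might look like in practice is shown below. The component names and structure are illustrative assumptions for this memo, not TML’s actual interfaces; the point is that each block has one narrow job and can be swapped independently.

```python
# Illustrative sketch of a modular training stack. Component names are
# hypothetical and only mirror the kinds of building blocks described
# above (data loader, objective, scheduler, trainer); this is not TML's API.
import torch
from torch import nn


def make_data_loader(num_batches: int, batch_size: int = 8, dim: int = 16):
    """Toy data-loader block: yields (inputs, targets) batches."""
    for _ in range(num_batches):
        x = torch.randn(batch_size, dim)
        yield x, x.sum(dim=1, keepdim=True)


def supervised_objective(model: nn.Module, batch) -> torch.Tensor:
    """Objective block: one well-defined job, computing a loss."""
    inputs, targets = batch
    return nn.functional.mse_loss(model(inputs), targets)


def constant_scheduler(step: int, base_lr: float = 1e-2) -> float:
    """Scheduler block: maps step -> learning rate."""
    return base_lr


def train(model, data_loader, objective, scheduler):
    """Trainer block: composes the other blocks into a run."""
    optimizer = torch.optim.SGD(model.parameters(), lr=scheduler(0))
    for step, batch in enumerate(data_loader):
        for group in optimizer.param_groups:
            group["lr"] = scheduler(step)
        loss = objective(model, batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model


# Swapping the objective (e.g., for an RL-style reward) or the scheduler
# changes how the model learns without rebuilding the rest of the stack.
model = train(nn.Linear(16, 1), make_data_loader(100), supervised_objective, constant_scheduler)
```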

This structure gives developers the ability to change how a model learns without needing to rebuild the entire training stack. TML takes care of the operational complexity beneath the surface. This includes scheduling, synchronization, cluster management, and recovery. At the same time, developers keep control over the objectives, data, algorithms, and constraints that matter to them. The long-term goal is to support AI systems that remain steerable by human operators rather than systems that behave as fixed, closed boxes.

This philosophy naturally leads TML toward an infrastructure-first approach. TML views model development and infrastructure design as tightly connected. Scaling capabilities and ensuring safety depend on the same underlying mechanics, so both must be approached together. In practice, TML integrates GPU orchestration, multi-node synchronization, deterministic kernels, logging, and automated recovery into a single coherent platform layer. Deterministic and observable infrastructure makes experiments reproducible, security policies enforceable, and failures diagnosable. Researchers can explore new learning algorithms and training methods while depending on the platform to keep runs stable, auditable, and consistent across scale.

Tinker

As large language models grow to hundreds of billions of parameters, adapting them for specific research or enterprise tasks has become increasingly complex. Thinking Machines Lab’s core product, Tinker, is a flexible API and managed cloud platform designed to make this process accessible and efficient. Launched in October 2025, it is a researcher-first alternative to closed, black-box fine-tuning systems. Researchers define their own training logic in Python while Tinker executes it at scale. From a single laptop, teams can fine-tune frontier-scale AI models while Tinker automatically manages all of the behind-the-scenes computing, including setting up GPUs, connecting multiple machines, handling errors, and scaling resources as needed.

Historically, fine-tuning large models required building and maintaining complex computing setups before research could even begin. Teams rented or assembled GPU clusters, aligned software drivers, and wrote scripts to coordinate training across machines. Jobs frequently failed mid-run due to hardware errors or synchronization bugs, forcing long restarts and data loss. Scaling to larger models demanded deep DevOps expertise, constant monitoring, and careful fault recovery. This infrastructure burden slowed iteration and made advanced experimentation accessible only to large, well-funded organizations. Tinker removes this bottleneck, allowing effort to focus on developing new training methods rather than maintaining infrastructure.

At its core, Tinker follows a “one API, any scale” principle. Switching among open-weight model families such as Llama, Qwen, and mixture-of-experts (MoE) variants of up to 200 billion parameters can be done with a single configuration change. The system manages orchestration, retries, and fault tolerance while still exposing the key settings for custom supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), reward modeling (RM), and experimental training loops.
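
To make the “one API, any scale” pattern concrete, the sketch below shows the general shape of such a workflow: the researcher keeps the training loop in ordinary Python and names the base model in a single configuration value, while the service owns orchestration. The client class, method names, and model identifiers are hypothetical stand-ins, not Tinker’s documented API.

```python
# Hypothetical sketch of a managed fine-tuning workflow in the style
# described above. The client and run objects below are local stand-ins
# so the example runs; they are not Tinker's actual API.
import random


class FakeRun:
    """Stand-in for a remote training run managed by the platform."""
    def forward_backward(self, batch):
        return random.random()            # pretend loss reported from remote GPUs

    def optimizer_step(self, learning_rate):
        pass                              # platform applies the update remotely

    def save_checkpoint(self):
        return "checkpoint-0001"


class FakeClient:
    """Stand-in for the managed service; swapping base_model is one line."""
    def create_run(self, base_model, method):
        print(f"provisioning {method} run on {base_model}")
        return FakeRun()


CONFIG = {
    "base_model": "llama-70b",   # e.g., switch to "qwen-72b" or an MoE variant
    "method": "lora",
    "learning_rate": 1e-4,
}


def fine_tune(client, batches, config):
    """Researcher-owned training logic; the service owns the hardware."""
    run = client.create_run(config["base_model"], config["method"])
    for step, batch in enumerate(batches):
        loss = run.forward_backward(batch)            # executed on remote GPUs
        run.optimizer_step(config["learning_rate"])   # scheduling/retries hidden
        if step % 100 == 0:
            print(f"step {step}: loss {loss:.4f}")
    return run.save_checkpoint()


checkpoint = fine_tune(FakeClient(), batches=[[0]] * 300, config=CONFIG)
```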

A cornerstone of the platform is parameter-efficient fine-tuning (PEFT) through LoRA, a method that adjusts only a small set of extra parameters instead of updating every weight in a large model. In practical terms, this approach teaches a model new skills by adding lightweight adapters rather than retraining it from scratch.

This makes training faster, cheaper, and less resource-intensive while still reaching near full-quality results. Multiple teams can also run their fine-tuning jobs at the same time on shared hardware. To help new users start quickly, the Tinker Cookbook provides open-source, editable recipes for SFT, RLHF, and RM. Access is invite-only, with a roadmap for automated misuse detection to ensure responsible use.
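
For readers unfamiliar with the mechanism, the sketch below isolates the core LoRA idea: the pretrained weight matrix is frozen and only a small low-rank correction is trained. This is a generic, from-scratch illustration of the technique, with arbitrary dimensions and rank, not Tinker’s implementation.

```python
# Minimal from-scratch illustration of the LoRA idea: freeze the original
# weight W and learn only a low-rank update B @ A. Generic sketch of the
# technique, not Tinker's implementation.
import torch
from torch import nn


class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)   # frozen pretrained weight
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # trainable
        self.B = nn.Parameter(torch.zeros(out_features, rank))        # trainable
        self.scale = alpha / rank

    def forward(self, x):
        # Output is the frozen projection plus the low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


layer = LoRALinear(4096, 4096, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable:,} of {total:,}")  # ~65K of ~16.8M
```

Because only the low-rank matrices receive gradients, optimizer state and memory shrink accordingly, which is also why multiple adapters over one frozen base model can share the same hardware.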

The result is a unified system that functions as both a research workbench and a production platform, preserving algorithmic flexibility while removing the infrastructure barrier that once limited access to large-scale experimentation. Tinker’s design is also grounded in open research that TML publishes through its blog. For instance, the post “LoRA Without Regret” showed that LoRA can match the performance of full fine-tuning in common post-training settings, validating PEFT as a sensible default mode.

Market

Customer

Thinking Machines Lab targets companies seeking precise control, openness, and reliability in model training. Its core customers include academic labs and applied research teams that use open-weight LLMs, want to fine-tune on proprietary data, and want to retain ownership of models, code, and deployment choices. Early adopters such as Princeton’s Gödel team, Stanford’s Rotskoff group, UC Berkeley’s SkyRL lab, and Redwood Research use Tinker for workloads like building reasoning models, multi-agent RL, and backdoor detection, where control over algorithms and visibility into training are more important than a fully abstracted service.

Customers may be drawn to TML as an alternative to two existing options. First, closed fine-tuning APIs do not provide sufficient control over training procedures. Second, open-source stacks built on tools like Hugging Face, PEFT, and MosaicML require significant effort to manage multi-node training, GPU utilization, fault tolerance, and experiment tracking. Tinker is positioned to capture demand from both groups by providing a platform that preserves low-level flexibility while removing infrastructure work.

Market Size

The market for LLM infrastructure is expanding rapidly, in step with the growth of AI model scale and enterprise adoption. Of 700 global technology leaders surveyed by McKinsey in April 2025, a majority reported using open-source or partially open AI technologies in production across the seven layers of the AI stack, including data, hosting, APIs, and application tooling. The same survey found that organizations that view AI as strategically critical are over 40% more likely to use open-source models, and 72% of technology-industry respondents report active use of at least one open-source model.

Similarly, the Linux Foundation reported that approximately 63% of companies used open-source AI as of June 2025, and that 89% of companies that had adopted AI incorporated open-source tooling in their stack. This indicates that the potential customer base for fine-tuning infrastructure is not limited to frontier-model specialists; it includes AI-native companies that are being created precisely because open models are available. As of 2024, there were an estimated 1.2 million ML engineers and researchers globally, along with thousands of AI startups and an estimated 5,000 to 10,000 enterprise AI teams, all of which form the core addressable market for customizable model infrastructure.

In July 2025, it was reported that enterprise spending on LLM API usage more than doubled in six months, rising from roughly $3.5 billion to $8.4 billion, even when consumer products and packaged enterprise applications are excluded. Yet only 13% of AI workloads at the time were running on open-source models, compared with 19% six months earlier.

Competition

The market for model training infrastructure is shaped by two main groups: black-box API providers like OpenAI, Anthropic, and Google DeepMind, and open infrastructure platforms like Hugging Face and MosaicML. Black-box APIs offer reliable performance and simple integration but restrict access to weights, limit fine-tuning, and reduce reproducibility, which makes them less suitable for teams that need deep control over models. Open infra platforms support open weights and flexible deployment on user-managed compute, but require significant expertise in cluster operations, networking, and cost management. TML positions itself between these options by providing a managed environment for open models that preserves control over training and evaluation while reducing infrastructure overhead.

Black-Box Model Providers

OpenAI: Founded in 2015 in San Francisco, OpenAI develops closed-source models including GPT-4 and DALL·E, along with the open-source Whisper speech recognition model. Backed by SoftBank, Dragoneer, and Microsoft, the company raised $40 billion in its Series F in March 2025, which valued the company at $300 billion. As of January 2026, OpenAI has raised $79 billion in total funding. OpenAI’s APIs provide high reliability and performance but no access to model weights or training internals, making them better suited to production teams than to research users.

Anthropic: Founded in 2021, Anthropic builds the Claude family of language models with a focus on safety and alignment. The company raised $33.7 billion in total prior to January 2026, when it raised a further $10 billion at a $350 billion valuation. Its API-based model access offers strong guardrails and uptime but limited transparency or customization.

Google DeepMind: Acquired by Google in 2014 for $500 million, DeepMind develops the Gemini series of multimodal models integrated into Google Cloud. DeepMind focuses on advancing model capabilities and embedding them into Google’s ecosystem, rather than offering open-weight access.

Open Infrastructure Platforms

Hugging Face: Founded in 2016, Hugging Face has raised a total of $395.2 million from investors, including Sequoia, Coatue, and Lux Capital, as of January 2026. It serves as the leading open-source hub for hosting, training, and sharing AI models. Users retain full control over weights and compute, but must manage their own infrastructure and scaling.

MosaicML / Databricks: Founded in 2021 and acquired by Databricks in 2023 for $1.3 billion, MosaicML built the Composer training library and platform for large-scale model development on user-managed compute. While it enables efficient open-weight training, teams handle all infrastructure setup, making it best for organizations with strong in-house ML ops talent.

Business Model

Tinker is currently waitlisted. Exact pricing details remain unknown, but tweets from TML employees suggest that Tinker is priced on a credit system. A credit system for Tinker could meter the work the platform performs, including scheduling, synchronization, data movement, and recovery, so that customers pay for orchestration rather than raw GPU time. Comparable infrastructure companies already use credit- or usage-based metering with transparent benchmark pricing. Snowflake, a cloud data warehouse that meters compute separately from storage, charges roughly $2–$4 per credit depending on region and tier, with each credit representing a unit of compute used by a virtual warehouse. Cloud providers apply similar metering: Google’s GKE Autopilot, a managed service that runs and scales containerized applications, charges a cluster management fee of about $0.10 per cluster-hour on top of per-resource compute charges, and AWS EKS, a similar managed Kubernetes service, bills $0.10 per cluster-hour for control-plane operations.
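
As a purely hypothetical illustration of how credit-based metering could translate into a bill, the sketch below invents a credit price, conversion factors, and a job profile; none of these numbers come from TML.

```python
# Hypothetical example of credit-based metering for a managed training
# platform. Every number here (credit price, credits per unit of work,
# job profile) is invented for illustration; TML has not published pricing.
PRICE_PER_CREDIT = 3.00          # assumed $/credit, in the range Snowflake charges

CREDITS_PER_UNIT = {
    "gpu_hour_orchestrated": 1.0,    # scheduling + synchronization overhead
    "checkpoint_gb": 0.02,           # checkpoint storage / recovery handling
    "sampled_million_tokens": 0.5,   # evaluation / sampling traffic
}

job = {
    "gpu_hour_orchestrated": 120,    # e.g., a multi-day LoRA fine-tune
    "checkpoint_gb": 400,
    "sampled_million_tokens": 30,
}

credits = sum(CREDITS_PER_UNIT[k] * v for k, v in job.items())
print(f"credits used: {credits:.1f}, estimated cost: ${credits * PRICE_PER_CREDIT:,.2f}")
# credits used: 143.0, estimated cost: $429.00
```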

Traction

Tinker is in private beta with access via waitlist. Early users include leading AI research groups at Princeton, Stanford, UC Berkeley, and Redwood Research. Princeton’s Gödel Team used Tinker to fine-tune a mathematical reasoning model that achieved 88.1% accuracy on MiniF2F, matching full fine-tuning performance and outperforming larger closed models through self-correction. At Stanford, the Rotskoff Lab fine-tuned Llama 70B for chemistry reasoning via reinforcement learning, increasing accuracy from 15% to 50%. At Berkeley, SkyRL leveraged Tinker’s flexible primitives to train multi-agent reinforcement learning systems with asynchronous off-policy updates and tool-use behaviors. Redwood Research fine-tuned Qwen 32B for long-horizon control, relying on Tinker’s automatic multi-node scaling.

Valuation

In June 2025, TML raised $2 billion in a seed round led by Andreessen Horowitz, with participation from NVIDIA, Accel, Cisco, and AMD. The round valued the company at approximately $12 billion post-money and has been its only fundraising to date. In November 2025, TML entered into further fundraising discussions at a $50 billion valuation. Although nothing has been finalized, the new round, if completed, would roughly quadruple the company’s valuation despite the fact that TML remains pre-revenue.

This is a continuation of a longstanding pattern: capital markets concentrate early, at scale, around perceived infrastructure winners during major technological transitions. During Britain’s Railway Mania in the 1840s, investors funneled capital equivalent to 7% of GDP into railroads before any profits were realized. The same cycle repeated in the 1990s when telecom giants spent billions to lay fiber optic cables. Only a small fraction of that fiber was ever lit during the boom years, but the resulting capacity became essential to the modern internet. In the early 2000s, cloud infrastructure followed a similar trajectory. Companies like Amazon, VMware, and Cloudera were funded years before profitability, on the belief that owning the system layer would yield enduring control.

Tinker could follow the same economic pattern by operating as a control plane for distributed AI training: it manages reproducibility, checkpointing, orchestration, and fault tolerance across compute clusters. For most teams, implementing frontier-scale model development independently would be prohibitively complex and resource-intensive. Tinker could provide a standardized layer that abstracts this complexity, much as AWS did during the early years of cloud computing. TML’s early valuation therefore reflects not revenue at the time of the funding, but its potential role in defining how models will be trained, audited, and deployed in the future.

Key Opportunities

Open-Weight Model Adoption

Open-weight models accounted for 50% of notable model releases from 2019 to 2023, and over 96% of businesses plan to deploy open-weight models by 2027. Organizations are adopting open systems to reduce costs and gain control over customization. This shift creates demand for infrastructure that can manage reproducible, fine-grained model adaptation across a growing variety of open-weight architectures. Tinker can address this need by providing standardized training workflows, controlled data and parameter management, and consistent evaluation pipelines. By becoming the default control plane for open model development, Tinker can capture long-term value as enterprises transition away from vertically integrated proprietary APIs toward modular, transparent ecosystems.

Building for Reproducible and Transparent AI

As AI systems move into regulated sectors, reproducibility has become a critical operational and compliance requirement. The EU AI Act mandates traceable documentation for high-risk applications. Industry analyses estimate that nondeterminism in training pipelines leads to reproducibility errors in model redeployments, creating measurable productivity losses.

Tinker’s deterministic kernels, batch-invariant training, and integrated experiment tracking provide a structured record of every training run, including data versions, configurations, and hardware context. This approach allows models to be independently verified and reproduced, reducing operational risk while meeting emerging regulatory expectations. Over time, this foundation can position Tinker as core infrastructure for reliable, transparent AI development across research and enterprise environments.
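
A minimal sketch of what such a structured run record might look like is shown below; the field names, hashing scheme, and shuffling helper are illustrative assumptions, not TML’s actual experiment-tracking format.

```python
# Illustrative sketch of a reproducible run record: hash the exact
# configuration and record the data version, hardware, and seed so a run
# can be independently re-created and verified. Field names and structure
# are assumptions for illustration, not TML's experiment-tracking schema.
import hashlib
import json
import random


def run_record(config: dict, data_version: str, hardware: str, seed: int) -> dict:
    canonical = json.dumps(config, sort_keys=True)   # canonical form before hashing
    return {
        "config_hash": hashlib.sha256(canonical.encode()).hexdigest()[:16],
        "data_version": data_version,
        "hardware": hardware,
        "seed": seed,
    }


def reproducible_shuffle(items: list, seed: int) -> list:
    """Deterministic data ordering: same seed, same order, on any machine."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled


record = run_record(
    config={"base_model": "llama-70b", "lr": 1e-4, "method": "lora"},
    data_version="dataset-v3",
    hardware="8xH100",
    seed=42,
)
print(record)  # identical inputs always yield the identical record
```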

Key Risks

Talent Retention & Executive Turnover

Like most other AI labs, TML faces significant challenges in retaining talent. In TML’s case, this has been compounded by an unusual level of executive turnover. One example was Meta’s reported 2025 raid on TML, in which Meta offered multiple researchers compensation packages exceeding $200 million and which led to the October 2025 departure of co-founder Andrew Tulloch.

Further leadership turnover was precipitated by reported disagreements about the direction of the company, including reports of failed acquisition talks in 2025. These disagreements, along with interpersonal issues, led to the January 2026 departure of two additional co-founders (including the CTO) to OpenAI, along with one senior researcher. This episode left the company with only half of its original founding team remaining, less than a year after it had been founded.

Given the company’s rapid fundraising success and its early success in attracting attention and talent, the executive turnover it has already experienced not only presents a risk to the business in its own right but also raises larger questions about the company’s long-term viability, especially given that the company remains pre-revenue despite its high valuation and funding.

Unproven Paradigm of “Learning Over Scaling”

TML is betting that the first superintelligence will be a “superhuman learner” and that current scale-first paradigms will fail without systems that can continuously learn, adapt, and improve from experience. That is a sharp deviation from the dominant strategy of OpenAI, Anthropic, and DeepMind, which have already demonstrated compound gains from scaling model size, data, and compute.

The problem is that TML’s thesis is not yet backed by large-scale empirical evidence. Meta-learning, curriculum-style training, and self-improving agents have worked mainly in narrow or game-like settings; no one has shown that a general-purpose, foundation-scale system can reliably learn how to learn across domains at industrial scale. One of TML’s own researchers has publicly acknowledged that reaching such systems will require major breakthroughs in memory, optimization, data, and training environments, and that “this is going to be very difficult,” without offering a concrete timeline for when those breakthroughs might materialize.

TML’s bet is that learning-centric methods will beat pure scaling and that TML will be the team that operationalizes them first. If it turns out that scaled reasoning models plus incremental RL and tooling remain “good enough” for most applications, TML may spend a large portion of its capital on a paradigm that lags incumbents on real-world benchmarks, reliability, or adoption. Because the company is positioning its infrastructure, product, and research roadmap around this thesis, there is no obvious fallback. In the worst case, the learning-over-scaling thesis remains scientifically interesting but commercially late, and TML finds itself with a world-class research agenda that has not translated into a defensible product advantage.

Summary

Thinking Machines Lab (TML) was founded to modernize training and adaptation of large AI models by replacing ad hoc research stacks with a managed, reproducible platform. Its product, Tinker, abstracts distributed training, scheduling, and recovery while preserving low-level control, which improves GPU utilization and shortens iteration cycles. TML serves researchers, startups, and enterprise AI teams working with open-weight models across varied domains. The company is positioning itself around efficiency and auditability, though building frontier-grade infrastructure is technically demanding and capital-intensive. TML can strengthen its position by expanding supported model families and post-training methods, deepening LLMOps integrations, and packaging compliance and provenance features for enterprise buyers.

Important Disclosures

This material has been distributed solely for informational and educational purposes only and is not a solicitation or an offer to buy any security or to participate in any trading strategy. All material presented is compiled from sources believed to be reliable, but accuracy, adequacy, or completeness cannot be guaranteed, and Contrary LLC (Contrary LLC, together with its affiliates, “Contrary”) makes no representation as to its accuracy, adequacy, or completeness.

The information herein is based on Contrary beliefs, as well as certain assumptions regarding future events based on information available to Contrary on a formal and informal basis as of the date of this publication. The material may include projections or other forward-looking statements regarding future events, targets or expectations. Past performance of a company is no guarantee of future results. There is no guarantee that any opinions, forecasts, projections, risk assumptions, or commentary discussed herein will be realized. Actual experience may not reflect all of these opinions, forecasts, projections, risk assumptions, or commentary.

Contrary shall have no responsibility for: (i) determining that any opinions, forecasts, projections, risk assumptions, or commentary discussed herein is suitable for any particular reader; (ii) monitoring whether any opinions, forecasts, projections, risk assumptions, or commentary discussed herein continues to be suitable for any reader; or (iii) tailoring any opinions, forecasts, projections, risk assumptions, or commentary discussed herein to any particular reader’s objectives, guidelines, or restrictions. Receipt of this material does not, by itself, imply that Contrary has an advisory agreement, oral or otherwise, with any reader.

Contrary is registered with the Securities and Exchange Commission as an investment adviser under the Investment Advisers Act of 1940. The registration of Contrary in no way implies a certain level of skill or expertise or that the SEC has endorsed Contrary. Investment decisions for Contrary clients are made by Contrary. Please note that, although Contrary manages assets on behalf of Contrary clients, Contrary clients may take any position (whether positive or negative) with respect to the company described in this material. The information provided in this material does not represent any investment strategy that Contrary manages on behalf of, or recommends to, its clients.

Different types of investments involve varying degrees of risk, and there can be no assurance that the future performance of any specific investment, investment strategy, company or product made reference to directly or indirectly in this material, will be profitable, equal any corresponding indicated performance level(s), or be suitable for your portfolio. Due to rapidly changing market conditions and the complexity of investment decisions, supplemental information and other sources may be required to make informed investment decisions based on your individual investment objectives and suitability specifications. All expressions of opinions are subject to change without notice. Investors should seek financial advice regarding the appropriateness of investing in any security of the company discussed in this presentation.

Please see www.contrary.com/legal for additional important information.

Authors

Vishnu Veeravalli

Fellow

© 2026 Contrary Research · All rights reserved
