Report: Synthesia Business Breakdown & Founding Story

Thesis

As enterprises shift from static documentation and live meetings toward asynchronous formats, video is becoming a core medium for internal training, onboarding, and communication. Video significantly outperforms text in retention, with viewers retaining 9x more information, and short-form video platforms like TikTok, which grew from 689 million users in 2021 to 1.9 billion in 2025, have normalized fast, visual content as the standard.

For marketing use cases, in particular, video adoption is already widespread, with 89% of companies using video in marketing as of early 2025. However, internal use remains underdeveloped, specifically constrained by long production timelines, high costs, and limited technical resources. In one survey, 16% of learning and development (L&D) teams said they hadn’t created training videos due to cost or skill gaps.

Synthesia offers a browser-based platform that turns text into professional-quality, multilingual videos using AI avatars and voice synthesis. It eliminates the need for cameras, studios, or editing expertise, reducing production time by 62% and lowering the skill threshold needed for content creation. By targeting internal communication, where speed, localization, and brand control matter more than cinematic quality, Synthesia is competing to become the default platform for enterprise internal video communication.

Founding Story

Synthesia was founded in 2017 by Victor Riparbelli (CEO), Steffen Tjerrild (COO), Lasse Korsgaard, Matthias Niessner, and Professor Lourdes Agapito.

Riparbelli moved to London in 2016 to explore emerging technologies. After reading a research paper by Niessner that demonstrated AI's potential to generate realistic video content, Riparbelli was inspired to make media production more accessible through AI. The founders aimed to “enable a 16-year-old sitting in their bedroom with a good idea to make a Hollywood film.” To that end, the company spent its first three years developing computer vision to improve lip-syncing for multilingual content, essentially building an AI dubbing tool. Riparbelli concurrently co-founded Coincall, a crypto portfolio tracker, which he later sold in 2019 to fully focus on Synthesia.

After around a year of development, Synthesia made its public debut in November 2018 with a high-profile proof of concept on the BBC. Using Synthesia's technology, news presenter Matthew Amroliwala was shown delivering a report in Spanish, Mandarin, and Hindi, all languages he did not speak. In 2019, the company partnered with the charity Malaria No More UK and ad agency R/GA to create a campaign video featuring David Beckham. In the video, Beckham appears to speak nine different languages as he addresses viewers about the fight against malaria.

As Synthesia secured its pre-seed funding, including a $1 million round with an investment from Mark Cuban, the company began attracting pilot projects and early customers from large companies. Global brands such as Accenture, McCann Worldgroup, and the NBA’s Dallas Mavericks (owned by Mark Cuban) were experimenting with Synthesia’s platform.

While initially built for dubbing, Synthesia made a strategic pivot toward enterprise video communications, reasoning that dubbing limited it to being a service-based provider. The company's ambition was instead to democratize video production more broadly. Today, Synthesia's mission focuses on “[making] video creation easy for everyone” by overcoming traditional barriers like equipment and actors.

Product

Synthesia is a browser-based platform for creating professional videos without cameras, actors, or editing software. All functionality is available through the Synthesia studio environment.

AI Avatars

Synthesia’s AI avatars replace traditional on-camera presenters by simulating human appearance, speech, and expression using computer-generated models. Synthesia’s deep learning models handle two main tasks: voice synthesis, which produces speech from text in a selected language and tone, and visual rendering, which animates a digital avatar’s facial movements to match the audio.


As of May 2025, Synthesia offers a growing library of over 230 digital avatars that can be used in professional video content. These avatars span a range of ethnicities, ages, and professions, and support branded elements like custom clothing, which can be created through the Avatar Builder.

Users can choose between pre-built Expressive Avatars and custom Personal Avatars, which are trained on consensual footage to replicate a specific person’s appearance and voice. The workflow involves inputting a script and selecting an avatar to produce automatic lip-syncing and visual animation.

According to one Synthesia investor, the company has built an AI applied research team led by Agapito and Niessner, who are both established AI academics, to build proprietary AI models around “AI-generated life-like avatars.” Synthesia also provides proprietary models for users to create and train “a digital twin of themselves that can be featured in videos.” Alongside its own proprietary models, Synthesia uses “third party foundational model technology opportunistically.”

Synthesia trains its proprietary AI video models using a curated set of human performances sourced from three categories: paid actors and models it directly commissions, publicly available content, and licensed media. This training approach is designed to capture the nuanced expressions and movements that make avatar performances appear realistic and engaging. All individuals whose likeness or voice is used must give explicit consent, whether for broadly available stock avatars or customer-specific avatars. Synthesia maintains documented provenance of its training data and ensures that customer data is never used for model pre-training, with any fine-tuning requiring written instruction from the customer.

Text-to-Video Engine

Synthesia’s core engine converts written prompts or scripts into realistic AI videos. Users can start with a script or use the platform’s AI script generator, which generates scripts from text prompts, files, or URLs. The AI script generator follows a three-step process:

Users write the topic of their script in a few lines
Users describe the objective of the video, including the type of video, target audience, and their desired outcomes
Users select a suitable tone of voice to fit the video

Synthesia pairs scripts with 400+ synthetic voices across languages, adjustable for pitch, speed, and emotion. As of May 2025, the system supports over 140 languages and includes instant translation and AI dubbing features. The platform can generate a one-minute video in 9–15 minutes, but has creative constraints compared to traditional filming, including limited voice variation and layout flexibility.


Video Editor

The platform includes a web-based editor modeled after slide presentation tools. It allows users to customize fonts, colors, backgrounds, and other visual elements on a scene-by-scene basis. The editor includes over 250 pre-designed templates spanning training, onboarding, and product marketing. To support brand consistency, the editor includes a built-in brand kit for teams to standardize logos, color palettes, and fonts across projects.


Screen Recorder

Synthesia’s Screen Recorder is a Chrome extension that lets users record their screens in-browser, capturing internal audio, microphone input, or both. Its capabilities extend beyond native screen recorders: users can blur sensitive on-screen content and, after recording, edit their speech by editing the transcript the tool produces.

Voice Cloning

Synthesia’s voice cloning allows users to replicate their own voices. As of May 2025, it can generate a personalized voice model in 10-15 minutes. Cloned voices can be paired with any avatar and speak in 29 different languages.

Collaboration & Compliance

Synthesia’s Live Collaboration feature, launched in February 2024, enables real-time and asynchronous editing. Teams can see each other's cursors and selections, leave comments, manage access permissions, track version history, and organize projects with Workspaces. This improves editing speed and creativity through parallel editing and coordination across time zones.

For compliance, Synthesia is SOC 2 Type II compliant, GDPR-compliant, and enforces stringent consent requirements for developing personal avatars. It is also a launch partner of the Partnership on AI Responsible Practices for Synthetic Media and a member of the Content Authenticity Initiative, which includes Adobe, Nvidia, and Microsoft.

In late 2024, Synthesia’s content moderation systems were tested by the National Institute of Standards and Technology (NIST) and Humane Intelligence. Out of 40 non-consensual deepfake attempts and over 75 harmful content attempts, Synthesia successfully banned all unauthorized avatar creations and 74 harmful content attempts, confirming its approach under NIST’s AI-600-1 risk framework.

Synthesia Academy

Synthesia Academy offers self-paced courses, certifications, and Synthesia-generated tutorial videos, and is open for anyone to sign up. Users can also attend live "Feature Friday" sessions hosted weekly by Synthesia staff and access them afterwards through Synthesia Academy.

Market

Customers

Synthesia primarily targets enterprises and business teams. As of early 2025, its customer base spans over 200K users across over 55K companies, including over 60% of Fortune 100 firms. Large clients like Amazon, Johnson & Johnson, IKEA, and Accenture often start with a specific use and expand to broader usage once the value is proven. Thousands of mid-market and small businesses are also adopting Synthesia. The availability of individual and creator plans makes the tech accessible to startups, educators, and individual content creators. Overall, the most popular use cases are corporate: training, how-tos, onboarding, company updates, and sales pitches.

Market Size

The AI video generation market was estimated at $550 million in 2023 and is projected to reach nearly $2 billion by 2030, a CAGR of roughly 20%. The broader enterprise video market is expected to grow to $31 billion by 2032, providing a large addressable market for AI video generation solutions to capture. Some estimates that account for growth across categories like marketing, education, training, entertainment, and media more broadly suggest that the market for AI video generation could reach $63 billion by 2034.
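
The growth rate quoted for the $550 million to $2 billion projection can be checked with the standard compound annual growth rate formula. The following is a minimal sketch, assuming the "nearly $2 billion" figure is rounded to exactly $2,000 million and the horizon is the seven years from 2023 to 2030. The function name cagr and the variable names are illustrative and do not come from any cited source.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Return the compound annual growth rate as a fraction (0.20 means 20% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market-size figures from the estimate above, in millions of US dollars.
market_2023 = 550.0
market_2030 = 2000.0  # assumption: "nearly $2 billion" rounded to $2,000 million

rate = cagr(market_2023, market_2030, years=2030 - 2023)
print(f"Implied CAGR 2023-2030: {rate:.1%}")
```

Under these assumptions the implied rate comes out at about 20% per year, consistent with the figure in the estimate. A slightly lower 2030 value would pull the rate just under 20%.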

Several factors are driving this expansion. Increasing demand for scalable, cost-effective video content in sectors such as marketing, education, and corporate training is a primary driver. Advances in machine learning and natural language processing have made AI video generators more capable, accessible, and efficient. Additionally, the rise of remote work and the need for personalized digital communication have further accelerated the adoption of AI-driven video solutions.

Competition

As of 2025, the AI video creation market remains early and unconsolidated, with startups racing to capture the space. Synthesia’s core differentiation lies in its studio-grade avatar quality, enterprise safety (e.g., SOC 2, SSO), as well as voice and language support. Synthesia faces pressure on three fronts: (1) AI-native startups building cheaper avatar solutions, (2) broader generative video platforms, and (3) larger adjacent platforms like Canva expanding into AI video tools or AI video editors offering more video generation capabilities. As competitors continue to improve in quality and compliance, Synthesia’s traditional advantages could matter less in purchase decisions. Synthesia’s challenge is to maintain its enterprise focus while keeping pace with user-friendly, fast-moving startups and expanding into new content formats.

Direct Competitors

HeyGen: Founded in 2020, HeyGen is a US-based startup that has gained popularity for its talking avatar videos, often used in marketing and social content. In a June 2024 Series A, the company raised $60 million at a $500 million valuation from investors, including Thrive Capital and BOND, bringing its total funding to $69 million. As of May 2025, HeyGen offers over 120 AI avatars, 300 voices, and 300 templates for various use cases. The platform also provides AI voice clones and integrates with Zapier, appealing to businesses seeking quick, user-friendly AI-generated videos.

Colossyan: Founded in 2020, Colossyan is a European platform focused on AI video for workplace training. As of March 2025, it offers over 150 avatars, supports 70 languages, and emphasizes collaborative team features. The company reached a total funding of $28.2 million, including a $22 million Series A round in 2024 led by Lakestar, with backing from Launchub, Day One Capital, and Emerge Education. In 2024, the company saw a 155% YoY revenue increase. Unlike Synthesia, Colossyan positions itself more narrowly around internal learning & development use cases.

Elai.io: Founded in 2020, Elai.io is a US-based company that was acquired by Panopto in October 2024. Elai.io offers an AI-powered video generation platform that enables users to create videos from text inputs. The platform features customizable digital presenters, supports over 75 languages, 80 avatars, and 450 voices. The product also includes tools like voice cloning and interactive elements. As a direct competitor to Synthesia, Elai.io focuses on corporate learning and development, providing solutions for training, onboarding, and internal communications. While both platforms offer AI-generated avatar videos, Elai.io emphasizes interactive features such as quizzes and branching scenarios, catering specifically to educational and training contexts.

Hour One: Founded in 2019, Hour One is a US-based company that provides AI-generated videos featuring virtual human characters. The company raised a $20 million Series A in April 2022 from Insight Partners and is a direct competitor to Synthesia. Hour One serves businesses looking to produce professional-grade videos with AI avatars for diverse use cases. The company’s customers include Lowe’s, HP, Novartis, Johnson & Johnson, and AstraZeneca. Hour One offers a broad range of virtual characters across over 100 languages and voices, and focuses on integrating AI videos into various business functions, whereas Synthesia primarily targets corporate training and internal communications.

Generative Video Companies

While generative video companies typically focus on higher-volume, more complex video operations, their experience with the core components of video generation makes them potential competitors to Synthesia over time if they choose to adapt their technology for corporate use cases.

Runway: Founded in 2018, Runway is a US-based generative video company that has raised over $540 million in funding, including a $308 million round of funding in April 2025 at a $3 billion valuation from General Atlantic. Other investors include Google, Nvidia, and Salesforce Ventures. Runway develops generative AI models for video creation, including its Gen-4 model, which allows users to generate videos from text prompts or images. The platform is used by creative professionals for tasks like video editing, visual effects, and animation. As opposed to internal and corporate use cases, Runway positions itself in the creative industry, serving filmmakers, content creators, and artists seeking advanced AI tools for multimedia production.

Pika Labs: Founded in 2023, Pika is a US-based generative video company that has raised $135 million in funding, including an $80 million Series B in June 2024. Pika Labs offers an AI video generation platform that transforms text prompts into stylized videos. The platform includes features like "Scene Ingredients," allowing users to build scenes from various visual elements. Pika Labs targets creative professionals and marketers seeking to produce short-form, stylized video content quickly. While Synthesia emphasizes realistic avatar videos for corporate use, Pika Labs focuses on creative expression, offering tools for generating imaginative and stylized videos.

Adjacent Competitors

Typically, these companies are established content creation platforms, whether visual design like Canva, or video editing like Descript or Veed. The existing install base these companies have with creators could represent a competitive threat as they begin to implement additional AI capabilities that could overlap with Synthesia’s offering.

In addition, a number of other adjacent platforms that are tied into the internal workplace communication stack could expand to include generative video tools. For example, Zoom is launching an AI companion that can produce video recaps from video calls. While these recaps are not avatar-based videos, they represent an expansion into the broader category of internal communications.

Canva: Launched in 2013, Canva is an Australian design platform with over 170 million users worldwide as of 2024. Canva’s Magic Media tool generates short clips from text prompts, though outputs are reportedly often generic. Canva raised $200 million at a $40 billion valuation in 2021 led by T. Rowe Price. While not a direct competitor within its core business, Canva’s scale and highly accessible UX make it a long-term threat if it doubles down on video, which could eventually allow it to undercut Synthesia on low-end use cases.

Descript: Founded in 2017, Descript is a US-based AI audio and video editing tool. The company raised a total of $100 million in funding from investors like Spark Capital, a16z, and Redpoint, including a $50 million Series C from OpenAI in October 2022. Descript allows users to edit media by editing text transcriptions. Features include overdub voice cloning and screen recording. While Synthesia focuses on generating videos from text with AI avatars, Descript provides tools for editing existing audio and video content, catering to different stages of content production.

Veed.io: Founded in 2018, Veed.io is a UK-based video editing tool with AI features like avatar generation and voiceover, targeting creators and SMBs. In its 2022 Series A, the company raised $35 million from Sequoia Capital, with total funding reaching approximately $40.7 million. Unverified estimates indicate the company’s valuation sits between $140 million and $210 million. Unlike Synthesia, Veed began as a general-purpose video editor and added AI features later, appealing to a broader audience.

Loom: Founded in 2015, Loom is a video messaging platform designed to facilitate asynchronous communication in professional settings. It was acquired by Atlassian in November 2023 for $975 million. Loom enables users to record and share videos that combine screen captures, webcam footage, and audio narration. Its features include AI-generated titles, summaries, chapters, and task lists, as well as transcription capabilities in over 50 languages.

Loom’s acquisition by Atlassian aimed to integrate Loom's video messaging capabilities with Atlassian's suite of collaboration tools, such as Jira and Confluence, enhancing team communication and productivity. While Loom currently focuses on screen and webcam recordings, its development of AI features, like automated transcription and summarization, suggests potential for expansion into AI-driven video content creation. This trajectory could position Loom as a competitor to platforms like Synthesia.

Business Model

Tiered Subscription SaaS Fees


Synthesia operates a subscription-based SaaS model with a tiered pricing structure. For individuals or small teams, there is a free plan with up to three minutes of video, followed by a self-service Starter plan ($29/month), which includes a set number of video credits. This plan offers 10 minutes of video per month, access to over 125 AI avatars, and basic features like AI video assistance and multiple language options. Higher tiers, such as the Creator ($89/month) plan, offer more video minutes, avatars, and more.

Businesses pay higher subscription fees in exchange for higher volume usage, premium features, and support. Paid corporate plans include advanced capabilities, such as the use of custom avatars, collaboration features, single sign-on, and integration support (like LMS or CMS integrations). Synthesia offers demos and has an enterprise sales team targeting learning & development departments and marketing teams at large companies.

Synthesia also offers an API solution, priced on a consumption basis. Developers can embed Synthesia’s video generation into third-party applications.

Traction

As of January 2025, Synthesia had over 60K customers and was used by over 60% of Fortune 100 companies, including Amazon, Tiffany & Co., IHG Hotels & Resorts, and others. Synthesia’s enterprise offering, combined with its introduction of freemium tiering and high-profile marketing campaigns, drove a steep growth curve: Synthesia grew from over 4K clients in 2021 to over 50K by mid-2023, and around 55K by late 2024.

To support this growth, Synthesia’s team had scaled up to over 400 employees across seven countries as of January 2025. Key hires include Peter Hill, a former Amazon executive, who joined as the new CTO in January 2025. The global office presence indicates a push for international market coverage and access to talent. Despite being London-based, over half of Synthesia’s revenue comes from the US market.

As of May 2025, Synthesia held a 4.4-star rating average on G2 (over 2K ratings), Capterra (300 ratings), and TrustPilot (1.4K ratings). Historically, as enterprise tools expand, maintaining consistently high ratings and customer satisfaction in new verticals and/or geographies becomes difficult as user personas and expectations differ.

Valuation

Synthesia was valued at $2.1 billion post-money in its January 2025 Series D round. This was more than double its valuation from just 18 months prior, when it was valued at $1 billion at its Series C in mid-2023. As a UK-based tech startup, this valuation puts Synthesia in a unique position as one of the most valuable AI startups in Europe. As of May 2025, the company had raised over $330 million in venture funding from investors including Accel, NEA, Kleiner Perkins, Google Ventures, NVIDIA, FirstMark, and others. The company’s $180 million Series D is largely being put towards R&D and global expansion.

Key Opportunities

Enterprise Video Standardization

Synthesia already serves a roster of large-company clients. However, as evidenced by its Electrolux case study, many of those deployments started in one department or use case. Entering larger companies this way creates a big opportunity to land quickly and expand from within. The vision for Synthesia could be ubiquity in the enterprise as organizations become “video-first” and everyone uses the tool to create video messages. Becoming the default standard for enterprise video could dramatically increase the cost of switching away from Synthesia.

Expanding Beyond Avatars

Synthesia’s product has consistently focused on producing avatars speaking against a static background or image. This format is suitable for learning and onboarding, but lacks visual dynamism, making it less applicable for more customer-facing content. Generative video platforms like Runway’s Gen-2 and Pika Labs allow people to generate full scenes from text. Synthesia could incorporate b-roll, motion effects, or camera movement into the templates of its video editor. This would extend potential use cases beyond internal comms into brand campaigns, strengthening its position in a market defined by visuals.

International Expansion

Synthesia’s current customer base is concentrated in North America and Western Europe, though it has begun targeting Asia-Pacific markets like Japan and Australia. Its partnership with World Innovation Lab, a VC firm with strong ties to Japan, underscores this expanding regional focus. Markets like Japan spend heavily on corporate training, with over $3.7 billion spent in 2023. Expanding into underpenetrated, video-forward markets and leveraging Synthesia’s existing multilingual capabilities could significantly increase its enterprise footprint. That said, international expansion will likely require long sales cycles, local relationships, and more customization, which could delay traction in these areas.

Key Risks

Avatar Realism Ceiling

One of the obstacles with generative video is its inability to accurately portray certain features, instead displaying “subtle errors in physics, unnatural movements or inconsistencies that alert viewers that something isn’t quite right.” Viewers can sometimes tell it’s an AI avatar after watching for a bit, due to subtle facial stiffness or a lack of hand gestures. Synthesia’s Expressive Avatars update in April 2024 addressed some of these issues, but the experience falls short of natural realism. If the avatars stay stuck in the “uncanny valley”, Synthesia risks being limited to utilitarian content, while competitors that are quickly advancing more lifelike AI presenters could win higher-fidelity use cases.

Commoditization

One potential risk for Synthesia is the commoditization of generative video capabilities, as open-access models or larger, better-funded generative AI companies provide similar capabilities, potentially at a lower cost. Though enterprise customers are generally quite sticky, with a 4.1% enterprise SaaS churn rate, even Fortune 100 clients could move over if significantly better or cheaper options emerge.

As more competitors move to offer videos that are “good enough” for enterprise use, it could become harder for Synthesia to justify its pricing. If the quality gap shrinks, companies may prioritize lower costs over product differences, eroding Synthesia’s pricing power over time. Staying focused on internal communications, where quality is more easily traded off for cost, could make this risk more acute. By expanding into customer-facing content and external communications, Synthesia could potentially expand its addressable market and dilute this risk.

Summary

Synthesia is a provider of AI-generated video for enterprise use, offering a browser-based platform optimized for training, onboarding, and internal communications. Its core value lies in reducing production time and cost by replacing traditional video workflows with synthetic avatars and text-to-video tools. The company’s customer base has grown to 60K, including 60% of Fortune 100 companies. The company also faces increasing competition from startups and adjacent platforms offering potentially lower-cost or more flexible video tools, while limitations in avatar realism may restrict its expansion into higher-fidelity or customer-facing use cases. Sustaining growth will likely depend on Synthesia’s ability to maintain enterprise-grade trust and safety, expand its product scope beyond avatars, and deepen its international presence.
