CB Insights Generative AI Bible
The ultimate guide
to genAI disruption
Generative AI is a theme CB Insights covered before the
hype. We've consistently been ahead of the curve
And we're staying ahead today – so you can too
2023
Stay on top of the landscape with research & transcripts
Understand 25+ generative AI markets:
• Large language model developers
• Customer support operations
• Text generation
• Protein & drug design
• Voice synthesis & cloning
• + more
CB Insights helps the world’s leading companies understand everything they need to know about disruptive technologies — find out more about why our customers love us here.
The generative AI boom
“Two ways… Gradually and then suddenly”
ERNEST HEMINGWAY, THE SUN ALSO RISES
Gradually
Generative AI has been in the works for years
How did we get here? A recent timeline of select events in the development of generative AI
• 2014: Generative adversarial networks (GANs) introduced by Ian Goodfellow
• 2016: WaveNet and audio generation introduced by DeepMind
• 2017: New neural network architecture called the “Transformer” introduced by Google researchers
• 2018: Google AI releases BERT, a leap in the ability of machines to understand context in language
• 2019: OpenAI releases GPT-2, gaining attention for text generation capabilities
• OpenAI releases GPT-3, accelerating interest in language models
• “Deepfakes” become widely known
• OpenAI releases text-to-image model DALL-E
• Text-to-image models from Google, Midjourney, Stability AI, and OpenAI proliferate
• OpenAI launches GPT-3.5-based chatbot ChatGPT, unleashing genAI boom
*Generative AI is artificial intelligence that can generate new content (text, code, images, audio, etc.).
GANs tap into the idea of “AI versus AI” — advancing
image generation dramatically
Images from 2018 paper where DeepMind researchers trained GANs on a large-scale dataset to create “BigGANs”
Image source: Large Scale GAN Training for High Fidelity Natural Image Synthesis
*“AI versus AI”: A breakthrough where two neural networks try to outsmart each other, creating and refining synthetic outputs.
WaveNet produces synthetic audio, showcasing the
potential of generative models beyond images
The AI language model predicts a word based on not only the preceding words, but
also the succeeding ones (bidirectional understanding of context).
BERT is deeply bidirectional, OpenAI GPT is unidirectional, and ELMo is shallowly bidirectional.
Then suddenly
GenAI goes from experiment to everywhere
Models get bigger…
Image source: Imagen (Google), Midjourney, DALL-E 1 vs. DALL-E 2 (OpenAI), Stable Diffusion
Generative AI startups raise major funding to fuel growth
Deals worth $100M+ to generative AI startups in 2022
ChatGPT 5 days
Spotify 5 months
Dropbox 7 months
Facebook 10 months
Twitter 2 years
Image source: Microsoft
*A large language model is a deep learning algorithm that analyzes and produces text by learning from extensive language data.
Meta introduces Llama, an open-source language model
Chegg attributes declining growth to ChatGPT
[Chart: current market cap vs. all-time high, $0.0T to $3.5T]
*GPUs = graphics processing units, which are used to run intensive AI applications.
Stack Overflow sees declining traffic and lays off
employees amid AI coding boom
Hundreds of startups pile into genAI
Commercial genAI applications are proliferating
Generative AI Market Map
• Generative AI infrastructure (foundational models, vector databases, etc.)
As investors look to ride the generative AI wave,
funding soars in 2023
Disclosed equity funding & deals (as of 9/30/2023)
[Chart: disclosed equity funding and deal count. Labeled funding values: $0.7B, $1.7B, $3.2B, $5.3B, $17.4B. Labeled deal counts: 64, 103, 154, 168, 170.]
• Generative AI infrastructure: $11.6B, 30 deals
• Cross-industry generative applications: $5.5B, 129 deals
• Industry-specific generative applications: $0.8B, 48 deals
[Chart: Mega-rounds]
2 $2.2B Canada
4 $1.4B Israel
6 $1.0B China
Source: CB Insights
Out of the 16 new AI unicorns in 2023 so far, 11 are
genAI companies
New AI unicorns ($1B+ valuation)
[Chart: AI unicorns and genAI unicorns by quarter, Q1 2020 to Q3 2023]
Generative AI is a new battleground for big tech, with
overlapping alliances and commitments into the billions
Generative AI companies with two or more big tech investors (as of 9/30/2023)
Companies:
• Hugging Face
• Adept
• AI21 Labs
• Anthropic
• Inflection AI
• Inworld AI
• OpenAI **
• Runway
• Synthesia
• Typeface
Rank | Company | Amount | Date | Round | Valuation | Investors | Country
2 | Inflection AI | $1.3B | 2023-06-29 | Series B | $4.0B | Microsoft, Nvidia, Gates Frontier | United States
3 | Anthropic | $1.25B | 2023-09-25 | Corporate Minority - V | N/A | Amazon | United States
5 | Anthropic | $400M | 2023-02-03 | Corporate Minority | $4.1B | Google | United States
9 | Hugging Face | $235M | 2023-08-23 | Series D | $4.5B | Amazon, Google Ventures, Salesforce Ventures, AMD, IBM Ventures, Intel Capital, Qualcomm Ventures, NVentures | France
10 | Imbue | $200M | 2023-09-07 | Series B | $1.0B | Nvidia, Astera Institute | United States
Source: CB Insights — Advanced Search - Deals
*Includes deals backed by Microsoft’s venture arm M12
...betting that generative AI could tip the competitive
scales in its favor for decades to come
Microsoft Investment Thesis Map – Generative AI
Source: CB Insights — Analyzing Microsoft’s generative AI strategy: How Microsoft is expanding past OpenAI to transform the way we work
Microsoft’s investments in genAI help reverse slowing
Azure growth and contribute to 3 percentage point bump
Azure and other cloud services revenue growth (year-over-year) by quarter
FY 2020: Q1 59%, Q2 62%, Q3 59%, Q4 47%
FY 2021: Q1 48%, Q2 50%, Q3 50%, Q4 51%
FY 2022: Q1 50%, Q2 46%, Q3 46%, Q4 40%
FY 2023: Q1 35%, Q2 31%, Q3 27%, Q4 26%
FY 2024: Q1 29%
Source: Microsoft
Alongside its Bard chatbot, Google puts billions toward
internal AI research & a range of AI startups
Google-backed equity deals to generative AI startups (as of 10/27/2023)
Source: CB Insights — Advanced Search - Deals
*Includes investments by Google, Google Ventures, and Gradient Ventures
Amazon and Google commit billions to LLM developer
Anthropic in battle with OpenAI-backer Microsoft
Amazon launches $100M generative AI accelerator, looking to feed its cloud computing business
AWS Generative AI Accelerator Investment Thesis Map
Source: CB Insights — Where the AWS Generative AI accelerator is placing its bets across 7 industries
So where is genAI headed?
Generative AI will soon be impossible to ignore as disruptive applications spread
1. Race to dominate genAI infrastructure
Highest-valued genAI unicorns compete primarily at the
infrastructure layer
Most highly valued private generative AI companies (as of 09/30/2023)
Company 1 $29.0B
Company 2 $7.3B
Company 3 $4.5B
Company 4 $4.1B
Company 5 $4.0B
Company 6 $2.1B
Company 7 $1.8B
Company 8 $1.5B
Company 9 $1.5B
Company 10 $1.4B
Source: CB Insights — Generative AI — large language model developers market report - scorecard
Executive interest in AI has surged
$100K - $500K
$20K - $2M
$25K - $300K
$50K - $100K
$15K - $800K
$40K - $5M
Source: CB Insights — Software buyer interview transcripts and Analyst Briefing data
…but reducing costs & time to train are key priorities
Mosaic ML offers what's called programmatic optimization, which is not so much on the hardware side of things, but rather on the algorithmic side. Can you find ways of optimizing the time it takes to get to a certain performance bar? I think that's really what drove us to evaluate MosaicML. In fact, it was pretty much the main tool out there right now that offers this. I don't think there's any other tool that really has this programmatic optimization layer.

One of the things that we're really trying to do is reduce the cost for a lot of our large language models and training…What we liked about Mosaic was it is a lot less expensive in terms of the training models…Right now, I would project, just based on our usage, I think the initial spend was $15,000 per annum. I would expect next year to probably be $200,000, $250,000. We're a very large organization and there's been a lot of interest in MosaicML.

Source: CB Insights — Cohere software buyer interview transcript; Anthropic software buyer interview transcript
Concern around AI
risks puts responsible
AI solutions in the
spotlight
These tools help
enterprises build and
deploy AI in an ethical
$1,300M
Source: CB Insights — Figures represent the latest disclosed revenue (based on company discussion or media sources). All revenues are full-year 2023 ARR projections, except for MosaicML (reported ARR at time of acquisition in June 2023).
Companies need to grow into big valuations and will come
under pressure to build real business models
Revenue multiples (as of 10/18/2023)
• Company A: 113x
• Company B: 100x
• Company C: 65x
• Company D: 28x
• Company E: 22x
• Company F: 21x
Source: CB Insights — Figures represent the latest disclosed valuation divided by revenue (based on company discussion or media sources). All revenues are full-year 2023 ARR projections, except for MosaicML (reported ARR at time of acquisition in June 2023).
There’s no winner yet in foundational models
Strengths, I would say, the ethical considerations of privacy and bias, fairness…their model outperformed the other models, including GPT-3 and ChatGPT…In terms of weaknesses, the specificity of the model output and the interestingness of the model output… I think that other weakness also was in terms of speed and efficiency, like latency, and once you ask a question, how long does it take to fully respond.
Senior Manager of Data Science, Model-as-a-service platform

We were considering obviously OpenAI, Cohere, Anthropic, and deepset…To be honest, the reason that we actually chose AI21 is because the interface is super easy to use for non-tech people…I actually tested Cohere pretty extensively…I think that their models are really strong and also for some of the creative work, I thought they were doing slightly even better than even OpenAI when it comes to creative stuff, like ad copy and marketing related tasks.
Head of AI, $100M+ funded technology startup

The first two things that came to my mind is, of course, OpenAI is still the market leader. That gives us the sense of comfort and we are confident on this market-leading product…On the flip side of the open-source platforms that I just mentioned, the Hugging Face and the Llama 2, we don't have much faith and information about how they are going to deal with our data. That's the key to the enterprise world. If we are not certain, then I would rather pass my roles to OpenAI.
Cloud, Data & AI Lead, Fortune 500 company
Open-source AI
movement gains steam
Need for AI model transparency and rapid innovation
is fueling the open-source AI movement
News mentions of open-source AI and related terms (as of 9/30/2023)
Notably, we recently announced a collaboration with Meta on Llama 2-based AI implementations on flagship smartphones and PCs that will enable developers to create new and exciting genAI applications using the AI capabilities of Snapdragon platforms beginning in 2024.
Qualcomm CEO Cristiano Amon, Q3’23 earnings call

Azure AI is ushering in new born-in-the-cloud, AI-first workloads with the best selection of frontier and open models, including Meta's recent announcements supporting Llama on Azure and Windows, as well as OpenAI.
Microsoft CEO Satya Nadella, Q3’23 earnings call

AI is the flavor of the day. And thanks to ChatGPT’s great launch, everyone has discovered this potential. We can expect new changes by the day in the tech world. Just take yesterday, Meta’s launch of Llama 2, which will be available on Azure free of charge, including for commercial purposes.
Publicis CEO Arthur Sadoun, Q3’23 earnings call
*Some developers may offer open-source versions of their models but keep their core models proprietary
*Excludes open-source developers that have not raised equity funding
OpenAI’s developer API and the developer experience is definitely the best. It's really managed, super clean APIs, well-documented, and has the most integrations. There was a big push toward using open-source models and then fine-tuning them with your own data. That's still a big thing. We're actually evaluating that for some of the public data set stuff because it's a lot cheaper, especially if you add in more huge amounts of data versus a giant OpenAI model.
Partner, Early-stage VC firm

Meta's invention… it may be cheaper than the OpenAI [model] because it's open-source. Then we believe the performance of the Hugging Face and also the Llama 2 is also comparable to the OpenAI [model]. Maybe just a little bit weaker than that, but maybe the overall…ROI is quite a good deal.
VP, Machine Learning, Fortune 500 company
$20M in FY’22 revenue (65x multiple)
Source: CB Insights — Databricks acquired Mosaic ML for $1.3B. How do the valuations of other generative AI companies compare?
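The 65x figure can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the multiple is the $1.3B acquisition price divided by the $20M revenue figure shown on this slide, and the function name `revenue_multiple` is hypothetical, not part of any CB Insights tool.

```python
def revenue_multiple(valuation_usd: float, revenue_usd: float) -> float:
    """Return valuation divided by revenue (a revenue multiple)."""
    if revenue_usd <= 0:
        raise ValueError("revenue must be positive")
    return valuation_usd / revenue_usd


# Figures from the slide: $1.3B acquisition price, $20M revenue.
mosaicml_multiple = revenue_multiple(1.3e9, 20e6)
print(f"{mosaicml_multiple:.0f}x")  # prints "65x"
```

The same calculation applies to any company on the revenue-multiples chart once both a disclosed valuation and a revenue figure are available.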
Growing number of vendors are developing open-source tools to help enterprises build and deploy AI projects
Open-source AI development market map
Tech vendors supporting LLM operations are gaining traction with enterprises
LLMOps market map
Source: CB Insights — The large language model operations (LLMOps) market map
*LLMOps refers to the end-to-end workflow that organizations employ to build, fine-tune, and deploy LLMs into production.
Execs are buying infrastructure tools for better training
data, observing performance of models, and more
We had, basically, messy data and we needed a better way of providing, of doing better training data for generative content... We have a lot of machine learning models that are fueling us here. In both cases, it was the desire to have, basically, a higher level of data hygiene in our training data.
Chief Product Officer, IT company

I led a small data science team that created a lot of models in production. As we scaled, our observability of our machine learning models in production was limited, and we felt blind to issues or ways to improve the model once in production. We tried to build a solution in-house, which showed the difficulty of the challenge.
Senior Manager, $10M+ funded data analytics platform

Source: CB Insights — Snorkel AI software buyer interview transcript, Fiddler AI software buyer interview transcript
New vendors are emerging for LLM fine-tuning &
customization
Leading LLM application development vendors by disclosed equity funding (as of 10/30/2023)
[Chart: funding and deals by vendor. Labeled funding values: $176M, $109M, $44M, $10M.]
Execs are demanding their tech vendors keep up with
genAI advances and opportunities
VP, Technology at Publicly traded e-commerce company
Senior Design Engineer at Fortune 500 company
Source: CB Insights — Sourcegraph software buyer interview transcript; SlashNext software buyer interview transcript
Generative interfaces, like Anthropic’s AI assistant Claude,
lead in funding among cross-industry tools
Distribution of generative AI funding, Q3’22 — Q2’23
Source: CB Insights — Mutiny company profile - headcount; Jasper software buyer interview transcript
Watch for vendors to scramble to build defensible
moats in specialized areas
Source: CB Insights — Generative AI — legal case search & summarization; Virtual medical scribes & summarization tools
3. Opportunity in vertical genAI
How generative AI is going to be used to…

Healthcare & life sciences
Drive growth:
• Copilots for doctors automate tedious tasks & improve EHR documentation
• De-noise radiology scans
Improve customer experience:
• AI companions address well-being & mental health
• Synthetic patient data protects patient privacy
Reduce costs & risk:
• GenAI drug discovery & design reduces time-to-market
• Biomedical NLP supports clinical decision-making

Financial services & insurance
Drive growth:
• GenAI assistants analyze & synthesize financial data at scale
• Automated underwriting decisions
Improve customer experience:
• GenAI chatbots simplify day-to-day financial tasks
• Personalized interactions in insurance sales process
Reduce costs & risk:
• Synthetic training data improves financial models & ensures compliance
• Pattern identification in unstructured claims filings to minimize losses

Retail
Drive growth:
• LLM-powered search improves conversion
Improve customer experience:
• Smarter, more relevant search
• Personalized avatars
Reduce costs & risk:
• GenAI automates product catalogs
• Synthetic humans save on model costs
Healthcare & life sciences
Source: CB Insights — Virtual scribes & summarization tools market report – ESP, Generative AI copilots for doctors have raised more than $240M
Source: CB Insights — Corti Analyst Briefing; Virtual scribes & summarization tools market report
VP, Machine Learning, Fortune 500 company
Sr. Research Engineer, Fortune 500 company
Source: CB Insights — OpenAI software buyer interview transcript, Cohere software buyer interview transcript
Advantage is with, for example, John Snow Labs, it's a very sort
of clinically trained model... It's not just trained on wiki pages or
like general text. In that sense, I think it's much better…in terms
of entity recognition and things like that.
But the limitations, I would think, are these models are not
getting trained on the volume of data anywhere as close to
what ChatGPT is trained on… It [OpenAI deployment] was pretty
minimal overhead… the goal… is to essentially enable the use
of ML tools that are available from the Azure subscription at
the Enterprise level…
Financial services & insurance
In financial services,
generative AI will
automate tasks and
transform how
organizations
use financial data
Source: CB Insights — business relationships and news mentions; company websites and press releases
Source: CB Insights — Kasisto Analyst Briefing; Cognaize software buyer interview transcript
Retail
Source: Company announcements; CB Insights — AX Semantics software buyer interview transcript
Source: CB Insights — Generative AI – e-commerce search market report; Algolia software buyer interview transcript
Source: CB Insights — Generative AI – synthetic humans & fashion design market report
We identified the 50
most promising
genAI startups
Generative AI 50