In 2014, the largest tech companies spent $44 billion on capital investments. By 2024, that number passed $200 billion — almost all of it tied to AI. This isn’t an innovation budget. This is an arms race.
We follow the money to understand who’s investing, where it’s going, and what the economics of AI infrastructure really look like. From hyperscaler CapEx to the GPU supply chain, the financial signals tell a clear story about where AI is heading.
Episode 2 digs into the financial engine powering the AI revolution. The scale of investment is historically unprecedented — and it raises critical questions about sustainability, returns, and who ultimately controls the infrastructure of intelligence.
The biggest story in AI economics isn’t the models — it’s the infrastructure spend. The largest technology companies have collectively increased their capital expenditures from $44 billion in 2014 to over $200 billion by 2024. Nearly all of that incremental spending is AI-related: data centers, GPUs, networking, and cooling. This is the largest infrastructure buildout since the fiber-optic boom of the late 1990s.
Demand for AI accelerators — particularly NVIDIA GPUs — continues to massively outpace supply. This has created a cascading set of effects: companies are hoarding chips, delivery timelines stretch to months or quarters, and the GPU supply chain has become a strategic asset. We explore how this bottleneck shapes who can and can’t compete in AI.
AWS, Microsoft Azure, and Google Cloud have emerged as the de facto infrastructure layer for AI. Their massive scale advantages in compute, data, and distribution mean that most AI workloads — from training to inference — run on their platforms. We discuss what this concentration of power means for the industry.
For all the investment, revenue from AI applications is still catching up. Enterprise adoption is growing, but many organizations are still in pilot mode. AI startups are raising record rounds, yet many lack proven revenue models. We examine the gap between investment and returns — and when (or whether) the economics will balance out.
In 2014, the largest tech companies spent $44B on capex; by 2024 that number exceeded $200B — almost all tied to AI.
GPU demand continues to outstrip supply, creating bottlenecks and reshaping the semiconductor industry.
Cloud hyperscalers (AWS, Azure, GCP) have become the new AI infrastructure layer, concentrating power and investment.
Revenue is still lagging behind investment — the ROI question remains the elephant in the room for enterprise AI.
AI startups are raising record amounts of capital, but sustainable revenue models are still maturing.
Manav Gupta [00:01]
In 2014, the largest technology companies in the world spent about $44 billion a year on capital investments. By 2024, that number passed well over $200 billion. And almost all of the increase had one name attached to it. You guessed it: AI. This isn't incremental spending. This is not an innovation budget.
Manav Gupta [00:23]
This is the largest concentrated investment of capital in corporate history. Seven companies, the so-called Magnificent Seven, are now spending more on AI infrastructure than the entire global energy sector. They’re pouring money into chips, into data centers, custom silicon, heck, even nuclear plants. And this is before the business models are proven, before margins stabilize, and before regulators catch up. And here’s the uncomfortable question no one likes to ask. What if it doesn’t pay off?
Manav Gupta [00:58]
In this episode we are going to follow the money. Not the hype, not the demos, the actual dollars. Where are they going? Why are they going there? And what has to be true for this multi-trillion-dollar gambit to work? Because when capital moves this fast and this aggressively, it's not about optimism.
Manav Gupta [01:15]
It’s about pressure and pressure always reveals the real story. Welcome to the podcast. This is Ship AI episode two, follow the money. Let’s go! All right, let’s start by looking at the big bet. So let’s start with this picture.
Manav Gupta [01:38]
So the US macro-level data shows that AI-related investments have surged in the last few years. You can see that squiggly graph at the top: capital investment into the technology sector (computers, communications equipment, and semiconductors) far outpaces anything else since 2020. There was a little blip and then of course it took off. The energy sector, by contrast, has barely grown, and the rest of the non-energy economy, excluding telecoms, has remained largely flat. This is not a normal tech cycle.
Manav Gupta [02:13]
This is industrial-scale reallocation of capital. AI is no longer riding on top of the economy. It is the fastest growing sector inside it. Some have argued it's a sector unto itself. When the production curves diverge like this, innovation stops being optional. It becomes structural.
Manav Gupta [02:37]
It is part of the cycle. This is the physical foundation that has made AI possible, that has made all those data-hungry, planet-scale models possible. Here's another graphic just to substantiate the previous point. Look at how data centers are bucking the trend in private non-residential structures. In the 12 months ending in June 2025, data center construction soared 28%, and it is up 331% from 2021.
Manav Gupta [03:05]
In stark comparison, private sector spending on overall non-residential construction fell 4% over the course of the last 12 months. That should tell you that of all the capital being invested, the bulk is going into building these data centers. Let's go deeper. The Magnificent 7 that we talked about previously: Amazon, Apple, Google, Microsoft, Meta, NVIDIA, and Tesla. Their capital investment grew 63% year-on-year in 2024 alone, and 15% of their revenues are going towards capital expenditure.
Manav Gupta [03:47]
This is infrastructure spending at a scale we have not seen since IBM in the mainframe era in 1969. Let's go deeper. This chart from JP Morgan clearly shows the divergence. The Magnificent 7's CapEx plus R&D has grown 6x, not 6%, not 60%, 6x since 2020. The rest of the large-cap market is essentially flat, as you can see from that squiggly yellow line. In fact, the Magnificent 7's capital investment now exceeds that of the entire energy sector.
Manav Gupta [04:27]
Again, this is complete corporate capital reallocation. Another way of interpreting this chart is that AI progress is being decided by who can spend. That's the critical element. What these AI companies are now doing is building an economic moat around the models. Progress is going to be decided by who can spend on building the data centers, acquiring the energy, acquiring the data, acquiring the GPUs to build the models, not just by who can invent. Again, let's go even deeper.
Manav Gupta [05:04]
Let's put this in historical context. NVIDIA's data center revenues as a share of market-wide capital spending are approaching 15%. That is the level of IBM's peak revenues in 1969, the peak of the mainframe era, and of the combined peak revenues of Cisco, Lucent, and Nortel in 2000. So the first chart on this page asks the question: is today's AI spending unprecedented? Of course, the answer is yes.
Manav Gupta [05:35]
AI data center spending now represents a larger share of the total capital market than IBM at its peak. Right? We've already established that. Now let's look at where the money is being spent. As I said previously, across 2024 total CapEx exceeded $200 billion in a single year. What's interesting to note is that half of that money goes directly to GPUs and specialized chips.
Manav Gupta [06:00]
And here is the critical imbalance. Companies are spending more than twice as much, 2.25x, on training the models as on running them. Training of these hyperscale, planet-scale models is a one-time cost. Inference is what scales with revenue. And for AI to become sustainable and profitable, this ratio has to flip.
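The "flip" above can be sketched in a few lines. The $27B/$12B split and the 2.25x ratio are the episode's figures; the growth rates below are purely illustrative assumptions, not forecasts.

```python
# Back-of-the-envelope: how long until annual inference spend exceeds
# annual training spend? CapEx figures are the episode's quoted split;
# the 10%/60% growth rates are illustrative assumptions.

def years_until_flip(training, inference, train_growth, infer_growth):
    """Years until annual inference spend exceeds annual training spend."""
    years = 0
    while inference <= training:
        training *= 1 + train_growth
        inference *= 1 + infer_growth
        years += 1
    return years

train_spend = 27.0   # $B, quoted training CapEx
infer_spend = 12.0   # $B, quoted inference CapEx
print(train_spend / infer_spend)  # 2.25, the ratio quoted in the episode
print(years_until_flip(train_spend, infer_spend, 0.10, 0.60))  # 3
```

Under those assumed growth rates, the ratio flips in about three years; faster inference growth pulls the break-even in.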
Manav Gupta [06:28]
Here's another view of where the VC funding is happening, where the seed funding is going. This is a chart from Y Combinator mapping out the startups, with AI in dark blue versus all others. And you can see that since the launch of ChatGPT in 2022, we are beginning to see a greater and greater concentration of investment in the AI sector alone. Now, it's not just about the investments. There is also revenue velocity that these companies are seeing, by the way. And then I'll show you yet another view of some of the investments being made, in the form of a circular loop.
Manav Gupta [07:08]
But before we get there, let's look at the economics. The fastest revenue ramp in software history has been exhibited in the last three years by OpenAI and Anthropic. OpenAI grew as much as 65x, 65 times, in a span of three years. They went from approximately $200 million in annual recurring revenue in 2023 to a projected $13 billion by August of 2025. Anthropic grew by 80 times, going from $87 million in ARR to as much as $7 billion projected for 2025. In comparison, Salesforce took 10-plus years to reach $10 billion in ARR.
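The multiples quoted here are easy to sanity-check. ARR figures are the ones cited in the episode; treating the OpenAI ramp as roughly a two-year window (2023 to mid-2025) is an approximation on my part.

```python
# Sanity-checking the revenue-ramp multiples quoted above.

def growth_multiple(start, end):
    """End-over-start revenue multiple."""
    return end / start

def cagr(start, end, years):
    """Compound annual growth rate over `years` years."""
    return (end / start) ** (1 / years) - 1

print(round(growth_multiple(0.2, 13.0)))    # 65  ($0.2B -> $13B, OpenAI)
print(round(growth_multiple(0.087, 7.0)))   # 80  ($87M -> $7B, Anthropic)
# Treating the OpenAI ramp as ~2 years (an approximation):
print(round(cagr(0.2, 13.0, 2) * 100))      # ~706% per year
```

Even read generously, these are growth rates no prior software franchise, Salesforce included, has come close to.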
Manav Gupta [07:51]
AI companies are hitting that in about 36 months. So, we talked about investments, we certainly talked about revenue velocity, but let's also understand how these companies are coming up with newer and newer ways of finding revenue. Okay, I'm going to show you a visualization that hopefully makes that point a bit more clear. On your screen, what you see is two companies right now: OpenAI and NVIDIA. OpenAI hit a $500 billion valuation in October 2025.
Manav Gupta [08:39]
At this point, it became the world's most valuable private company, surpassing SpaceX. Reportedly, they are now seeking an $830 billion valuation in their next round. The other big bubble that you see in green is NVIDIA, with a market cap of $4.5 trillion, with a T, as of January 2026. They briefly hit $5 trillion in October of 2025, months after the DeepSeek news knocked $600 billion off their market cap in a single day. We'll talk about DeepSeek separately.
Manav Gupta [09:11]
These two companies sit at the center of an intricate web where capital flows back and forth as revenue. And I'll show you how. So let's begin. Here's step one. Microsoft takes a 27 to 30% stake in OpenAI, because, you know, every funding round keeps diluting their stake. The total investment they've made since 2019 is about $13.5 billion.
Manav Gupta [09:34]
Here is where the loop comes in. The money that Microsoft invests into OpenAI comes back: OpenAI spends 80% of its training and inference compute on Microsoft Azure Cloud. What does that mean? Microsoft invests billions into OpenAI.
Manav Gupta [10:03]
It not only gets that money back as cloud revenue; its stake also appreciates as OpenAI's valuation increases. So the money, once again, goes in circles. Let's now talk about the NVIDIA and OpenAI loop. In September of 2025, NVIDIA announced a $100 billion commitment to OpenAI. In return, OpenAI agreed to deploy 10 gigawatts' worth of NVIDIA GPUs in their data centers. So again, the capital goes in, revenue comes out.
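The "money goes in circles" dynamic can be modeled very crudely. All numbers below are illustrative assumptions, not disclosed figures: an investor puts capital into a startup, and the startup spends some fraction of that capital back on the investor's own cloud or chips.

```python
# Minimal sketch of vendor-financed circular revenue. Illustrative only:
# `spend_back` is an assumed fraction of invested capital that returns
# to the investor as cloud/chip revenue each cycle.

def round_trip_revenue(investment, spend_back, rounds):
    """Total revenue the investor books as its own capital cycles back."""
    revenue, capital = 0.0, investment
    for _ in range(rounds):
        returned = capital * spend_back
        revenue += returned
        capital -= returned  # remainder funds the startup's other costs
    return revenue

# e.g. $13.5B invested, 80% of compute spend landing on the investor's cloud
print(round(round_trip_revenue(13.5, 0.8, 1), 2))  # 10.8 ($B back in one cycle)
print(round(round_trip_revenue(13.5, 0.8, 3), 2))  # 13.39 over three cycles
```

The point the sketch makes: at a high enough spend-back fraction, most of the "investment" re-materializes as the investor's own revenue, which is exactly why observers call it circular.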
Manav Gupta [10:44]
NVIDIA gets the return back as revenue. And of course, as OpenAI's valuation increases, the value of that $100 billion commitment increases as well. So the same scenario is happening here. In the next step, you see a tripartite agreement. Now comes Oracle into the mix: a three-party agreement between OpenAI, Oracle, and NVIDIA.
Manav Gupta [11:08]
In January of 2025, Project Stargate was announced at the White House. It is a $500 billion commitment over four years, the largest AI project ever. OpenAI then signs a $300 billion deal with Oracle for cloud infrastructure. Oracle in turn commits $40 billion to NVIDIA GPUs for their Abilene, Texas data center. That's basically 400,000 NVIDIA GPUs on a 15-year lease. So the triangle then closes.
Manav Gupta [11:47]
Money flows from OpenAI to Oracle, and from Oracle back to NVIDIA. That's the third loop. Then AMD joins the game. In October of 2025, AMD announced a deal with OpenAI. AMD offers OpenAI warrants for 160 million shares, about 10% of AMD, at a strike price of one cent per share. In exchange, OpenAI commits to six gigawatts of AMD GPU deployment.
Manav Gupta [12:18]
Of course, the scenario here is that the models OpenAI is building and will build need so many GPUs that NVIDIA alone may not be able to fulfill that demand. So they are diversifying their supply, going to another chip manufacturer, AMD in this scenario. Now, because of this commitment from OpenAI to buy six gigawatts' worth of AMD GPUs, AMD expects $100 billion in revenue over four years from this deal. As soon as the deal was announced, AMD's stock jumped 35%, adding $80 billion to its market cap.
Manav Gupta [12:59]
Even NVIDIA's competitor is now in the circular economy. Everyone's buying equity in their own customers. That's another loop. Now, not to be outdone, here come Amazon and Google with Anthropic. Amazon and Google both decide to invest in Anthropic. The idea is that even though Google is training their own Gemini series of AI models, they are hedging their bets.
Manav Gupta [13:29]
So Google now owns 14% of Anthropic, as revealed in court filings. AWS, meanwhile, has become Anthropic's primary cloud provider. Google then gave Anthropic access to 1 million TPUs, an undisclosed amount, but tens of billions of dollars in revenue. So now in this scenario we have two potential competitors, Google and Amazon. They're both getting cloud business in return, right? And they are both investing in Anthropic.
Manav Gupta [14:09]
Anthropic in return is using cloud services from both providers. So they're both getting their money back in terms of the services being consumed. That's one layer of return. Plus, as the valuation of Anthropic increases, the return on their investment increases as well. You can think of this as a pincer attack on OpenAI, but it's yet another form of circular funding. Next comes the startup ecosystem.
Manav Gupta [14:39]
So now you have NVIDIA investing into xAI, Mistral, and Figure AI, and OpenAI investing into smaller startups such as Harvey, Cursor, and Ambience. So what's the idea here? NVIDIA's VC arm is everywhere. xAI is up from a valuation of $2 billion to as much as $230 billion as of January 2026. For Mistral, NVIDIA participated in a €1.7 billion funding round.
Manav Gupta [15:05]
Figure AI: a $50 million seed round, and it's now valued at $39 billion. Each one of these startups in turn is going to buy NVIDIA chips. The OpenAI Startup Fund, in turn, has deployed $175 million across 37-plus companies such as Harvey. And to be clear, OpenAI invested only $5 million in seed funding in Harvey AI in 2022. It's now at a $2 billion valuation. Then there's Anysphere, known for its product Cursor.
Manav Gupta [15:41]
OpenAI invested $8 million in its seed round; it's now worth $29 billion. All of them are using OpenAI's APIs. So you can think of this as downstream investment. All of these startups got funding from OpenAI, and OpenAI's stake increases as the valuation of those companies increases.
Manav Gupta [16:02]
But OpenAI also gets money back from those companies, because they are using its APIs to build their business models. Let's talk about the final step. I jokingly call this the house of cards, but here is the full network, with some concentration stats. Really, at the heart of it all, just eight or nine companies control the entire AI ecosystem. And there is real concentration risk here, by the way.
Manav Gupta [16:34]
53% of NVIDIA's data center revenue comes from just three mystery clients that they don't disclose. In fact, two of their customers are responsible for 39% of their quarterly revenue. If any one of them stumbles, the world's most valuable company is in trouble. There is also the S&P 500's dependency on the Magnificent 7: they have driven 42% of returns in 2025, and the tech sector they represent accounts for 63% of total gains in the S&P 500. So we can call this a circular economy.
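The 53%/39% figures are quoted in the episode; the individual client splits are not disclosed. A quick check that the two stats are mutually consistent, using one hypothetical split that satisfies both:

```python
# Customer-concentration arithmetic. The 53% (top 3) and 39% (top 2)
# shares are the episode's figures; the per-client split is hypothetical.

def top_n_share(shares, n):
    """Combined revenue share of the n largest customers."""
    return sum(sorted(shares, reverse=True)[:n])

client_shares = [0.22, 0.17, 0.14]  # hypothetical individual splits
print(round(top_n_share(client_shares, 3), 2))  # 0.53
print(round(top_n_share(client_shares, 2), 2))  # 0.39
```

Any split where the top two sum to 39% and the third adds 14% reproduces both quoted stats; the exact breakdown remains undisclosed.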
Manav Gupta [17:19]
Some people have gone as far as to say that there is concentration risk, potentially the b-word, a bubble. But there are certainly enough stressors here to give us pause, a sense that there is something going on beyond the natural course of monetizing services. Okay, let's take a step back. We talked about the investments being made, the new economies and models emerging. We talked about revenue velocity. But are there any projections of how big this market is?
Manav Gupta [17:51]
How big could this market potentially be? Not to be left behind, firms such as Morgan Stanley have come up with some pretty bold assertions about the total addressable market for generative AI. And really, when they talk about AI, predominantly what they're talking about is generative AI. Morgan Stanley's projection is as much as $4 trillion, which is about 15% of US GDP, as much as the entire German GDP. They've broken it down into just over $1.3 trillion in software and IT and another $1.1 trillion in professional services.
Manav Gupta [18:27]
And then you see the further breakdown into things such as customer operations, R&D, supply chain, et cetera. The key assumptions behind these projections are that enterprises can expect 15 to as much as 20% productivity gains, that almost 50% of the tasks done globally are augmentable through AI, and that enterprises will adopt AI enterprise-wide. That's where this is coming from. And to be clear, they're not…
Manav Gupta [19:04]
they're not that far off, at least on the early productivity signals that most companies have reported. Here are some examples from various research studies: MIT, Harvard/BCG, some studies from Stanford, and the AI-assisted coding studies from Microsoft, GitHub Copilot, PayPal, et cetera. Across the board, what we've observed is that for AI-assisted coding, some of these firms were reporting as much as 55% productivity gains: they could write code faster, write documentation faster. PayPal's CEO is famously on record that their developers were 30% more productive. Now, a caveat here: these are all early signals. We'll get into a lot more detail on company claims versus academic studies.
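A top-down TAM of the Morgan Stanley variety can be reconstructed from the stated assumptions. The 50% augmentable share and the 15 to 20% productivity gain are the assumptions quoted above; the global labor-cost base and full value capture are stand-in assumptions of mine, chosen only to show how the pieces multiply out to roughly $4 trillion.

```python
# Top-down TAM sketch. Only the 50% augmentable share and the ~17.5%
# productivity gain come from the episode; the $45T labor-cost base and
# 100% value capture are illustrative assumptions.

def gen_ai_tam(labor_cost_base, share_augmentable, productivity_gain,
               value_captured):
    """Value created on augmentable work, times the fraction captured as spend."""
    value_created = labor_cost_base * share_augmentable * productivity_gain
    return value_created * value_captured

print(round(gen_ai_tam(45.0, 0.50, 0.175, 1.0), 2))  # ~3.94 ($T), near $4T
```

The takeaway is less the number than the structure: every term is an assumption, and halving any one of them halves the TAM.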
Manav Gupta [19:59]
But this is what the promised land is: with all those investments, with the new economic models that are emerging and these early productivity signals, potentially there is enough revenue to be had, a promised land where these companies are going to make some money. All righty. Okay. Now comes act two. We talked about the models. Let's get into the machine behind this AI.
Manav Gupta [20:29]
Let's go a little deeper. Let's have a look at NVIDIA's revenue chart. NVIDIA had revenue of $57 billion in the quarter ending October of 2025, with nearly 63% growth. This brings the company's revenue in the last 12 months to $187 billion. Again, a 65.2% year-on-year growth.
Manav Gupta [20:55]
This is fantastic. It shows, once again, that when there is a gold rush, the company selling shovels and pickaxes is the one that makes the most money. Now let's look at how big tech is monetizing AI. I've just chosen these as some examples, but I hope they drive the point home. What most companies have come up with is either seat-based licensing, consumption APIs, or consumer subscriptions. Microsoft Copilot starts at $30 per user per month.
Manav Gupta [21:30]
ChatGPT has similar costs. There are alternative consumer subscriptions as well, and then there is tiered pricing. These are just the standard prices; you can get pro or unlimited plans for $200 a month as well. And now new models have come up.
Manav Gupta [21:49]
On the cloud side, there is the Azure OpenAI Service, Amazon Bedrock, and the Google Vertex AI API. These are all API, token-based pricing. What's really interesting is that even as the models become more accurate and bigger and bigger, they are under constant market pressure to reduce token pricing. We'll examine that in a bit more detail. Beyond the CapEx, there is a VC ecosystem that has invested over $200 billion in the last four years across various funding rounds, just in AI companies. And what's interesting here is that there is so much, I don't want to call it hype, but so much interest that nobody wants to be left behind.
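Token-based pricing, the model all three cloud APIs use, is worth seeing as arithmetic. The per-million-token rates below are hypothetical, not any provider's list price; the structure (separate input and output rates, billed per million tokens) is the standard one.

```python
# Illustrative token-economics math for API pricing. The $3/$15 rates
# are hypothetical placeholders, not a real provider's price sheet.

def monthly_api_cost(requests, in_tokens, out_tokens,
                     price_in_per_m, price_out_per_m):
    """Monthly bill given per-million-token input/output prices."""
    tokens_in = requests * in_tokens
    tokens_out = requests * out_tokens
    return (tokens_in / 1e6) * price_in_per_m + (tokens_out / 1e6) * price_out_per_m

# 1M requests/month, 500 input + 300 output tokens each,
# at assumed rates of $3 per 1M input and $15 per 1M output tokens
print(monthly_api_cost(1_000_000, 500, 300, 3.0, 15.0))  # 6000.0
```

Note how output tokens dominate the bill at typical rate asymmetries, which is one reason providers feel that constant downward pressure on token prices.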
Manav Gupta [22:47]
But what's really interesting for all of these companies is that, of course, the public market demands profitability. It's creating a funding cliff. There is a gap between private valuations and public market signals, so there might be some corrections coming for at least some of these companies. But in this arms race, one thing is absolutely clear. Beyond the GPU monopoly, there is new silicon emerging that is going to reshape the economics.
Manav Gupta [23:16]
So far, all funding roads, all revenue, lead to NVIDIA. But what we're now beginning to see is a race for custom silicon. Google in particular saw this very early: in 2011, 2012 they started working on Google TPUs, which for some types of workloads, such as inference, can be highly performant compared to NVIDIA at a lower price. Amazon has come up with Trainium, which delivers a 40% cost reduction. So do Microsoft and Meta. We're beginning to see projections of as much as $200 billion of revenue in custom silicon alone.
Manav Gupta [23:54]
There are some new economics emerging around deployment of AI on the edge, on cheaper hardware, on commodity infrastructure. Now contrast all of that with the supply chain reality. There was a point in time when the lead time to get your hands on an NVIDIA H100 was over 50 weeks. What we're also beginning to see is emerging competition from existing vendors like Intel, with their Gaudi 3, and AMD, which is 40% cheaper. So we'll begin to see a lot more pressure on NVIDIA from these entrants. If that's not enough, have a look at what's happening to data center build times. This is the example of xAI's Colossus, a fully functional data center built in 2024 in 122 days. Mind you, this is a data center that is 750,000 square feet in size.
Manav Gupta [25:02]
That's approximately the footprint of 418 average US homes. For comparison, 122 days is roughly how long it takes to get an average house half built. And on their website they proudly claim that they were told it would take 24 months to build. They decided to take the project into their own hands, questioned everything, removed what was unnecessary, and accomplished their goal in four months. The reason I highlight this is that speed is becoming a competitive advantage. These AI megaliths are questioning everything and removing any unnecessary step in the way.
Manav Gupta [25:37]
To keep this in perspective, this is a 200,000-GPU NVIDIA super cluster. It's really a supercomputer. From racking to training Grok, it took them just 19 days, with a total target power capacity of 2 gigawatts. This clearly sets a new standard for infrastructure deployment. So all of that money being invested into building data centers, that's how it is being used. But I wish the building of data centers were that straightforward.
Manav Gupta [26:17]
There are certainly challenges. What do data centers need? They need two big critical things. Apart from space, they need energy. And they need water.
Manav Gupta [26:29]
So let's just talk about the energy constraint for the time being. What's going on is that AI's appetite for power is reshaping the grid. Data centers have come to use as much as 1.5% of global electricity, and that consumption is growing about 12% a year, four times faster than we can grow total electricity supply on the grid in the first place. Geographically, 45% of data center electricity is consumed in the US, followed by China and then Europe.
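The "four times faster than the grid" point compounds quickly. The 1.5% share and 12% demand growth are the episode's figures; the 3% grid growth (consistent with the 4x ratio) and the five-year horizon are illustrative choices.

```python
# Compounding data centers' share of electricity. The 1.5% share and
# 12%/yr demand growth are quoted; 3%/yr grid growth and the 5-year
# horizon are illustrative assumptions.

def future_share(current_share, demand_growth, grid_growth, years):
    """Data centers' share of electricity if both trends compound."""
    return current_share * ((1 + demand_growth) / (1 + grid_growth)) ** years

share_in_5y = future_share(0.015, 0.12, 0.03, 5)
print(round(share_in_5y * 100, 2))  # ~2.28% of global electricity
```

A seemingly small 1.5% share grows past 2% within five years under these assumptions, which is exactly why grid interconnection has become a bottleneck.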
Manav Gupta [26:58]
So what has big tech started to do? They're actually now trying to build their own nuclear power plants. Microsoft, Google, Amazon, and Meta have made significant announcements, and certainly investments, in nuclear power plants, for predictable pricing, grid independence, and high energy density. Here are some more details. Microsoft, as an example, famously has a $1.6 billion project.
Manav Gupta [27:36]
They've gotten a $1 billion loan from the Department of Energy. They're going to be building an 835-megawatt nuclear plant. Not to be left behind, Amazon has put a $20 billion investment into the Talen Susquehanna PPA project. And you get the idea: AI is no longer just about data. It's no longer just about GPUs.
Manav Gupta [28:09]
It is now about a whole new set of risk factors: nuclear regulatory rights, water tribunals, and grid interconnection queues. Not talked about enough yet is the hidden cost of AI, which is water. Five million gallons per day is what a large data center requires. By 2030, the projection is that data centers will require as much as 317 billion gallons of water per year globally. To put that in perspective, that's the water used by the state of Texas. So the new risk vectors for scaling AI will require dealing with water rights tribunals, location arbitrage, and nuclear regulatory delays, in addition to all the other challenges the Magnificent 7 are dealing with.
Manav Gupta [29:04]
Okay, let's take a step back. We talked about revenue and business models. We talked about how they're spending the money. Maybe there is one other perspective we should talk about: the hidden costs that perhaps are not quite so evident in the first place. Let's talk about training versus inference economics, and I'll show you a more detailed chart to drive home this point. Looking at the projected CapEx split for 2027: $27 billion spent on training versus $12 billion on inference.
Manav Gupta [29:38]
As I said previously, 2.25x. And again, just to recap, for AI to profit, this ratio has to flip, because training is a one-time investment while inference scales with revenue. But have a look at what's going on here. This should give you a really interesting clue about what's happening with these models. This is a chart from Epoch AI.
Manav Gupta [30:06]
On the X axis is the publication date. On the Y axis is cost in dollars, and the Y axis is on a log scale. That should give you an idea of how much money is required to train these planet-scale models. So what is estimated is…
Manav Gupta [30:25]
the cost breakdown to develop key frontier models such as GPT-4 and Gemini Ultra, including R&D and staff costs. So this is not compute alone. Hardware costs we can estimate at about 47% to maybe as high as 67%. R&D costs are quite substantial, at about 29% to as much as 50%. And the remaining 2 to 6% goes towards energy consumption. So what does this tell you?
Manav Gupta [30:52]
What this tells you is that the largest training runs will cost more than a billion dollars by 2027. Which means that the frontier models of tomorrow will be too expensive for most organizations to train, except, you guessed it, the Magnificent Seven, the most well-funded organizations. So all that money being invested, that economic moat I talked about, the investments in data centers: they are being built so that these companies can continue to build the biggest, best models there are at planetary scale, with all of humanity's data and hopefully all of humanity's knowledge, to build post-transformer architecture models and a financial moat that nobody else is going to surpass. That inference versus training cost is the first hidden cost. Mind you, there is also a second element to this hidden cost, which is the cost of talent.
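The component ranges quoted above (hardware 47 to 67%, R&D 29 to 50%, energy 2 to 6%) can be checked for internal consistency: there are points inside all three ranges that sum to 100%. The specific split below is one such point, chosen for illustration against the projected $1B run.

```python
# Splitting a frontier training-run budget into the three quoted
# components. The 60/36/4 split is one illustrative point inside the
# quoted ranges (47-67%, 29-50%, 2-6%); the $1B total is the 2027 projection.

def cost_split(total, hardware_pct, rnd_pct, energy_pct):
    """Split a training-run budget; percentages must sum to 1."""
    assert abs(hardware_pct + rnd_pct + energy_pct - 1.0) < 1e-9
    return {k: total * p for k, p in
            [("hardware", hardware_pct), ("r_and_d", rnd_pct),
             ("energy", energy_pct)]}

split = cost_split(1_000, 0.60, 0.36, 0.04)  # $1B run, expressed in $M
print(round(split["hardware"]), round(split["r_and_d"]), round(split["energy"]))
# 600 360 40
```

Note that even at the low end of the range, hardware plus R&D dwarfs energy, so the billion-dollar barrier is primarily a compute-and-people barrier, not a power bill.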
Manav Gupta [31:52]
To put it in perspective, for every dollar the industry spends on NVIDIA chips, companies are spending 80 cents on talent. Personally, I think it's perhaps even higher. What's really interesting is that the top talent in AI, the top researchers, are now commanding what top developers used to command in Silicon Valley at the height of the dot-com boom. A typical entry-level data scientist: somewhere between $300K and $500K. At the top firms, the best ML engineers can command as much as $800K.
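The 80-cents-per-chip-dollar rule of thumb quoted above is easy to apply. The ratio is the episode's figure; the chip-spend number below is purely illustrative, not any company's disclosed budget.

```python
# Rule of thumb from the episode: for every $1 spent on NVIDIA chips,
# roughly $0.80 goes to talent. The $10B chip spend is illustrative.

def talent_spend(chip_spend, ratio=0.80):
    """Implied talent spend given chip spend and the quoted 80-cent ratio."""
    return chip_spend * ratio

print(talent_spend(10.0))  # a $10B chip budget implies ~$8.0B on talent
```

In other words, nearly half of a combined chips-plus-people budget is compensation, which is what makes the retention crisis described next an economic problem, not just an HR one.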
Manav Gupta [32:32]
Top AI researchers, in fact, in some cases have been personally offered multi-million, even multi-hundred-million-dollar bonuses. These are the stars that are really shaping the industry. And then look at talent as a share of burn rate: about 20% of investment at FAANG companies is being spent on talent, and burnout is happening pretty quickly as well. There is certainly an ongoing crisis in trying to retain this talent. It is not unheard of for talent to be poached from one company to another at astronomical prices and contracts. All right.
Manav Gupta [33:25]
There are two new taxes emerging. The previous costs were well understood and well known, but there is a new tax that potentially was not accounted for historically. Prior to GPT-3, the idea was: scrape everything, pay nothing. Now the model is shifting. For a GPT-5-class model, they will have to license everything, because consumers, content creators, and regulators are now waking up. Regulation is catching up.
Manav Gupta [33:58]
So rather than just scraping anything and everything and paying nothing, they now have to license everything and potentially pay perpetually. Here are some examples. Reddit entered into a $130-million-a-year deal with Google and OpenAI. The idea here is that because it's human-generated, highly labeled data, it is highly valuable. Rather than Google and OpenAI just scraping it, they now have to pay Reddit for that data. And the strategic implication is that Reddit is now the number-one cited source in AI models, cited more than three times as often as Wikipedia.
Manav Gupta [34:34]
Similar deals apply for other sources you see here, like News Corp, The New York Times, and so on. So the estimated annual cost for data, which startups may not have budgeted for previously, is going to be over $800 million a year. My point is that training on free internet data is over. New entrants cannot afford the high-quality web. This works in favor of the Magnificent Seven and the builders of planetary-scale models who started early, because they now have an incumbency moat that new startups cannot pay their way past. This is a data tax that locks in the dominant players of the industry.
Manav Gupta [35:22]
Here comes the next historically hidden cost that is now emerging and making it even harder for newcomers: as regulation catches up, regulatory costs for enterprises in many cases exceed the hardware costs of enterprise AI deployments. Again, to put it in perspective: implementing the European Union's AI Act, in terms of R&D, headcount, and providing all the risk documentation and disclosures to regulators, costs at least $5 to $10 million if you're lucky, potentially higher. For the patchwork of regulations across various US states, you're looking at somewhere between $2 and $5 million per state. If it happens to be a global company operating in Europe, it also has to comply with GDPR; that's another $3 to $7 million a year. For an enterprise, the total compliance burden could therefore be as high as an additional 25% on AI budgets.
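The per-jurisdiction figures above can be rolled up into a quick cost model. Here is a minimal sketch in Python, using the transcript's ranges (midpoints assumed) and a hypothetical five-state US footprint; none of these figures are authoritative, they only illustrate how the buckets add up:

```python
# Back-of-napkin roll-up of the compliance costs quoted above.
# All inputs are midpoints of the transcript's ranges, not real data.

def compliance_burden(eu_ai_act=7.5e6,   # midpoint of the $5-10M EU AI Act range
                      us_states=5,       # hypothetical number of US states in scope
                      per_state=3.5e6,   # midpoint of $2-5M per state
                      gdpr=5e6):         # midpoint of $3-7M per year for GDPR
    """Return total estimated annual compliance cost in dollars."""
    return eu_ai_act + us_states * per_state + gdpr

total = compliance_burden()
print(f"Estimated annual compliance: ${total / 1e6:.1f}M")
```

With these assumed midpoints the total lands around $30M a year before any safety research spend, which makes the "25% of AI budgets" figure feel plausible for a mid-size program.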
Manav Gupta [36:29]
That's purely on compliance. Add to it the research every organization has to do on safety: ensuring the model is not leaking any PII, not providing guidance it should not provide, not hallucinating, not producing hateful content, and so on. And, by the way, there are two elements to that cost.
Manav Gupta [37:02]
There is the budget spent on safety and alignment of the model itself, in the training phase. Reportedly, Anthropic spends as much as 30% of its budget on model safety and alignment, followed closely by OpenAI at 25% and Google DeepMind at 20%. That's on the order of $15 billion a year purely at the model-training level. Then there is a second layer of safety investment that enterprises have to make when adopting these models. The flip side is that the model providers are now pitching what they call proactive compliance ROI: our models, out of the box, are already compliant with certain types of regulation, or we provide these guardrails, therefore we can help you achieve compliance quickly.
Manav Gupta [37:53]
And there are some stats here on how they achieve proactive compliance. The point is that this is a cost, a hidden cost to some extent, but they are now trying to turn it into competitive differentiation: compliance is no longer just regulatory overhead. If they build compliance and safety into the model itself, they can use that as a competitive moat. All right, we covered a lot. Let's now talk about the crisis within the business models, and spend a little time understanding the margin crisis.
Manav Gupta [38:31]
So the world is moving away from traditional SaaS, which ran at 80% to 90% gross margin, to what I'll call, for lack of a better term, AI supernovas, which are venture-capital subsidized and run at only about 25% gross margin. Here's what's happening. Historically, SaaS companies sold access to a tool: if I'm Salesforce, ServiceNow, or any other SaaS company, I'm selling access on a per-seat, per-month license. What's happening now, slowly but surely, is a transition to service-as-software, where what enterprises want is an outcome, not a seat license. Can you give me an AI agent for HR that will help me shift some of the costs of running my HR operations?
Manav Gupta [39:33]
So that's priced on an outcome basis, per resolution. And baked into that cost is margin compression, because to deliver that outcome, the offering now has to make requests behind the scenes to an AI model, typically one of the frontier models. Which means every query that runs carries two costs. It certainly has the compute cost.
Manav Gupta [40:09]
And then it has the token cost. As soon as agents get involved, there is an explosion in token consumption: in some cases 10 times, maybe even 100 times, depending on how complex the task is. All right. So the margin is compressing. If I'm a traditional SaaS vendor, my margins are beginning to be squeezed, and there is some evidence that the math doesn't quite work.
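The margin-compression mechanism described here can be made concrete with toy numbers. A sketch assuming a hypothetical $1 outcome price and a blended $0.01-per-1K-token model cost; the 10x and 100x cases mirror the agentic token blow-up mentioned above:

```python
# Illustrative unit economics for an outcome-priced AI feature.
# Both the price and the token cost are hypothetical, chosen only
# to show how agentic token consumption squeezes gross margin.

PRICE_PER_RESOLUTION = 1.00     # what the vendor charges per outcome ($)
TOKEN_COST_PER_1K = 0.01        # blended model cost per 1,000 tokens ($)

def gross_margin(tokens_per_resolution):
    """Gross margin as a fraction of the outcome price."""
    cost = tokens_per_resolution / 1000 * TOKEN_COST_PER_1K
    return (PRICE_PER_RESOLUTION - cost) / PRICE_PER_RESOLUTION

# Single model call vs. a 10x and a 100x agent loop
for tokens in (2_000, 20_000, 200_000):
    print(f"{tokens:>7} tokens -> {gross_margin(tokens):.0%} gross margin")
```

At a single call the margin looks SaaS-like; at 100x token consumption the same price point goes underwater, which is exactly the squeeze the transcript describes.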
Manav Gupta [40:38]
As an example, there was famously a study that GitHub Copilot was losing $20 per user in early 2023. I'm sure they'll figure out ways around it. Replit had sub-10% gross margin at peak, and they lifted pricing 20 to 30% in response, and so on. The implication is that if AI companies continue to have lower margins than SaaS, Wall Street may reward them with a lower multiple. All right, so that's one of the crises.
Manav Gupta [41:17]
I talked about this one a little, but it's worth revisiting: the AI cost paradox. What's interesting is that as we build bigger and bigger models, with better and better architectures, on hardware that is more and more expensive, training costs certainly go higher and higher. But the inference cost to run the same model is declining. Adoption is accelerating; that's yet another curve. And the expected cost per thousand tokens is declining.
Manav Gupta [41:51]
So there is something broken in this model, a tension: building the frontier model is exponentially more expensive, but the price of selling access to it keeps declining. We'll see this in detailed graphs. Here's an example: performance per dollar, purely on the infrastructure side. Whether it's NVIDIA chips or other hardware, performance keeps getting better, which means the cost of inference for GPT-3.5, if that was $20 per million tokens in 2022, is going

Manav Gupta [42:28]

to cost only seven cents in 2024, a roughly 280x reduction. Add to that the efficiencies achieved in the models themselves: Phi-3, a 3.8-billion-parameter model, can outperform what was historically PaLM, a 540-billion-parameter model, a 142x improvement in size. So the performance per dollar of the hardware is improving, and so is the algorithmic efficiency of the models. Which means that as soon as you build a model and start monetizing it, you're in an arms race: when the next model comes out, it will be cheaper to run, and your running inference costs are already higher.
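The two ratios quoted here, the inference-cost reduction and the Phi-3 versus PaLM size gap, check out arithmetically. A quick verification using only the figures in the transcript:

```python
# Sanity-check of the cost and size ratios cited in the transcript.

# GPT-3.5-class inference: ~$20 per million tokens in 2022 vs ~$0.07 in 2024
cost_2022 = 20.00
cost_2024 = 0.07
reduction = cost_2022 / cost_2024
print(f"Inference cost reduction: {reduction:.0f}x")  # ~286x; quoted as ~280x

# Parameter-count ratio behind the Phi-3 vs PaLM comparison
palm_params = 540e9   # PaLM: 540B parameters
phi3_params = 3.8e9   # Phi-3: 3.8B parameters
print(f"Size ratio: {palm_params / phi3_params:.0f}x")  # ~142x
```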
Manav Gupta [43:32]
So now you're really swimming upstream. Then, finally, there is the interesting tension between open- and closed-source models: open source is beginning to close the gap. There was a point when, on the MMLU benchmark, open-source models trailed by as much as 8 to 10%. Now that gap has shrunk to under 2%. So open-source models are reaching parity as well, and that's yet another tension in the business model for these frontier-model providers.
Manav Gupta [44:04]
And of course, who can forget the DeepSeek "AI Sputnik moment." On January 27, 2025, NVIDIA suffered the largest single-day market-cap drop in US market history, losing nearly $600 billion. This was triggered by a Chinese AI company that most people had not heard of at that point; we have all since heard of DeepSeek, a Hangzhou-based startup. They released an open-source reasoning model that matched the capabilities of OpenAI's frontier models at a fraction of the cost.
Manav Gupta [44:48]
Famously, Marc Andreessen called it AI's Sputnik moment, and the comparison is quite apt. Just as the Soviet Union shocked America in 1957, DeepSeek demonstrated that Chinese capabilities had far exceeded Western expectations. And they quite clearly introduced a number of innovations, ranging from architectural…
Manav Gupta [45:16]
architectural breakthroughs that slashed memory requirements by as much as 93%, to successful FP8 training; we'll get into more technical detail subsequently. They trained a 670-billion-parameter model, allegedly much smaller than OpenAI's frontier model at the time, and achieved similar or better performance on a number of tasks. And, by the way, it wasn't only NVIDIA that lost market cap; the contagion spread across the sector. Broadcom lost 17%, Micron lost 12%, and there were certainly billionaire wealth losses for both Jensen Huang and Larry Ellison. The reason I bring this up is that, in addition to those hidden costs, the frontier-model providers are now facing headwinds from new market entrants such as DeepSeek. Then a final couple of things I want to talk about on promise versus reality, and on enterprise ROI. Very famously, there was a study done by MIT which claimed that 70% of projects failed to deliver the promised ROI.
Manav Gupta [46:35]
And they broke it down into the "hidden costs": the cost of implementation, the cost of integration, and certainly the big kahuna, the cost of bringing the right data to the model, because the model is only as good as the data. Then, once you implement a solution, you need ongoing tuning, which carries additional cost. There was also a reality check around time to value: vendors claimed they could achieve ROI within three to six months, while in reality the projects ran as long as 12 to 18 months. There is churn in the ROI success factors around APIs and vertical AI services, and all of this gave enterprises pause about how they go about implementing AI. So this was the first, not a stop,
Manav Gupta [47:46]
but a pause: enterprise adoption started to slow when it came to AI. Perhaps one final page on these significant investments. Arvind Krishna, the CEO of IBM, went on The Verge's podcast to talk about a gap in the math of all the infrastructure investments being made. The punchline is as follows. Arvind's calculation is that the investment already committed to build, allegedly as much as $8 trillion, is going to require as much as $800 billion of annual profit just to pay back the interest.
Manav Gupta [48:35]
Let's pause for a second. The current committed buildouts: OpenAI's target is $1.4 trillion, Meta's capex is $70 to $72 billion, Alphabet's is only $90 billion, and so on. The position Arvind takes is this: suppose a single company commits to 20 to 30 gigawatts of data center capacity alone, not even taking into account a hardware refresh cycle of, say, approximately five years. Even if they get so advanced that they can repeat the xAI Colossus data center buildout, standing these things up in a span of 122 days, going from buildout to training really, really quickly,
Manav Gupta [49:23]
well, that only works if they have unlimited power, which they don't. If these companies are committing to 20 to 30 gigawatts of data centers, then to fill that capacity they need as much as $80 billion for every gigawatt. Multiply that by the 100 gigawatts that is committed annually and you're looking at an impossible number. Now, keep in mind that this is one individual's view, but I think it's good back-of-the-napkin math: some very bullish projections have been made, and these organizations will potentially have to look at new business cases. And that explains some of those circular-economy loops as well. All right, let's take a step back.
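Arvind's back-of-the-napkin math can be reproduced directly. A sketch using only the figures in the transcript; the implied cost-of-capital rate is derived here, not stated in the source:

```python
# Reproducing the back-of-napkin math attributed to Arvind Krishna.
# All dollar figures come from the transcript; the rate is implied.

committed_build = 8e12     # ~$8 trillion of committed AI infrastructure spend
required_profit = 800e9    # annual profit said to be needed to cover interest
implied_rate = required_profit / committed_build
print(f"Implied annual cost of capital: {implied_rate:.0%}")

# Power-constrained capex: ~$80B per gigawatt, ~100 GW committed
cost_per_gw = 80e9
committed_gw = 100
total_capex = cost_per_gw * committed_gw
print(f"Capex at {committed_gw} GW: ${total_capex / 1e12:.0f} trillion")
```

Notably, $80 billion per gigawatt across 100 gigawatts lands back at the same $8 trillion, so the power-based estimate is internally consistent with the committed-build figure.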
Manav Gupta [50:18]
We talked about a number of things so far. Let's now really talk about the path forward.
Manav Gupta [50:32]
So what lies in store for these organizations? You've seen a version of this page previously. What it shows is that performance per dollar is improving rapidly: hardware at a given precision and fixed performance level becomes about 30% cheaper year over year. Right? This is really Moore's law at work. So new entrants can leverage these efficiency gains.
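A 30% year-over-year decline compounds quickly. A quick sketch of an indexed hardware cost at fixed performance over a five-year horizon (the horizon is chosen for illustration, not from the transcript):

```python
# Compounding a 30% annual price decline at fixed performance.

cost = 100.0                  # cost index today
for year in range(1, 6):
    cost *= 0.70              # 30% cheaper each year
    print(f"Year {year}: cost index {cost:.1f}")
```

After five years the same performance costs about a sixth of what it does today, which is why late entrants can train on far cheaper compute.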
Manav Gupta [51:01]
I talked about this a little as well: inference prices are falling year over year, by as much as 900x depending on the task at hand. In this example, GPT-4 used to cost about $40 per million tokens. Fourteen months later, when Gemini 1.5 Flash came out, it could beat GPT-4's score on a set of PhD-level tasks for a fraction of the cost. Right. So that tells you how inference prices are decreasing.
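For scale: a 900x total price decline, if it unfolds over roughly two years (an assumed window, purely for illustration), implies about a 30x drop per year:

```python
# Implied annual decline factor for a 900x total price drop.

total_decline = 900
years = 2                               # assumed window, not from the transcript
annual_factor = total_decline ** (1 / years)
print(f"Implied annual price decline: ~{annual_factor:.0f}x per year")
```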
Manav Gupta [51:42]
That's what we covered in the previous section. What all of this is leading to is a third vector in the economy. So far we've talked about just the frontier models. But if you take into account all the headwinds these companies face, we are beginning to see a slowdown in the exponential growth of planetary-scale LLMs, and it's giving rise to what the industry is beginning to call small language models: specialized, efficient models trained for specific use cases.
Manav Gupta [52:23]
Right. Let's talk about that in a bit more detail. To be clear, there is no crisp definition of a small language model. There is no…
Manav Gupta [52:42]
there is no consensus on what constitutes a small language model, but broadly speaking, somewhere between 7 and 12 billion parameters qualifies. Rather than being generalists at everything, they have general language understanding, plus some emerging capabilities like reasoning and logic. Because they are smaller, training costs are lower and inference is fast; they don't necessarily need specialized hardware, which means they don't need heavy GPUs. This also means that rather than being deployed only in the cloud or the data center, they can be deployed closer to the edge. They can be fine-tuned for a specific use case. They usually still have a decoder-only architecture, and they might be available…
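The claim that 7-12B-parameter models don't need heavy GPUs follows from a simple weights-memory estimate. A sketch that counts weights only, ignoring the KV cache and activations (a simplifying assumption):

```python
# Weights-only memory footprint for small language models.

def weights_gb(params_billions, bytes_per_param):
    """Memory in GB needed to hold the weights at a given precision."""
    # params_billions * 1e9 params * bytes each, converted back to GB
    return params_billions * bytes_per_param

for params in (7, 12):
    for precision, nbytes in (("fp16", 2), ("int4", 0.5)):
        print(f"{params}B @ {precision}: ~{weights_gb(params, nbytes):.1f} GB")
```

At fp16, a 7B model needs roughly 14 GB for weights alone, already within reach of a single high-end consumer GPU, and 4-bit quantization brings it under 4 GB, which is what makes edge and on-device deployment plausible.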