The Memory Tax

I tried to buy RAM last week. Nothing exotic, just a 32 GB kit for a home lab machine I'm putting together. The Corsair Vengeance kit I had bookmarked at around 80 euros now costs over 240. Tripled. In four months.

My first thought was that I'd missed a product refresh or a supply hiccup. Then I checked Samsung SSDs. Same story. Then server memory. Same story. Then I watched Kobe's Code break down the global DRAM shortage crisis, and the numbers clicked into a pattern that anyone working with AI infrastructure should understand. Because this shortage isn't a blip. It's the physical cost of the AI buildout we're all living through, and it touches every device that uses memory of any kind.

Samsung is rumored to be raising prices by 80% across all memory products. Not just HBM for data centers. All of it. Consumer RAM, NAND flash, SSDs, the memory in your phone. And Samsung isn't doing this alone. The entire supply chain is tightening.

If you build with AI, teach about AI, or lead teams adopting AI, the memory market is where the abstract promise of artificial intelligence meets very concrete economics.

Why Memory Is the Real AI Bottleneck

Most conversations about AI infrastructure focus on GPUs. How many H100s can you buy? When does the next Nvidia architecture ship? Those are valid questions, but they miss the bottleneck that actually determines how well AI models run: memory bandwidth.

AI models need to move massive amounts of data through the chip every second. A model like DeepSeek R1 with 671 billion parameters needs at least 130 GB of memory even at aggressive 1.58-bit quantization. Your typical consumer PC has 8 to 32 GB of RAM. That's not even close.
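
To make that gap concrete, here's a back-of-envelope sketch in Python of the weight memory a 671-billion-parameter model needs at different quantization levels. It counts weights only and ignores activations and the KV cache, so treat the results as floors rather than real-world footprints.

```python
# Back-of-envelope: weight memory for a 671B-parameter model
# at different quantization levels. Weights only; ignores
# activations and KV cache, so these are lower bounds.

PARAMS = 671e9  # DeepSeek R1 total parameter count

for name, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4), ("1.58-bit", 1.58)]:
    gigabytes = PARAMS * bits / 8 / 1e9
    print(f"{name}: {gigabytes:,.0f} GB")

# Prints roughly 1,342 GB, 671 GB, 336 GB, and 133 GB -- the last
# figure is the ~130 GB floor mentioned above, and even it dwarfs
# the 8 to 32 GB in a typical consumer PC.
```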

But the problem isn't just capacity. It's speed. Your CPU talks to RAM at maybe 50 to 60 gigabytes per second. Fine for spreadsheets, painful for neural networks that rely on constant matrix multiplications. Running DeepSeek R1 off your system RAM would give you maybe 10 tokens per second. Unusable for production, barely tolerable for experimentation.

This is why AI runs on GPUs with High Bandwidth Memory (HBM). An Nvidia H100, the workhorse of today's data centers, achieves around 3.3 terabytes per second of memory throughput. That's roughly 55 times faster than your CPU/RAM setup. The difference between waiting 30 seconds for a response and getting it in under a second comes down to memory bandwidth. Not compute. Memory.
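
A rough way to see where these numbers come from: for memory-bound decoding, throughput is approximately memory bandwidth divided by the bytes of weights read per token. DeepSeek R1 is a mixture-of-experts model, so only about 37 billion of its 671 billion parameters are active for any given token. That active-parameter figure and the bandwidth values below are ballpark assumptions, and real throughput also depends on batching, KV cache traffic, and kernel efficiency.

```python
# First-order estimate for memory-bound decoding:
#   tokens/sec ~= memory bandwidth / bytes of weights read per token.
# DeepSeek R1 is a mixture-of-experts model, so roughly 37B of its
# 671B parameters are active per token (ballpark assumption).

ACTIVE_PARAMS = 37e9      # active parameters per token (assumption)
BITS_PER_PARAM = 1.58     # the aggressive quantization from the text
bytes_per_token = ACTIVE_PARAMS * BITS_PER_PARAM / 8

for system, bandwidth_gbps in [("Dual-channel DDR5 (CPU)", 60),
                               ("Nvidia H100 HBM", 3350)]:
    tokens_per_sec = bandwidth_gbps * 1e9 / bytes_per_token
    print(f"{system}: ~{tokens_per_sec:.0f} tokens/sec")

# Roughly 8 tokens/sec on CPU plus RAM versus roughly 460 on the H100.
# The gap is the ~55x difference in memory bandwidth, not compute.
```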

Neural networks are memory-bound, not compute-bound. You can always throw more GPUs at a compute problem. But serving models fast enough for real users requires memory that can keep up with the math. HBM is that memory, and every major AI company on the planet wants as much of it as possible.

The Supply Chain Squeeze

The demand side is staggering. OpenAI, Anthropic, Google, xAI, Meta. Each one is building or expanding data centers that consume enormous quantities of HBM chips. Then add the hyperscalers (Microsoft, Oracle, Amazon) who host these models. Add inference chip makers like Groq and Cerebras. Add the neoclouds (CoreWeave, Nebius, and a growing list of smaller GPU cloud providers). Add Google's TPUs, Amazon's custom ASICs. Every one of these players needs high bandwidth memory.

The manufacturers who supply that memory are a surprisingly small group. Samsung, SK Hynix, and Micron dominate. Western Digital and Kioxia round out the NAND side. That's essentially the entire global memory supply controlled by a handful of companies.

Now here's where it gets interesting for consumers. Building HBM chips and building regular DRAM share the same manufacturing infrastructure. When Samsung or SK Hynix retool a production line for HBM, that capacity stops producing consumer grade memory. SK Hynix has already cut NAND production by 10% to ramp up HBM output. Micron is taking a conservative stance on consumer production. Kioxia reduced output from 4.8 million units to 4.7 million.

The manufacturers are making a rational economic choice. HBM for AI sells at much higher margins than the DDR5 in your laptop. If you ran a memory fab, you'd make the same call. But the consequence is a supply squeeze on everything else: desktop RAM, laptop memory, smartphone components, SSDs, automotive chips, industrial controllers. Anything that uses memory.

And then there's the OpenAI deal. Reports indicate that OpenAI has secured up to 900,000 DRAM wafers per month from Samsung and SK Hynix to supply the Stargate facility. Some analysts estimate this represents around 40% of global DRAM output. One company, consuming nearly half the world's memory production for a single project. When that kind of concentration hits the supply chain, prices don't just rise. They spike.
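
The 40% estimate is easy to sanity-check against the reported wafer commitment itself; the sketch below simply back-solves for the global output those two claims imply together.

```python
# Sanity-check the concentration claim: if 900K wafers/month is ~40% of
# global DRAM output, what total output do those two figures imply?

openai_wafers_per_month = 900_000  # reported Samsung + SK Hynix commitment
claimed_share = 0.40               # analyst estimate cited above

implied_global_output = openai_wafers_per_month / claimed_share
print(f"Implied global DRAM output: {implied_global_output / 1e6:.2f}M wafers/month")
# -> ~2.25M wafers/month in total, with a single buyer taking 900K of them
```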

The Oligopoly Problem

The DRAM market has a complicated relationship with fair pricing. In 2005, the U.S. Department of Justice charged Samsung with fixing DRAM prices. The company pleaded guilty and paid a $300 million fine, and several executives later pleaded guilty individually and served prison time. That wasn't an isolated incident. The memory industry has faced repeated allegations over the years, and the structure of the market makes it easy to see why. Three companies control the vast majority of global supply. Coordination, whether explicit or tacit, is structurally easier when only three players matter.

The industry is also famously cyclical. In 2017, a boom driven by shrinking inventories sent prices soaring. After 2020, pandemic-era supply chain chaos led to overproduction that crashed prices by 2023. Now inventories are tightening again, and the cycle is swinging back toward shortage. But this time it's paired with structural AI demand that didn't exist in previous cycles.

That pairing is what makes this different. Previous DRAM supercycles were driven by temporary imbalances: a factory shutdown, a demand surge for smartphones, a pandemic that disrupted shipping. Those imbalances correct themselves within 12 to 18 months. AI demand for HBM is not temporary. It's accelerating. Every new model generation is larger. Every new data center needs more memory. The structural floor under memory prices has shifted upward, and the cyclical shortage is stacking on top.

Samsung's reported 80% price increase across all memory products is the surface symptom. The underlying condition is a permanent reallocation of global memory manufacturing toward AI, at the expense of consumer electronics.

What This Means if You Build Things

I think about this through two lenses: as someone who builds with AI daily, and as someone who teaches organizations about AI adoption.

From the builder's perspective, the memory squeeze reinforces a trend I've been writing about: the economics of AI are consolidating around a few critical bottlenecks. I wrote about Minimax M2.5 and how inference costs dropped 97%. I wrote about Moonshot's Kimi K2.5 and how orchestration is moving into the model itself, cutting engineering overhead. Both stories were about costs falling. The memory story goes the other direction. Even as software costs decline, the hardware underneath is getting more expensive and more scarce. Cheaper inference doesn't help if you can't get the chips to run it.

For the executives I teach, this creates a planning problem. AI adoption roadmaps typically model compute costs declining over time, following the pattern of every previous technology wave. Memory costs are now moving in the opposite direction, at least for the next two to three years. Organizations sizing their AI infrastructure budgets need to account for this. The cost of running AI workloads may not follow the smooth downward curve that the GPU pricing story suggests.
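
As a planning illustration only (every figure below is an invented placeholder, not market data), here's a sketch of how a single memory-price assumption changes the shape of a simple infrastructure budget over a few years.

```python
# Hypothetical planning sketch: how a memory-price assumption shifts a
# simple AI infrastructure budget. All figures are invented placeholders.

baseline_budget = {
    "gpu_compute": 1_000_000,       # annual spend
    "memory_and_storage": 300_000,
    "networking": 150_000,
}

# Scenario assumptions (placeholders, not forecasts):
trends = {
    "gpu_compute": 0.90,            # compute drifts down ~10%/year
    "memory_and_storage": 1.40,     # memory line inflates ~40%/year
    "networking": 1.00,
}

for year in range(1, 4):
    projected = {k: v * trends[k] ** year for k, v in baseline_budget.items()}
    total = sum(projected.values())
    memory_share = projected["memory_and_storage"] / total
    print(f"Year {year}: total ${total:,.0f}, memory share {memory_share:.0%}")

# The point isn't the specific numbers; it's that a line item usually
# modeled as flat or shrinking can become the fastest-growing one.
```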

There's also the geopolitical dimension. China's largest memory fab, CXMT, is targeting mass production at $138 per unit and planning a $4.2 billion IPO. But their production line is still on DDR4, which puts them three to five years behind the leading manufacturers on technology. China could eventually become a significant alternative supplier, especially for consumer grade memory where cutting edge process nodes matter less. But "eventually" means years. In the near term, Samsung, SK Hynix, and Micron set the prices, and those prices reflect a world where AI gets served first.

New fabs take two to five years to build from announcement to volume production. The decisions being made right now about what to manufacture and for whom will determine memory pricing through 2028 at minimum. By the time new capacity comes online, the AI demand will have grown further. The shortage may ease, but the structural rebalancing toward AI is not reversing.

The Physical Layer Always Matters

Every time a new technology wave arrives, we collectively forget that software runs on hardware. Cloud computing created the illusion that infrastructure is infinitely elastic. You need more compute? Spin up another instance. You need more storage? Click a button. The DRAM shortage is a reminder that underneath every cloud instance is a physical chip, manufactured in a physical fab, using physical materials, by companies making economic choices about what to produce.

AI's appetite for memory is reshaping the global semiconductor supply chain. That reshaping benefits anyone selling into the AI boom. It costs everyone else. Your next phone, your next laptop, your next SSD: all more expensive because the same factories that would have made those components are now producing HBM for data centers.

I don't think this slows AI adoption. The economics are too compelling on the AI side. A single HBM chip generates far more revenue than a consumer RAM stick, and the demand shows no sign of plateauing. But it does mean that the cost of participating in the AI economy, whether as a builder, a consumer, or an organization deploying AI tools, now includes a memory tax that didn't exist two years ago.

Build your budgets accordingly.