
According to recent industry reports, Google’s introduction of the TurboQuant compression algorithm has raised concerns that it could dampen memory chip demand. However, multiple manufacturers at the MemoryS 2026 summit emphasized that the rapid expansion of artificial intelligence (AI) applications will continue to drive growth in the memory semiconductor market, and that supply shortages are likely to persist.
Bloomberg Intelligence analyst Jake Silverman noted that because model weights must ultimately reside in GPU memory during inference, the algorithm is expected to have limited impact on demand for high-bandwidth memory (HBM) and DRAM, while potentially exerting a longer-term influence on demand for NAND flash, where models are stored at rest.
At the same time, the accelerating adoption of AI technologies is becoming a primary growth engine for memory chips. Data from CFM Flash indicates that AI servers are projected to account for more than 20% of global server shipments by 2026, further increasing memory capacity requirements across data center infrastructure.
Tai Wei, General Manager of the flash memory market at CFM, highlighted that with the growing deployment of AI inference applications, enterprise solid-state drives (eSSDs) have become the largest application segment for NAND flash. In contrast, the smartphone market remains relatively subdued, although on-device AI is expected to emerge as a new growth driver.
On pricing, after three consecutive quarters of significant increases in both contract and spot prices, price growth in the memory chip market is expected to moderate by the third quarter of 2026. Prices are anticipated to gradually stabilize, with potential divergence across product segments.
On the supply side, memory manufacturers are adopting a more disciplined and structured approach to capacity expansion, prioritizing advanced, high-value products. Technologies such as hybrid bonding and 3D stacking are becoming critical to improving HBM performance. Meanwhile, base dies are gaining strategic importance: by taking on more computational functions, they help reduce latency between memory and processing units in next-generation architectures.