The competitive pressure around Nvidia is widening fast
Nvidia is getting squeezed from more than one side. Even as it rolled out new hardware at the GTC conference in San Jose last week, the market around it kept shifting. Cloud providers are pushing harder on in-house chips, Broadcom is deepening its role as the go-to builder behind many of those designs, and established competitors are capitalizing on a market that's increasingly shaped by inference economics—not just brute-force training capability.
The custom AI chip surge from hyperscalers
Hyperscalers are accelerating in-house silicon to reduce GPU dependence
The most direct challenge comes from the hyperscalers themselves. Google, Amazon, Microsoft, and Meta are all moving faster on internal chip programs aimed at lowering reliance on third-party GPUs.
- Google introduced its seventh-generation TPU, Ironwood, rated at 4.6 petaflops of FP8 compute per chip, with the ability to scale into pods of 9,216 chips.
- Amazon is ramping Trainium 3, built on TSMC’s 3nm process. Production began in early 2026, and AWS says it delivers a 50% cost reduction versus comparable Nvidia-based instances.
- Meta said this month it plans to push forward with four new silicon generations over the next two years.
- Microsoft recently announced its Maia 200 inference chip.
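As a rough scale check on the Ironwood figures above (taking the stated per-chip rating and pod size at face value, and ignoring interconnect and utilization overhead), a full pod's peak compute works out to roughly 42 FP8 exaflops:

```python
# Rough scale check for a full Ironwood pod, using the figures quoted above.
# Real sustained throughput depends on interconnect, memory, and workload.
per_chip_pflops = 4.6   # FP8 petaflops per chip (as stated)
chips_per_pod = 9216    # chips in a full pod (as stated)

pod_pflops = per_chip_pflops * chips_per_pod
pod_exaflops = pod_pflops / 1000

print(f"Peak pod compute: {pod_exaflops:.1f} FP8 exaflops")  # ~42.4
```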
The direction is clear: major cloud platforms are building more of the stack themselves, with a specific focus on improving cost structure and reducing exposure to Nvidia’s pricing and supply dynamics.
Broadcom’s expanding role in custom AI chip design
A lot of this custom silicon has the same architect behind the scenes: Broadcom.
Counterpoint Research projects Broadcom will control about 60% of the AI server compute ASIC design partner market by 2027. The same research forecasts that ASIC shipments among the top ten hyperscalers will triple between 2024 and 2027—a meaningful signal that these efforts are not small experiments but scaling programs.
Broadcom’s AI business is also accelerating:
- $8.4 billion in AI revenue in its most recent quarter
- 106% year-over-year growth
- CEO Hock Tan has said AI chip revenue could exceed $100 billion by 2027
For Nvidia, this matters because every successful hyperscaler ASIC program is another path to inference capacity that doesn’t require Nvidia GPUs.
Nvidia’s response: new platforms and a major inference move
The Vera Rubin platform and the push for inference efficiency
Nvidia isn’t treating this as a slow-moving threat. At GTC 2026, CEO Jensen Huang introduced the Vera Rubin platform, which Nvidia says can deliver up to 10x more inference throughput per watt than the current Blackwell generation.
That framing—throughput per watt—is telling. It aligns directly with the market’s shift toward inference, where efficiency and unit economics increasingly dominate purchasing decisions.
Nvidia’s Groq 3 inference processor and claimed performance gains
Nvidia also disclosed its Groq 3 inference processor, tied to a $20 billion deal to license technology and bring in talent from inference chip startup Groq.
Nvidia says the combined system delivers a 35-fold performance increase at the highest-value inference tier.
This is a clear signal that Nvidia is working to defend inference not only with GPUs, but also with inference-specific processor strategy and integration.
The scale of demand Nvidia is still projecting
Huang projected that cumulative orders between 2025 and 2027 could reach $1 trillion.
And on the analyst side, Rosenblatt Securities raised its price target, arguing Nvidia’s full-stack advantage—spanning CUDA software, NVLink networking, and rack-scale systems—keeps it positioned to lead in inference as well as training.
The inference battleground: where the economics bite
Inference now dominates AI compute cycles
The sharpest pressure point is inference. Inference represents about two-thirds of all AI compute cycles, and the cost-per-token math tends to reward specialized hardware over general-purpose GPUs.
That economic reality creates room for hyperscaler chips, ASICs, and alternative platforms to win workloads—especially once deployed at scale inside the largest cloud environments.
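To make the cost-per-token logic concrete, here is a minimal sketch with entirely hypothetical numbers (the hourly costs and throughputs below are illustrative assumptions, not vendor figures). Amortized serving cost is roughly hourly hardware cost divided by tokens served per hour, so a chip with lower throughput can still win inference volume if its cost per token is better:

```python
# Hypothetical comparison of inference cost per million tokens.
# All numbers are illustrative assumptions, not vendor figures.
def cost_per_million_tokens(hourly_cost_usd, tokens_per_second):
    """Amortized serving cost per 1M tokens for a fully utilized accelerator."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# A general-purpose GPU: higher throughput, but pricier per hour (hypothetical).
gpu = cost_per_million_tokens(hourly_cost_usd=4.00, tokens_per_second=2500)
# An inference ASIC: lower throughput, but much cheaper per hour (hypothetical).
asic = cost_per_million_tokens(hourly_cost_usd=1.20, tokens_per_second=1500)

print(f"GPU:  ${gpu:.3f} per 1M tokens")   # ~$0.444
print(f"ASIC: ${asic:.3f} per 1M tokens")  # ~$0.222
```

Under these assumed numbers, the ASIC serves tokens at about half the cost despite being 40% slower—the kind of arithmetic that lets hyperscalers shift volume away from general-purpose GPUs.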
Analysts expect share pressure as in-house ASIC programs scale
Reuters reported that some analysts expect Nvidia to start losing share in 2027, once hyperscaler in-house ASIC programs reach meaningful scale in the inference market.
This doesn’t require Nvidia to “lose” on performance alone. If hyperscalers can meet their inference needs at a better cost profile with their own silicon, they can shift volumes even while Nvidia remains strong on the high end.
Software ecosystems: CUDA’s strength and the push to reduce switching friction
CUDA remains Nvidia’s core moat
Nvidia’s strongest defense is still CUDA, supported by over five million active developers. That developer base matters because it lowers deployment friction, speeds iteration, and keeps workloads tightly aligned with Nvidia’s stack.
Competitive stacks are closing gaps and lowering barriers to exit
At the same time, pressure is building from multiple software directions:
- AMD is advancing with its open-source ROCm platform
- Hyperscalers are building their own stacks that reduce the pain of moving away from Nvidia hardware:
  - Google’s XLA
  - Amazon’s Neuron SDK
  - Microsoft’s custom toolchain
The effect isn’t a single clean rivalry. As Business Insider put it, the competitive environment is becoming “a rapidly widening and increasingly tangled field, even as Nvidia remains miles ahead.”