The GPU rental market has split into two tiers. A raw NVIDIA H100 can now be rented for as little as $1.39/hr on decentralized marketplaces, while the major hyperscalers (AWS, Azure, and Google Cloud) maintain price points that can exceed $98/hr. This discrepancy is not a simple pricing error; it marks a fundamental divide between utility-grade compute and enterprise-integrated infrastructure.

| Market Segment | Representative Providers | Typical Price Range ($/hr) | Core Value Proposition |
|---|---|---|---|
| Marketplace/Specialized | Vast.ai, RunPod, Together.ai | $1.39 – $4.25 | Raw compute, low overhead |
| Hyperscalers | AWS, Azure, GCP | $30.00 – $98.00+ | Compliance, SLAs, Ecosystem |
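
The size of the gap can be read straight off the table. The short Python sketch below computes the smallest and largest multiple between the two tiers. It assumes both ranges are quoted on the same basis; the article does not say whether the hyperscaler figures are per GPU or per multi-GPU instance, so the result is only as comparable as the inputs. The function name `premium_range` is illustrative.

```python
# Hourly price ranges copied from the table above (USD per hour).
# Assumption: both tiers are quoted on the same basis; the article does not
# state whether the hyperscaler figures are per GPU or per multi-GPU instance.
MARKETPLACE = (1.39, 4.25)
HYPERSCALER = (30.00, 98.00)


def premium_range(cheap, expensive):
    """Return the (smallest, largest) price multiple between two (low, high) ranges."""
    cheap_low, cheap_high = cheap
    exp_low, exp_high = expensive
    smallest = exp_low / cheap_high  # cheapest hyperscaler vs. priciest marketplace
    largest = exp_high / cheap_low   # priciest hyperscaler vs. cheapest marketplace
    return smallest, largest


low, high = premium_range(MARKETPLACE, HYPERSCALER)
print(f"Hyperscaler premium: {low:.1f}x to {high:.1f}x")
# prints: Hyperscaler premium: 7.1x to 70.5x
```

On these figures the multiple depends heavily on which ends of the two ranges are compared, so any single headline ratio is best read as a point inside that band rather than a fixed number.
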

The Death of a Single Market Price

The idea of a single "market price" for an H100 is an illusion. The available data suggest that on-demand rental capacity for high-end GPUs is effectively exhausted: customers who already hold capacity are refusing to give it up despite price volatility.

- The Marketplace Tier: These platforms treat compute as a volatile, raw asset. Prices are responsive, liquidity is thin, and supply depends on the availability of decentralized hardware nodes.
- The Hyperscaler Tier: These providers are no longer selling "chips"; they are selling risk mitigation. Organizations paying the ~50x premium are purchasing compliance certifications, legal indemnities, and deep integration with proprietary software ecosystems.

Why the Gap Persists

Observers often conflate the cost of hardware access with the cost of the service wrapped around it. Specialized providers offer "pure compute" at a lower price, but they lack the heavy scaffolding that large, risk-averse institutions require.

- Egress and Ancillary Costs: Hyperscalers often bake an additional 20% to 40% in hidden fees (egress charges, networking, and storage) into the final bill.
- The Compliance Premium: For the enterprise, the price of an H100 includes the implicit cost of a Service Level Agreement (SLA). If a workload crashes on a cheap marketplace, the customer absorbs the loss. If it crashes on a hyperscaler, there is a contractual remedy.
- Utilization as Signal: In the on-demand space, utilization, not price, is the only high-frequency indicator of true demand. When availability vanishes, prices do not just shift; they effectively lock up, because no capacity is left to rent at the quoted rate.
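
To make the ancillary-cost point concrete, the sketch below applies the 20% to 40% fee range to the hyperscaler list prices from the table. It assumes the fees scale as a simple percentage on top of the hourly list price, which the article does not specify; real egress, networking, and storage charges depend on usage. The helper `effective_hourly_cost` is a hypothetical name used only for illustration.

```python
def effective_hourly_cost(list_price, overhead_fraction):
    """List price plus ancillary fees, with fees given as a fraction of list price.

    Assumption: fees are modeled as a flat percentage of the hourly rate.
    """
    if overhead_fraction < 0:
        raise ValueError("overhead_fraction must be non-negative")
    return list_price * (1.0 + overhead_fraction)


# Hyperscaler list prices from the table (USD per hour) and the 20%-40% fee range.
for list_price in (30.00, 98.00):
    low = effective_hourly_cost(list_price, 0.20)
    high = effective_hourly_cost(list_price, 0.40)
    print(f"${list_price:.2f}/hr list -> ${low:.2f} to ${high:.2f}/hr effective")
```

Under that assumption, a $30.00/hr list price becomes $36.00 to $42.00/hr, and $98.00/hr becomes $117.60 to $137.20/hr.
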

Background: The Architecture of Scarcity

The H100 entered the global market during a period of acute supply constraint. This left a legacy of "pricing opacity", in which providers used vastly different pricing models to manage scarcity.

Current analysis indicates a trend toward a hybrid strategy: teams train models on specialized, high-performance clusters to save money, while pushing production inference workloads into hyperscale environments that prioritize uptime and reliability over the raw cost per GPU-hour. The market is not "crashing" in the traditional sense; it is separating into two distinct realities, one defined by the cost of electricity and hardware, the other by the cost of corporate security.