H100 GPU Cost in 2026: What You Actually Pay (Beyond the Sticker Price)
Hardware · Mar 27, 2026 · 7 min read


H100 GPU cost breakdown across OEM pricing, cloud rentals, power, colocation, and real-world GPU-hour economics. See what H100s actually cost in 2026.

By Mercatus Compute


An NVIDIA H100 SXM5 lists for $25,000–$30,000 from major OEMs.

That number is technically correct, almost never useful, and frequently misleading. Whether you buy, rent, colocate, or run on-prem, the effective per-hour cost of an H100 in 2026 ranges from $0.80 to $4.20 depending on the path you choose — a 5× spread for the same physical card.

This guide breaks down every component, with current pricing from Mercatus GPU Index, and gives you the formula to plug your own assumptions into. If you’re trying to figure out whether the H100 cloud price you’re being quoted is reasonable, or whether owning makes more sense than renting, this is the math.

For the cluster-scale version (100+ H100s, institutional capex), see the companion piece Total Cost to Own 100 H100 GPUs. For the broader generation comparison, see the pillar A100 vs H100 vs H200.

The sticker price: what an H100 costs to buy

There are three ways to acquire an H100, and the headline price differs for each:

| Acquisition path | Price range (2026) | Notes |
| --- | --- | --- |
| Single H100 SXM5 (OEM, GPU only) | $25,000 – $30,000 | Rarely sold standalone — most go through OEM systems |
| 8-GPU HGX H100 server (Supermicro, Dell, HPE) | $250,000 – $320,000 | $31K–$40K per GPU including server, NVLink, networking |
| H100 PCIe (single card, slot form factor) | $22,000 – $27,000 | 10–15% cheaper but no NVLink — single-GPU only |

For most institutional buyers, the relevant number is the 8-GPU HGX system price because H100s are deployed in 8-GPU nodes. Per-GPU cost works out to ~$33,000 once you fold in the server, NVSwitch fabric, NICs, and integration.
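As a quick sanity check on any OEM quote, the per-GPU figure is just the system price divided by the node's GPU count. A minimal sketch, using the low and high ends of the HGX range above as hypothetical quotes:

```python
def per_gpu_cost(system_price: float, gpus_per_node: int = 8) -> float:
    """Effective per-GPU cost when H100s ship inside an HGX node."""
    return system_price / gpus_per_node

# Hypothetical quotes at the low and high end of the 2026 HGX range.
low = per_gpu_cost(250_000)   # -> 31250.0
high = per_gpu_cost(320_000)  # -> 40000.0
print(f"${low:,.0f} - ${high:,.0f} per GPU")
```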

Why does the same H100 SXM5 list at different prices?

  • Volume tier. OEMs price by quantity. A 100-server order beats a 1-server quote by 8–12% in 2026.
  • Region. EU and APAC pricing typically runs 5–10% above US prices because of import duties and logistics.
  • OEM choice. Supermicro is consistently cheapest; Dell and HPE charge premiums for support tiers. Lambda and several specialty integrators sit in between.
  • Service contract. “GPU-only” pricing strips support; full-service (4-hour replacement, on-site engineering) adds 8–15%.

These are wholesale numbers. They don’t include power, colocation, networking infrastructure outside the box, or ops. We’ll get to those.

The cloud price: what the same H100 actually rents for

Most teams don’t buy H100s. They rent. And here’s where the real story is.

A snapshot of H100 80GB SXM5 cloud pricing on Mercatus GPU Index in May 2026:

| Provider tier | On-demand $/hr | Reserved (1yr) $/hr | Reserved (3yr) $/hr |
| --- | --- | --- | --- |
| Hyperscalers (AWS, Azure, GCP) | $3.50 – $5.00 | $2.80 – $3.80 | $2.20 – $3.00 |
| Tier-1 specialty (CoreWeave, Lambda) | $2.50 – $3.50 | $2.00 – $2.80 | $1.70 – $2.30 |
| Long-tail and regional | $1.99 – $2.50 | $1.60 – $2.10 | $1.30 – $1.80 |

The same H100 SXM5 — same silicon, same FP8 throughput, same 80GB HBM3, same NVLink — prices at 2.5× the rate at a hyperscaler vs. a long-tail provider. That’s not a 10% premium for ecosystem integration. That’s a multi-thousand-dollar-per-month decision.

Reserved capacity drops everything 30–50% below on-demand at the same provider. A 3-year reserved H100 from a long-tail provider lands around $1.30–$1.80/hr — close enough to owned-and-amortized economics that the buy/rent decision tilts on operational appetite, not cost.

For a full breakdown of why the same SKU prices so differently, see Why GPU Prices Differ by 30%+ for the Same Hardware.

The practical takeaway: never accept hyperscaler on-demand pricing for an H100 unless you have a specific reason to. The savings from shopping providers are routinely 40–60% — well above any conceivable switching cost.
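To put that spread in dollars, here is a rough monthly comparison. The $4.00 and $1.80 rates are hypothetical points inside the on-demand hyperscaler and reserved long-tail ranges above, not quotes from any specific provider:

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_spend(rate_per_hour: float, gpus: int = 1) -> float:
    """Monthly bill for GPUs running around the clock at a given $/hr rate."""
    return rate_per_hour * HOURS_PER_MONTH * gpus

# Hypothetical rates from the ranges above.
hyperscaler = monthly_spend(4.00)  # hyperscaler on-demand
long_tail = monthly_spend(1.80)    # long-tail reserved (1-yr)
savings = hyperscaler - long_tail
print(f"Savings: ${savings:,.0f}/month per GPU ({savings / hyperscaler:.0%})")
```

With these assumed rates the difference is about $1,600 per GPU per month, roughly 55%, consistent with the 40–60% savings range above.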

→ For continuously updated cross-provider pricing, see Mercatus GPU Index.

Why the cloud price is what it is

Provider markup on H100 cloud rentals breaks down roughly as follows. These are 2026 industry estimates; exact margins vary by provider.

| Cost component (per GPU-hour) | Long-tail provider | Hyperscaler |
| --- | --- | --- |
| Hardware amortization (3yr, 75% util) | $0.55 | $0.55 |
| Power | $0.10 – $0.25 | $0.10 – $0.20 |
| Colocation | $0.08 – $0.15 | $0.05 – $0.10 |
| Networking + storage | $0.05 – $0.10 | $0.10 – $0.20 |
| Ops + customer support | $0.05 – $0.15 | $0.30 – $0.60 |
| Sales + marketing overhead | $0.05 – $0.10 | $0.50 – $1.00 |
| Margin | $0.20 – $0.50 | $1.00 – $2.00 |
| Total $/hr | $1.99 – $2.50 | $3.50 – $5.00 |

The hyperscaler premium is mostly sales overhead and margin, not infrastructure cost. Hyperscaler power and colocation are actually cheaper per kW than most long-tail providers (scale economies). What drives their per-GPU-hour up is the cost structure of selling enterprise contracts and the margin expectations of public-company shareholders.

Specialty providers (CoreWeave, Lambda) split the difference: hyperscaler-quality reliability with leaner sales overhead. Long-tail providers (often regional, often founder-led, often running on lower-cost regional power) win on raw price.

What an owned H100 actually costs per hour

If you buy an H100, your per-GPU-hour cost depends on five inputs:

  • Hardware capex ($31,000–$40,000 per GPU including server)
  • Useful lifespan (3–4 years for production; secondary market value at end-of-life)
  • Utilization rate (% of hours actually serving workload)
  • Power cost (depends heavily on region — $0.06–$0.25/kWh)
  • Colocation + ops (varies by deployment)

Worked example for a single H100 in a US Tier-3 colo, 70% utilization, 3-year horizon:

Hardware: $33,000 capex
− $8,000 estimated 3-year resale value (~25% retained)
= $25,000 net depreciation
÷ (3 years × 8,760 hours × 0.70 utilization)
= $1.36 / GPU-hour amortized

Power: 700W TDP × 1.4 PUE × $0.10/kWh = $0.098 / GPU-hour

Colo: $150/kW/month × 1.0 kW per GPU ÷ 730 hrs/month = $0.21 / GPU-hour

Ops + maintenance: ~$0.10 / GPU-hour (failures, replacements, monitoring)

Total effective cost: ~$1.77 / GPU-hour all-in

At 70% utilization, an owned H100 costs roughly $1.40–$1.80/GPU-hour depending on power and colocation specifics. At 90% utilization, it drops to roughly $1.20/hr. At 40%, it climbs to $2.50/hr+ — at which point cloud rentals from long-tail providers usually win.

The general formula:

Effective $/GPU-hour =
    (Capex − Resale_value) / (Lifespan_hours × Utilization)
    + Power_$/hr
    + Colo_$/hr
    + Ops_$/hr

Plug in your own numbers. The two variables that move the answer most are utilization and power cost.
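The formula translates directly into a small Python helper you can plug your own assumptions into. This sketch reproduces the worked example above; because the article rounds each line item before summing, it prints $1.76 rather than $1.77:

```python
def effective_gpu_hour_cost(
    capex: float,           # hardware capex per GPU, incl. server share
    resale_value: float,    # expected resale value at end of horizon
    years: float,           # amortization horizon
    utilization: float,     # fraction of hours actually serving workload
    power_kw: float,        # GPU power draw in kW (TDP)
    pue: float,             # data-center power usage effectiveness
    power_cost_kwh: float,  # electricity price, $/kWh
    colo_kw_month: float,   # colocation price, $/kW/month
    colo_kw: float,         # billed colo power per GPU, kW
    ops_hr: float,          # ops + maintenance, $/GPU-hour
) -> float:
    """Effective all-in $/GPU-hour for an owned GPU."""
    hours = years * 8_760
    amortization = (capex - resale_value) / (hours * utilization)
    power = power_kw * pue * power_cost_kwh
    colo = colo_kw_month * colo_kw / 730
    return amortization + power + colo + ops_hr

# The worked example above: US Tier-3 colo, 70% utilization, 3 years.
cost = effective_gpu_hour_cost(
    capex=33_000, resale_value=8_000, years=3, utilization=0.70,
    power_kw=0.700, pue=1.4, power_cost_kwh=0.10,
    colo_kw_month=150, colo_kw=1.0, ops_hr=0.10,
)
print(f"${cost:.2f} / GPU-hour")  # -> $1.76 / GPU-hour
```

Sweeping `utilization` is the fastest way to find your own buy-vs-rent break-even point.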

When does buying a single H100 make sense?

Almost never. The honest answer is that single-H100 ownership is rarely the right choice in 2026, and the reasons are structural:

Lumpy capex. $33,000 is a meaningful capital commitment for one GPU. Your finance team will scrutinize it harder than a $25,000/year cloud spend, even when the multi-year NPVs are comparable.

Operational overhead. A single H100 needs a colo contract, a server, networking, monitoring, replacement parts. The ops surface is the same as for 10 H100s — just amortized worse.

No utilization smoothing. A single GPU runs the workload you have or sits idle. With cloud, you pay for what you use.

Resale risk. A single GPU on the secondary market in 3 years is harder to dispose of than a node from a fleet rotation.

Single-H100 ownership makes sense in a narrow band of cases:

  • Deep, sustained 90%+ utilization with high workload predictability
  • Latency-sensitive workloads where colocating with your data is critical
  • Compliance constraints (data residency, security clearances) that prevent cloud usage
  • You already operate a colo and adding one GPU is marginal

Outside those scenarios, the math favors reserved cloud capacity from a long-tail provider for stable workloads, and on-demand from the same providers for variable workloads. Owning a single H100 is rarely the optimal answer.

For the cluster-scale version of this analysis — where ownership economics shift dramatically — see Total Cost to Own 100 H100 GPUs.

What if you already own H100s and have spare capacity?

Most owners don’t run their fleet at 90%+ utilization. Real-world cluster utilization sits at 40–70% for most teams — meaning a meaningful fraction of the GPU-hours you pay for go unused.

If you operate H100s with consistent spare capacity, that capacity has commercial value. You’re paying full ownership cost (capex amortization + power + colo + ops) whether the cards are running your workload or sitting idle.

Selling the idle capacity as inference tokens recovers cost without disrupting your primary workload. You list your endpoint on Mercatus, set your prices, and serve traffic from buyers who route through the unified API. Revenue at $1.50–$2.50/GPU-hour for unused capacity directly reduces your effective per-GPU-hour cost on the workload that actually matters.
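A rough sketch of the recoverable revenue, with hypothetical numbers (40% of hours idle, half of those sold at $2.00/GPU-hour; actual sell-through and prices vary by provider and demand):

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_recovery(idle_fraction: float, sold_fraction: float,
                     sell_price_hr: float) -> float:
    """Revenue per GPU per month from selling otherwise-idle capacity."""
    return HOURS_PER_MONTH * idle_fraction * sold_fraction * sell_price_hr

# Hypothetical: 40% of hours idle, half of them sold at $2.00/GPU-hour.
print(f"${monthly_recovery(0.40, 0.50, 2.00):,.0f}/GPU/month recovered")
```

Under these assumptions that is roughly $290 per GPU per month offsetting a fixed ownership cost you are paying either way.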

→ Become a Provider

How H100 cost translates to what buyers pay per token

If you’re an API buyer rather than a GPU operator, the H100 cost analysis above explains your per-token bill upstream.

When you pay $5/1M output tokens for GPT-4o, that price reflects: NVIDIA’s H100 hardware cost flowing through the provider’s amortization, plus their power and colo, plus their margin, plus the inefficiency of the closed market structure (which causes the same hardware to price 2.5× differently across providers).
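As a back-of-envelope illustration of how GPU-hour cost flows into token prices, consider the raw compute cost per million tokens before margin and overhead. The 1,000 output tokens/sec figure is a hypothetical serving throughput, not a benchmark of any specific model:

```python
def cost_per_million_tokens(gpu_hour_cost: float,
                            tokens_per_second: float) -> float:
    """Raw compute cost per 1M output tokens, before margin and overhead."""
    tokens_per_hour = tokens_per_second * 3_600
    return gpu_hour_cost * 1_000_000 / tokens_per_hour

# Hypothetical: a GPU share serving 1,000 output tokens/sec, priced at
# a hyperscaler-ish rate vs. a long-tail reserved rate.
for rate in (2.50, 1.30):
    print(f"${cost_per_million_tokens(rate, 1_000):.2f} / 1M tokens "
          f"at ${rate}/GPU-hour")
```

Under these assumptions the same throughput costs roughly twice as much per token on the pricier GPU-hour, which is exactly the spread that shows up in per-token API bills.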

The cross-provider H100 spread is exactly why per-token prices for the same model vary across API providers. For the connection between hardware economics and token pricing, see the manifesto The Open AI Compute Economy.

For day-to-day buying:

  • GPU-level pricing: GPU Index
  • Token-level pricing: Mercatus Spot Market

Frequently Asked Questions

How much does an H100 GPU cost to buy in 2026?

NVIDIA H100 SXM5 OEM pricing ranges from $25,000 to $30,000 in 2026, depending on volume and supplier. The PCIe variant is 10–15% cheaper. 8-GPU HGX servers cost $250,000–$320,000 fully built — about $31,000–$40,000 per GPU including server overhead.

How much does it cost to run an H100 per hour?

Cloud rentals: $1.99–$5.00/GPU-hour in 2026 depending on provider tier. Owned and amortized: roughly $1.40–$1.80/GPU-hour effective at 70% utilization on a 3-year horizon, including power and colocation.

Why do H100 prices differ 30%+ across cloud providers?

The 2.5× spread between hyperscaler on-demand and long-tail providers reflects sales overhead and margin structures, not infrastructure cost. Same hardware, very different cost structures.

Should I buy an H100 or rent one?

For a single H100, almost always rent. Reserved cloud capacity from a long-tail provider runs $1.30–$1.80/hr — within striking distance of owned economics with no ops burden.

How fast does an H100 depreciate?

H100s have held value remarkably well — secondary market prices in 2026 sit at 75–85% of original list for cards 18–30 months old. Expect ~25% loss over 3 years, more if Blackwell shipping accelerates.

What’s the cheapest way to access an H100 right now?

Spot/preemptible instances on long-tail providers ($1.50–$2.20/hr) for batch workloads that tolerate interruption. Reserved 1-year contracts ($1.60–$2.10/hr) for steady production workloads.

What about used or refurbished H100s?

A small but growing secondary market exists. Refurb H100s from data center decommissioning sell for $18,000–$22,000 in 2026, typically with 12-month limited warranties.

Can I sell capacity on H100s I already own?

Yes. List your inference endpoint on Mercatus, set your prices, reach buyers without sales overhead.

→ Become a Provider

Methodology

Pricing in this article is derived from Mercatus GPU Index, which tracks H100 on-demand and reserved cloud pricing across 22+ providers globally, refreshed daily. OEM hardware pricing reflects public quotes from Supermicro, Dell, and HPE. Power, colocation, and operational cost estimates are derived from typical US Tier-3 colocation contracts (2026). Last verified: 2026-05-04.