A 100-GPU H100 cluster needs roughly 145 kW of contracted power capacity. At $80/kW/month wholesale rates, that’s $11,600/month. At $300/kW/month in a Tier 4 New York facility, it’s $43,500/month. 3.75× variance for the same physical infrastructure — and the cost flows directly into every GPU-hour of compute output.
This guide explains how colocation pricing works, what drives the cross-market variance, how to translate $/kW/month rates into per-GPU-hour cost, and where the optimization levers sit. For the broader owner-side TCO context: 100 H100 Cluster TCO.
TL;DR
- Colocation is priced per kW of contracted power capacity, not per square foot or per server.
- 2026 market rates range from $80/kW/month (wholesale) to $300/kW/month (premium metro retail).
- Top-tier metro markets (NYC, SF, London, Tokyo) charge premium rates; lower-tier markets (Phoenix, Atlanta, Stockholm) run 30–50% cheaper.
- PUE matters as much as the rate — a facility with PUE 1.2 delivers ~7% more useful compute per dollar than one at PUE 1.5.
- Wholesale build-to-suit pricing ($50–$100/kW/month) requires 5+ MW commitments, which typically means 3,000+ GPU scale.
How colocation pricing works
Colocation contracts price by contracted power capacity measured in kilowatts. The basic structure:
```text
Monthly cost = Contracted_kW × $/kW/month rate
Annual cost  = Monthly cost × 12
```

You’re paying for power delivery capacity, not actual consumption. If you contract 100 kW and only use 70 kW on average, you still pay for 100 kW. This makes power capacity sizing a real economic decision.
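To make the capacity-versus-consumption point concrete, here is a minimal Python sketch. The function names and the $150/kW/month rate are illustrative assumptions; the 100 kW contracted and 70 kW used figures come from the example above.

```python
def monthly_colo_cost(contracted_kw: float, rate_per_kw_month: float) -> float:
    """Monthly bill: contracted capacity times the $/kW/month rate."""
    return contracted_kw * rate_per_kw_month


def effective_rate_per_used_kw(
    contracted_kw: float, avg_used_kw: float, rate_per_kw_month: float
) -> float:
    """Cost of each kW actually drawn once unused capacity is counted."""
    return monthly_colo_cost(contracted_kw, rate_per_kw_month) / avg_used_kw


if __name__ == "__main__":
    rate = 150.0  # assumed illustrative rate, $/kW/month
    monthly = monthly_colo_cost(100, rate)
    effective = effective_rate_per_used_kw(100, 70, rate)
    print(f"Monthly: ${monthly:,.0f}  Annual: ${monthly * 12:,.0f}")
    print(f"Effective rate per used kW: ${effective:,.2f}/month")
```

Under these assumptions, the unused 30 kW raises the effective rate from $150 to about $214 per kW actually drawn.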
What’s typically included in $/kW/month:
- Rack space (proportional to power capacity, typically 1 rack per 5–10 kW)
- Power delivery infrastructure (PDUs, UPS, generators)
- Cooling infrastructure (CRAC units, cold aisle containment, etc.)
- Network connectivity (basic — fiber pairs, peering)
- Physical security (badge access, cameras, mantraps)
- Smart hands (basic remote support, plugging cables)
Not typically included:
- High-speed networking fabric (your responsibility)
- Storage subsystems
- Compliance attestations beyond facility-level
- Premium support tiers
- Hardware procurement and installation
The PUE multiplier
The $/kW/month rate is for IT load (the power your equipment actually draws). The facility consumes additional power for cooling, lighting, and overhead. This is captured in Power Usage Effectiveness (PUE):
```text
PUE = Total Facility Power / IT Equipment Power
```

| PUE level | What it indicates |
|---|---|
| 1.5 | Industry average; older or less efficient facilities |
| 1.3 | Decent newer facility |
| 1.2 | AI-optimized colocation in 2026 |
| 1.1 | Best-in-class hyperscaler facility |
| 1.05 | Cutting-edge with liquid cooling |
In most colocation contracts, the cooling overhead is bundled into the $/kW/month rate: the colo provider handles cooling and you pay for IT capacity. So you don’t pay separately for the PUE overhead. But PUE still affects you indirectly: a facility with worse PUE has higher operating costs, which it passes through to your $/kW rate over time.
For AI workloads at high power density, AI-optimized facilities with PUE 1.2 or better are worth a small premium over older general-purpose colos at PUE 1.5+.
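As a rough illustration of the PUE definition, the sketch below computes the total facility draw and the non-IT overhead behind a 145 kW IT load at each PUE level in the table. The helper names are hypothetical, and the numbers describe the facility's power draw, not what appears on your invoice.

```python
def facility_power_kw(it_kw: float, pue: float) -> float:
    """Total facility power implied by PUE = total / IT."""
    return it_kw * pue


def overhead_kw(it_kw: float, pue: float) -> float:
    """Power spent on cooling, lighting, and other non-IT overhead."""
    return it_kw * (pue - 1.0)


if __name__ == "__main__":
    it_load_kw = 145  # 100-GPU H100 cluster, per this guide
    for pue in (1.5, 1.3, 1.2, 1.1, 1.05):
        total = facility_power_kw(it_load_kw, pue)
        extra = overhead_kw(it_load_kw, pue)
        print(f"PUE {pue:<4}: facility {total:6.1f} kW, overhead {extra:5.1f} kW")
```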
2026 market rates by tier
| Market tier | $/kW/month | Examples |
|---|---|---|
| Premium metro retail | $200 – $300 | Tier 4 facilities in NYC, SF, London, Tokyo, HK |
| Standard metro | $130 – $180 | Dallas, Chicago, Frankfurt, Sydney, Singapore |
| Lower-tier metro | $80 – $130 | Phoenix, Atlanta, Stockholm, Toronto, Mumbai |
| Wholesale build-to-suit | $50 – $100 | 5+ MW commitments; bespoke arrangements |
Three observations:
1. Geographic variance is enormous
A 100-GPU cluster in NYC (Tier 4, $250/kW/month) costs $36K/month for colocation. The same cluster in Phoenix ($110/kW/month) costs $16K/month — saving $240K/year on a single line item. For most workloads, the geographic differential isn’t a tradeoff; latency to most users is similar, and the savings are real.
2. Wholesale tier is a different product
At 5+ MW commitments (typically 3,000+ GPU scale), wholesale build-to-suit pricing kicks in at $50–$100/kW/month. This is where hyperscalers operate. For most institutional buyers, it’s not accessible — but it explains why hyperscalers’ cost structure for power-intensive infrastructure is genuinely lower than retail-tier colocation.
3. Geographic redundancy costs more
Multi-region deployments multiply colocation cost. Running 100 GPUs in two metros doubles the contracted power capacity (and the bill) for redundancy. For most workloads, single-region with disaster-recovery cold standby beats dual-region active-active on cost.
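The geographic and redundancy arithmetic above can be sketched in a few lines of Python. The function and its `regions` parameter are illustrative assumptions; the 145 kW load and the $250 and $110 rates are the figures used in this section.

```python
IT_LOAD_KW = 145  # 100-GPU H100 cluster, per this guide


def annual_colo_cost(contracted_kw: float, rate_per_kw_month: float, regions: int = 1) -> float:
    """Annual colocation bill; each extra active region repeats the full contract."""
    return contracted_kw * rate_per_kw_month * 12 * regions


if __name__ == "__main__":
    nyc = annual_colo_cost(IT_LOAD_KW, 250)
    phoenix = annual_colo_cost(IT_LOAD_KW, 110)
    phoenix_dual = annual_colo_cost(IT_LOAD_KW, 110, regions=2)
    print(f"NYC single region:      ${nyc:,.0f}/year")
    print(f"Phoenix single region:  ${phoenix:,.0f}/year")
    print(f"Savings NYC to Phoenix: ${nyc - phoenix:,.0f}/year")
    print(f"Phoenix-rate dual region (active-active): ${phoenix_dual:,.0f}/year")
```

The dual-region line assumes both sites bill at the same rate, which is a simplification.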
For an interactive map of AI-optimized colocation facilities globally with current pricing: Mercatus Compute Atlas.
Translating $/kW/month to per-GPU-hour cost
A worked example for a 100-GPU H100 cluster:
```text
Cluster IT load: 145 kW
Colocation rate: $150/kW/month
Monthly colo cost: $21,750
Annual colo cost: $261,000
3-year colo cost: $783,000
```

Per GPU-hour at 70% utilization:
```text
Per GPU per year: $2,610 colocation
Per GPU per powered hour: $0.30
Per GPU per useful hour at 70% util: $0.43
```

Across the cluster, colocation runs ~$0.30 per powered GPU-hour and ~$0.43 per useful GPU-hour at standard rates, a meaningful 13–20% of the $2.30–$3.30 total cost per GPU-hour.
For different colocation tiers, the same 100-GPU cluster costs:
| Colo tier | Monthly cost | $/GPU-hour at 70% util |
|---|---|---|
| Premium metro ($250/kW) | $36,250 | $0.71 |
| Standard metro ($150/kW) | $21,750 | $0.43 |
| Lower-tier metro ($110/kW) | $15,950 | $0.31 |
| Wholesale ($90/kW) | $13,050 | $0.26 |
Geographic colocation choice can swing per-GPU-hour cost by $0.45 — almost as much as the entire ops budget for the same cluster.
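The conversion behind the table can be reproduced with a short Python sketch. The function name and the `TIERS` dictionary are illustrative; the rates, the 145 kW load, and the 70% utilization are the assumptions stated in this guide.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours

# Representative rates from the table above, in $/kW/month.
TIERS = {
    "Premium metro": 250,
    "Standard metro": 150,
    "Lower-tier metro": 110,
    "Wholesale": 90,
}


def colo_cost_per_gpu_hour(
    it_load_kw: float, rate_per_kw_month: float, gpus: int, utilization: float = 1.0
) -> float:
    """Colocation cost per GPU-hour; utilization < 1 spreads cost over useful hours only."""
    annual_cost = it_load_kw * rate_per_kw_month * 12
    return annual_cost / gpus / HOURS_PER_YEAR / utilization


if __name__ == "__main__":
    for tier, rate in TIERS.items():
        monthly = 145 * rate
        per_useful_hour = colo_cost_per_gpu_hour(145, rate, gpus=100, utilization=0.70)
        print(f"{tier:<17} ${monthly:>7,}/month  ${per_useful_hour:.2f}/GPU-hour")
```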
Hidden colocation costs
Five line items that don’t show up in the $/kW/month rate but show up in real bills:
1. Setup and remote hands fees
One-time setup fees (rack installation, network configuration) typically run $5,000–$15,000 per cluster deployment. Ongoing remote hands services (beyond basic) cost $150–$300 per service ticket. For active operations, plan $10,000–$30,000/year in remote hands beyond included services.
2. Bandwidth and transit
Most colocation contracts include basic connectivity but charge for high-bandwidth transit. For AI workloads with significant inference traffic, transit costs run $2,000–$8,000/month. This is often ignored in colo TCO models.
3. Cross-connects and meet-me rooms
If you need to connect to specific cloud providers, networks, or peering exchanges, cross-connects cost $200–$500/month each. A typical AI cluster needs 4–8 cross-connects, adding $1,000–$4,000/month.
4. Power capacity over-commitment
You contract for the peak power your cluster might draw, not the average. A 100-GPU H100 cluster’s peak draw at full load is ~165 kW; with a safety margin, you’d contract 175–200 kW. That 20–30% over-commitment is a real cost, paid for capacity you’ll rarely fully use.
5. Compliance attestations beyond facility-level
If your customers require specific compliance (HIPAA, FedRAMP, ISO 27001, SOC 2 Type II) and the facility’s attestations don’t cover your specific deployment, you’ll need additional audits and security controls. Plan $30,000–$80,000/year for compliance overhead at this layer.
Total hidden colocation costs for a 100-GPU cluster: typically $80,000–$150,000/year above the headline $/kW/month rate. This is roughly 30–50% additional on top of the base contract — a significant TCO line item often missed in initial planning.
How colocation cost flows into token prices
When you pay $5/1M output tokens for an LLM API in 2026, a portion of that price (typically 5–15%) reflects colocation costs at the underlying provider’s facilities. Per million output tokens of a 70B model:
- Compute-cost component: ~$2–3
- Colocation share: ~$0.30–$0.70
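As a quick sanity check on the stated share, this sketch divides the colocation component by the $5 per million output tokens price from the example above. The function name is hypothetical, and the inputs are this section's own figures rather than measured provider data.

```python
def colo_share_of_price(colo_per_m_tokens: float, price_per_m_tokens: float) -> float:
    """Fraction of the per-million-token price attributable to colocation."""
    return colo_per_m_tokens / price_per_m_tokens


if __name__ == "__main__":
    price = 5.00  # $ per 1M output tokens, from the example above
    low = colo_share_of_price(0.30, price)
    high = colo_share_of_price(0.70, price)
    print(f"Colocation share of token price: {low:.0%} to {high:.0%}")
```

The result falls inside the 5–15% range quoted above.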
Cross-provider per-token pricing variance reflects underlying colocation cost variance among providers. Providers in low-cost markets pass savings through; providers in premium metros don’t.
This is the supply-side reality that makes Mercatus Token Index useful — it surfaces the cleared price across providers, not the list price at any single one.
For the full thesis on why this market structure is changing: The Open AI Compute Economy.
What this means for owner-operators (and Providers)
If you operate AI infrastructure, colocation choice is one of three or four cost levers worth aggressive optimization. For institutional buyers building toward the 5+ MW commitments that wholesale requires (roughly 3,000+ GPUs), wholesale-tier facilities ($50–$100/kW/month) save $1M+/year vs standard metro colocation, which more than justifies the operational complexity of negotiating directly with facility operators.
For Providers listing inference capacity on Mercatus: facility location and power efficiency directly affect your competitive cost structure. Providers operating in low-cost markets with PUE 1.2 facilities can offer per-token pricing that hyperscalers in PUE 1.5 metros structurally can’t match.
→ Become a Provider to monetize cluster capacity. → Compute Atlas for the global facility map.
Frequently Asked Questions
How much does colocation cost for AI compute in 2026?
$80–$300/kW/month depending on market tier and contract size. Standard metro markets (Dallas, Chicago, Frankfurt) run $130–$180/kW/month; premium metros (NYC, SF) run $200–$300; lower-tier metros run $80–$130; wholesale build-to-suit at 5+ MW commitments runs $50–$100.
What’s the colocation cost per GPU-hour for an H100 cluster?
For a 100-GPU H100 cluster (145 kW IT load) at the $150/kW/month standard metro rate: $0.30 per powered GPU-hour, or $0.43 per useful GPU-hour at 70% utilization. At wholesale rates ($90/kW), it drops to $0.26 per useful hour.
Should I choose lower-cost markets to save on colocation?
Yes, for most workloads. The differential between premium and lower-tier metros is $100–$170/kW/month, which for a 100-GPU cluster (145 kW) works out to roughly $174K–$296K/year in savings. Latency to most users is similar across continental US/EU, so the savings rarely come with an offsetting cost.
How does PUE affect colocation cost?
PUE is bundled into the $/kW/month rate at most facilities. But facilities with worse PUE have higher operating costs they pass through. AI-optimized facilities at PUE 1.2 are worth a small premium over older facilities at PUE 1.5+ for high-power-density deployments.
What’s wholesale colocation and when does it apply?
Build-to-suit wholesale facilities are bespoke arrangements at 5+ MW commitments, typically 3,000+ H100 scale. Pricing drops to $50–$100/kW/month. Most institutional buyers don’t reach this scale. Hyperscalers operate primarily in this tier, which explains some of their cost advantage.
Where can I find AI-optimized colocation facilities?
Mercatus Compute Atlas maps AI-optimized colocation facilities globally with PUE, power capacity, and pricing data. Useful for cluster siting decisions and provider operational benchmarking.
Methodology
Colocation pricing data sourced from Mercatus Compute Atlas, May 2026 snapshot. Cost calculations assume 100-H100 cluster with 145 kW IT load, 70% utilization. Hidden cost estimates derived from typical institutional colocation contracts. Last verified: 2026-05-04.
