Per-hour GPU rental
Billed by the second after the first minute, only while your deployment is running.
Transparent per-hour GPU pricing. Your OpenAI-compatible endpoint runs on the GPU you rent — no per-token fees, no surprises.

| GPU | VRAM | vCPU | RAM | Price | Deploy |
|---|---|---|---|---|---|
| More than 80 GB VRAM | |||||
| B200 | 180 GB | 28 vCPU | 283 GB | $5.89 /hr | Deploy |
| H200 | 141 GB | 24 vCPU | 276 GB | $4.39 /hr | Deploy |
| H100 NVL | 94 GB | 16 vCPU | 94 GB | $3.19 /hr | Deploy |
| RTX Pro 6000 | 96 GB | 16 vCPU | 188 GB | $2.09 /hr | Deploy |
| 80 GB VRAM | |||||
| H100 SXM | 80 GB | 20 vCPU | 125 GB | $3.29 /hr | Deploy |
| H100 PCIe | 80 GB | 16 vCPU | 188 GB | $2.89 /hr | Deploy |
| A100 SXM | 80 GB | 16 vCPU | 125 GB | $1.49 /hr | Deploy |
| A100 PCIe | 80 GB | 8 vCPU | 117 GB | $1.39 /hr | Deploy |
| 48 GB VRAM | |||||
| L40 | 48 GB | 8 vCPU | 94 GB | $0.99 /hr | Deploy |
| L40S | 48 GB | 16 vCPU | 94 GB | $0.86 /hr | Deploy |
| RTX 6000 Ada | 48 GB | 10 vCPU | 167 GB | $0.77 /hr | Deploy |
| RTX A6000 | 48 GB | 9 vCPU | 50 GB | $0.49 /hr | Deploy |
| A40 | 48 GB | 9 vCPU | 50 GB | $0.44 /hr | Deploy |
| 32 GB VRAM | |||||
| RTX 5090 | 32 GB | 9 vCPU | 35 GB | $0.99 /hr | Deploy |
| 24 GB VRAM | |||||
| RTX 4090 | 24 GB | 6 vCPU | 41 GB | $0.69 /hr | Deploy |
| RTX 3090 | 24 GB | 16 vCPU | 125 GB | $0.46 /hr | Deploy |
| L4 | 24 GB | 12 vCPU | 50 GB | $0.39 /hr | Deploy |
| RTX A5000 | 24 GB | 9 vCPU | 25 GB | $0.27 /hr | Deploy |
Indicative pricing — final rates vary by region & availability.
Billed by the second after the first minute, only while your deployment is running.
You rent the GPU; the inference throughput it produces is entirely yours.
Same GPU pricing whether you use our catalog, your weights, or your container.
| Type | Price |
|---|---|
| Volume disk | $0.10 / GB / mo |
| Network storage | $0.07 / GB / mo |
Indicative pricing — final rates vary by region & availability.
Per hour while a deployment is running. Stop it and billing stops — you only pay for the time your GPU is up.
No. You rent the GPU, and the OpenAI-compatible endpoint that runs on it is included. There are no per-token or per-request fees.
Yes — push your own weights or a custom container. It runs on the same per-hour GPU pricing as the catalog.
No. Pricing is on-demand and pay as you go. Reserved-capacity discounts are coming later.
More questions? Get started and reach us from the dashboard.