Self-hosting calculator
Example output
| Metric | Value |
| --- | --- |
| VRAM needed | ~35 GB |
| Minimum hardware | 2× A100 80GB |
| Build cost | $30,000 |
| Plugsky flat rate | $60/mo, no hardware |
| Breakeven (vs Plugsky) | ~500 months |
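
The breakeven row is the one-time build cost divided by the monthly flat rate. A minimal sketch of that arithmetic in Python, using the example values from the table. The function name `breakeven_months` is hypothetical, and the calculation ignores power, hosting, maintenance, and resale value, none of which the example output itemizes.

```python
def breakeven_months(build_cost_usd: float, monthly_fee_usd: float) -> float:
    """Months of subscription fees that add up to the one-time build cost."""
    if monthly_fee_usd <= 0:
        raise ValueError("monthly_fee_usd must be positive")
    return build_cost_usd / monthly_fee_usd


# Example values taken from the table: $30,000 build vs. $60/mo flat rate.
months = breakeven_months(30_000, 60)
print(f"Breakeven: ~{months:.0f} months ({months / 12:.1f} years)")
# prints "Breakeven: ~500 months (41.7 years)"
```

Adding recurring self-hosting costs such as electricity would push the breakeven point further out, so this figure is best read as a lower bound.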
Frequently asked questions
How much VRAM do I need to run Llama 70B?
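Llama-3.1-70B requires ~48 GB VRAM at Q4 quantization: roughly 35 GB for the 4-bit weights, plus room for the KV cache and runtime overhead. That fits on a single A100 80GB; a 2× A100 80GB setup leaves headroom for longer contexts and batching. At FP16 full precision, the weights alone need ~140 GB VRAM (2–4 GPUs, depending on per-card memory).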
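
The estimates above can be approximated with a common rule of thumb: weight memory is roughly the parameter count times the bytes stored per parameter. A rough sketch, assuming 70 billion parameters, 0.5 bytes per parameter at Q4, and 2 bytes at FP16. The `weights_gb` helper and the 1 GB = 10^9 bytes convention are illustrative choices, not part of any calculator API. It covers weights only, so the Q4 result comes out below the ~48 GB quoted, presumably because that figure also budgets for KV cache and runtime overhead.

```python
# Assumed storage cost per parameter for each precision (weights only).
BYTES_PER_PARAM = {"q4": 0.5, "fp16": 2.0}


def weights_gb(params_billions: float, precision: str) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    params_billions * 1e9 params * bytes_per_param / 1e9 bytes-per-GB
    simplifies to params_billions * bytes_per_param.
    """
    return params_billions * BYTES_PER_PARAM[precision]


for precision in ("q4", "fp16"):
    print(f"70B @ {precision}: ~{weights_gb(70, precision):.0f} GB for weights")
# prints:
# 70B @ q4: ~35 GB for weights
# 70B @ fp16: ~140 GB for weights
```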
Last updated Jul 2026. Prices and availability verified at time of writing — check provider pages for current rates.