Self-hosting requirements calculator

The self-host LLM requirements calculator estimates the hardware you need to run a model locally. Pick a model size (8B, 70B, or 405B) and a quantization level to see the VRAM required, the number of GPUs, and the total build cost, then compare the result against Plugsky's flat pricing.

Example output

VRAM needed: ~35 GB
Minimum hardware: 2× A100 80GB
Build cost: $30,000
Plugsky flat: $60/mo (no hardware)
Breakeven (vs Plugsky): ~500 months

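The breakeven figure in the example output is consistent with dividing the build cost by the monthly fee. A minimal sketch of that arithmetic follows. It assumes the calculator uses this plain division (the page does not state its formula), it ignores electricity, hosting, and maintenance, and the function name is hypothetical.

```python
def breakeven_months(build_cost_usd: float, monthly_fee_usd: float) -> float:
    """Months of flat fees that add up to the one-time hardware build cost.

    Assumption: plain division, with no power, hosting, or upkeep costs.
    """
    if monthly_fee_usd <= 0:
        raise ValueError("monthly_fee_usd must be positive")
    return build_cost_usd / monthly_fee_usd


# Figures taken from the example output above.
months = breakeven_months(30_000, 60)
print(f"{months:.0f} months (~{months / 12:.1f} years)")
# prints "500 months (~41.7 years)"
```
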
Frequently asked questions
How much VRAM do I need to run Llama 70B?

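Llama-3.1-70B needs roughly 48 GB of VRAM at Q4 quantization: about 35 GB for the weights, plus headroom for the KV cache and runtime overhead (2× A100 80GB or similar). At FP16 full precision, the weights alone take about 140 GB, so the model has to be split across 2 to 4 GPUs.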
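These figures can be roughly reproduced from the parameter count and the bytes stored per parameter. The sketch below is a back-of-the-envelope estimate, not the calculator's actual method. The function names and the byte table are assumptions for illustration, it counts weights only (KV cache and runtime overhead come on top), and real Q4 formats often use slightly more than 4 bits per weight.

```python
import math

# Nominal storage cost per parameter, in bytes (assumed values).
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}


def weight_vram_gb(params_billions: float, quant: str) -> float:
    """VRAM for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of params * bytes per param = billions of bytes = GB
    return params_billions * BYTES_PER_PARAM[quant]


def gpus_needed(vram_gb: float, gpu_vram_gb: float = 80.0) -> int:
    """Smallest number of identical GPUs whose combined memory covers vram_gb."""
    return math.ceil(vram_gb / gpu_vram_gb)


print(weight_vram_gb(70, "q4"))    # 35.0
print(weight_vram_gb(70, "fp16"))  # 140.0
print(gpus_needed(weight_vram_gb(70, "fp16")))  # 2 (with 80 GB cards)
```

Weights-only numbers are a lower bound, so budget extra memory for context length and batch size before choosing hardware.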

Last updated Jul 2026. Prices and availability verified at time of writing — check provider pages for current rates.

Skip the cluster — run sovereign on Plugsky

Self-host Plugsky or use our cloud. From $99/mo.