Self-hosting requirements calculator

The self-host LLM requirements calculator estimates the hardware you need to run a model locally. Pick a model size (8B, 70B, or 405B), choose your quantization level, and see the approximate VRAM, GPU count, and total build cost. Then compare the result against Plugsky's flat pricing.

Self-hosting calculator

Inputs: model size, quantization

Example output

VRAM needed	~35 GB
Minimum hardware	2× A100 80GB
Build cost	$30,000
Plugsky flat	$60/mo, no hardware
Breakeven (vs Plugsky)	~500 months
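
The breakeven row is simple arithmetic: the one-time build cost divided by the monthly subscription price. A minimal Python sketch, assuming a hypothetical helper named `breakeven_months` and ignoring electricity, hosting, and maintenance, which the example output does not list:

```python
def breakeven_months(build_cost_usd: float, monthly_price_usd: float) -> float:
    """Months of subscription payments that add up to the one-time build cost.

    Hypothetical helper for illustration; running costs are not modeled.
    """
    if monthly_price_usd <= 0:
        raise ValueError("monthly price must be positive")
    return build_cost_usd / monthly_price_usd


# Figures from the example output: $30,000 build vs. $60/mo flat pricing.
print(breakeven_months(30_000, 60))  # 500.0
```

Adding running costs to the self-hosted side would push the breakeven point further out, so this figure is best read as a lower bound.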

Frequently asked questions

How much VRAM do I need to run Llama 70B?

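Llama-3.1-70B requires ~48 GB of VRAM at Q4 quantization (2× A100 80GB or similar). The weights alone take about 35 GB at 4 bits per parameter, which leaves the remainder as headroom for the KV cache and runtime overhead. At full FP16 precision it needs ~140 GB of VRAM (2 to 4 GPUs, depending on per-card memory).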
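
The weight-only part of these figures can be reproduced with a rough rule of thumb: parameter count times bits per parameter, divided by 8 to get bytes. The sketch below is an illustration under stated assumptions, not the calculator's actual formula. The function names and the `overhead` multiplier (a stand-in for KV cache and runtime buffers) are hypothetical, and 1 GB is taken as 10^9 bytes.

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_param / 8


def estimate_vram_gb(params_billions: float, bits_per_param: float,
                     overhead: float = 1.2) -> float:
    """Weights scaled by an assumed overhead factor.

    The 1.2 default is a placeholder assumption, not a measured value;
    real overhead depends on context length, batch size, and runtime.
    """
    return weight_memory_gb(params_billions, bits_per_param) * overhead


# A 70B-parameter model, weights only:
print(weight_memory_gb(70, 4))   # 35.0  (Q4)
print(weight_memory_gb(70, 16))  # 140.0 (FP16)
```

The weight-only results line up with the ~35 GB and ~140 GB figures above. Estimates that include working memory, such as the ~48 GB Q4 figure, will be higher.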

Last updated Jul 2026. Prices and availability verified at time of writing — check provider pages for current rates.

Skip the cluster — run sovereign on Plugsky

Self-host Plugsky or use our cloud. From $99/mo.