Most AI APIs charge per token, which means your bill scales with usage. That's fine for prototyping, but at production scale it becomes unpredictable. A single bad prompt loop can spike your bill 10x overnight.
Plugsky's self-serve plans are unlimited on every model in your tier, billed by the month. You commit to the price, we commit to unlimited usage.
Why unlimited matters
- Predictability: you know the bill before you ship. No meter, no surprises.
- No peak-hour throttling: this is not a free tier that slows down under load. Plans are built for real production traffic, not toy demos.
- Encourages innovation: when every call is metered, teams avoid experimentation. When calls are unlimited, teams iterate faster.
- Better unit economics at scale: once you're past ~20M tokens/month, flat pricing costs less than per-token billing from any public AI vendor.
What's included in each tier
Every plan includes unlimited monthly usage: no token meter, no usage-based throttling, no surprise bills. The only cap is the requests-per-minute rate limit, which scales with the plan.
| Plan | Monthly | Models in tier | Token usage / month | Rate limit (req/min) |
|---|---|---|---|---|
| Starter | $20 | micro → pro | Unlimited | 60 |
| Builder | $60 | + max, vision | Unlimited | 300 |
| Scale | $120 | + frontier, reasoning | Unlimited | 1,000 |
What's "unlimited"?
Use as many tokens as you want per month. The only cap is the requests-per-minute rate limit.
What's the rate limit?
The number of API calls you can make in one minute. If you hit it, the API returns a 429 with a Retry-After header.
Embeddings, RAG queries, and agent runs all share the same unlimited pool: no add-ons, no per-feature pricing.
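Since the rate limit is the only cap, a client can avoid most 429s by pacing its own requests. Here is a minimal sketch of that idea. `Pacer` and `RATE_LIMITS` are illustrative names, not part of any Plugsky SDK, and the page does not say how the one-minute window is measured, so the sketch simply spaces requests evenly, which should be conservative under most enforcement schemes.

```python
import time

# Requests-per-minute limits copied from the plan table.
RATE_LIMITS = {"starter": 60, "builder": 300, "scale": 1000}


class Pacer:
    """Spaces calls evenly so they stay at or under a requests-per-minute limit."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        # Minimum gap between two consecutive requests, in seconds.
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_slot = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request slot is free; return the seconds waited."""
        now = self._clock()
        if self._next_slot is None or now >= self._next_slot:
            # Idle long enough: go immediately and reserve the following slot.
            delay = 0.0
            self._next_slot = now + self.interval
        else:
            # Too early: sleep until the reserved slot, then reserve the next one.
            delay = self._next_slot - now
            self._sleep(delay)
            self._next_slot += self.interval
        return delay
```

Call `pacer.wait()` before each API request, for example with `pacer = Pacer(RATE_LIMITS["starter"])`. The `clock` and `sleep` parameters exist only so the pacing logic can be tested without real delays.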
When per-token is better
Per-token pricing beats flat pricing only at very low usage. If you use less than ~5M tokens/month, you're probably better off on plugsky-micro (free), or on the $20 Starter plan, which works out to roughly the same cost as per-token billing at that volume.
Past 50M tokens/month, flat pricing on Plugsky Builder ($60) saves you 60-90% vs. OpenAI gpt-4o for the same workload. Use the calculator for your exact numbers.
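The comparison comes down to simple arithmetic, which can be sketched as follows. The function names are illustrative, and the $4.00 per million tokens used in the example is a placeholder blended rate, not a quoted price from any vendor. Substitute the real input/output rates for your workload.

```python
def break_even_tokens(flat_monthly_usd, usd_per_million_tokens):
    """Monthly token volume at which a flat plan costs the same as per-token billing."""
    if usd_per_million_tokens <= 0:
        raise ValueError("per-token rate must be positive")
    return flat_monthly_usd / usd_per_million_tokens * 1_000_000


def savings_fraction(flat_monthly_usd, usd_per_million_tokens, tokens_per_month):
    """Fraction saved by the flat plan; negative means per-token billing is cheaper."""
    metered_usd = tokens_per_month / 1_000_000 * usd_per_million_tokens
    if metered_usd <= 0:
        raise ValueError("metered cost must be positive")
    return 1 - flat_monthly_usd / metered_usd


# Example with an ASSUMED blended rate of $4.00 per million tokens (placeholder).
ASSUMED_RATE = 4.0
builder_break_even = break_even_tokens(60, ASSUMED_RATE)          # 15,000,000 tokens
builder_savings = savings_fraction(60, ASSUMED_RATE, 50_000_000)  # about 0.70
```

Under that assumed rate, the $60 Builder plan breaks even at 15M tokens/month, and usage below the break-even point gives a negative savings fraction, meaning per-token billing would be cheaper.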
Frequently asked questions
Is there really no per-token meter?
Correct. On self-serve plans (Starter, Builder, Scale), you can call the API as much as you want. The only cap is the rate limit: 60, 300, or 1,000 req/min, depending on the plan.
What about embeddings and RAG?
Same — unlimited on every plan. Embeddings, RAG queries, agent runs, and tool calls all share the same flat pricing.
What if I exceed the rate limit?
You get a 429 with a Retry-After header. Upgrade to the next tier for higher rate limits, or contact sales for custom limits.
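On the client side, the usual way to handle a 429 is to wait for the time given in Retry-After and try again. Here is a minimal sketch using only the standard library. `send` is a hypothetical zero-argument callable that performs one request and returns `(status, headers, body)`, with `headers` as a plain dict. It stands in for whatever HTTP client you use and is not a Plugsky API. The one-second fallback and the five-attempt limit are also assumptions.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0):
    """Parse a Retry-After header value: delta-seconds or an HTTP-date."""
    if value is None:
        return default
    try:
        return max(0.0, float(value))  # e.g. "2"
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default wait
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Rate limited: honor the server's requested wait before retrying.
            sleep(retry_after_seconds(headers.get("Retry-After")))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

If 429s are frequent rather than occasional, retries only delay the work, and that is the point at which a higher tier makes sense.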
Is there a fair-use policy?
Yes. Self-serve plans are covered by a fair-use policy that prohibits abuse and scraping. Beyond that, the per-tier rate limit is the only cap. Enterprise contracts have custom limits.
Start unlimited for $5
7-day trial for $5. Then Starter is $20/mo flat — unlimited usage, 18+ models.
Start $5 trial → See full pricing