One OpenAI-compatible API for every top model — predictable pricing, no rate caps. Run it on our cloud, your cloud, or on-prem in any country.
$ curl https://api.plugsky.com/v1/chat/completions \ -H "Authorization: Bearer sk-live-…" \ -d '{ "model": "plugsky-pro", "stream": true }' { "region": "auto", "latency_ms": 38, "content": "Hello 👋 Any model, your region, one API." }
Figures sourced from the Plugsky data room and pending independent validation before external distribution.
Backed by a world-class ecosystem — NVIDIA, Microsoft, OpenAI, AWS, Google, PayPal, Meta, Shopify — across enterprise AI, fintech, government and SaaS.
Call 31+ models through one OpenAI-compatible API — flat pricing from $5, no per-token bills, switch from lite to frontier in a single line.
White-label models with flat, predictable pricing — built-in routing, auto-fallback, and team seats that protect your margins as you scale.
Private, sovereign deployment in your own cloud or region — SSO/SAML, audit logs, GDPR & PDPL alignment, and SLAs up to 99.99%.
Global platforms impose rate caps, unpredictable token costs, and zero control over where your data runs. Plugsky removes those limits for any team, anywhere.
Throughput caps and queueing throttle production AI products at the worst time — peak demand.
Pure token billing makes margins impossible to plan. SaaS teams need fixed, high-volume packages.
Sending sensitive data to third-party clouds you don't control is a non-starter for regulated teams.
Six core products, one OpenAI-compatible API, one bill, one account. Each stands on its own — but they click together the moment you turn them on.
Call 31+ models from one OpenAI-compatible API — lite to frontier, switch in a single line. Free, paid and embedding models side by side.
Build and run agents with function-calling, memory, and orchestration. OpenAI Assistants-compatible for one-line migration.
Open Agents →Wire agents into your tools, APIs and data sources. Bring-your-own Python functions, webhooks, HTTP actions — no glue code.
Browse tools →Private RAG over your own docs with 4096-dim embeddings. Upload PDFs, code, knowledge bases — chat against them in one call.
Try RAG →Automatically blends cheaper and stronger models per task — fan out to N models in parallel, pick the best, save 60-80% vs always-frontier.
Open Fusion →Discover and install ready-made agents, tools and prompt packs in a click. Free with every plan.
Browse marketplace →Three first-party apps, all open-source under MIT / Apache-2.0, all pre-configured to talk to the Plugsky API. Pick your platform — install with a single command.
Fork of opencode, rebranded 100% as Plugsky. macOS, Linux, Windows. MIT-licensed.
Built on the open-source Jan project (Apache-2.0). Branded Plugsky with plugsky-pro as the default model.
Built on the open-source Open WebUI project. Docker or pip install. plugsky-fusion pre-listed.
All installers are MIT-licensed. See NOTICE for upstream attribution. Full software reference →
31 first-party and partner models behind one OpenAI-compatible endpoint. Free, paid, vision, reasoning, code, embedding — pick by capability, switch in a line. Hover the ⓘ icon for full details on any model.
High-volume APIs, simple migration, usage dashboards.
Reserved throughput & white-label AI to protect margins.
Private endpoints with full audit controls.
Compliance-ready RAG, SSO/RBAC, audit logs.
Wholesale API + per-client isolated deployments.
Low-latency regional endpoints for high-volume support.
Frontier labs win on raw intelligence. Plugsky wins on deployment control, economics & freedom from lock-in.
| Dimension | Global APIs | Plugsky |
|---|---|---|
| Deployment control | Limited | Your cloud, our cloud, or on-prem |
| Pricing | Per-token, rate-limit surprise bills | Flat-rate packages — unlimited usage within fair-use, no per-token bills |
| Data residency | Plan-dependent | Any region you choose — UAE, KSA, EU, US, on-prem |
| White-label | Limited | Core capability — rebrand models, resell under your own brand |
| GPU capacity | Centrally owned | Hybrid: owned data center + GPU Share Network + partner clouds |
| Models | 1-2 first-party + paid add-ons | 31 first-party + partner models behind one OpenAI-compatible API |
| Agent Cloud | Beta, separate billing | Built-in, OpenAI Assistants-compatible, free on every plan |
| RAG (Knowledge) | Add-on, separate vendor | Built-in, 4096-dim embeddings, private collections, included free |
| Model Fusion | Not available | First-class — fan out, vote, merge, save 60-80% on cost |
| Marketplace | Not available | 100+ ready-made agents, tools, prompt packs |
| Compliance | Plan-dependent | GDPR, PDPL, ISO 27001, SOC 2 Type II, HIPAA-ready, 99.99% SLA |
| Languages | English-first | 50+ languages, Arabic dialects native, RTL UI |
| Integrations | A few SDKs | 58+ — every IDE, CLI, desktop, framework, no-code tool |
| Onboarding | Read the docs, figure it out | $5 trial, change base_url, ship in 5 minutes |
“Ministry cut citizen service response from 48 hours to 4 minutes. 500K+ inquiries/month. Satisfaction jumped 67% → 91%. Zero IT headcount added.”
Head of AI · Government“FinTech switched from AWS to Azure in 24 hours with zero code changes. 43% cost reduction. Expanded to 3 countries instantly.”
COO · FinTech“Hospital deployed HIPAA-compliant, air-gapped AI in 5 days. 50K+ records/month. Physicians save 8 hrs/week. Zero breaches.”
Director of IT · Healthcare“We hit regulatory roadblocks expanding to Saudi on AWS. Plugsky deployed identical LLM infrastructure across AWS (UAE), Azure (Saudi) and on-prem (Egypt) instantly. This level of freedom is unprecedented in enterprise AI.”
Jordan Lee · CFO, BrightwaveStart free, then pick the plan that fits. Launch special: 30% off monthly, 50% off yearly. Need volume, compliance, or sovereign deployment? See enterprise packages →
|
Self-serve
Free
Full access to test
|
Self-serve
New
Hobby
Generous limits
|
Self-serve
Most popular
Starter
Unlimited — fair use
|
Self-serve
Builder
Unlimited — fair use
|
Self-serve
Scale
Unlimited — fair use
|
Enterprise
Starter Enterprise
First enterprise pilot
|
Enterprise
Most popular
Growth Enterprise
Scaling AI internally
|
Enterprise
Enterprise
Regulated industries
|
Enterprise
Sovereign AI Cloud
Government & banks
|
|
|---|---|---|---|---|---|---|---|---|---|
| Price | Free | $8 $5.6/mo−30% launch |
$20 $14/mo−30% launch |
$60 $42/mo−30% launch |
$120 $84/mo−30% launch |
$15K – $25K/year | $50K – $100K/year | $150K – $500K+/year | $500K – $2M+/year |
| Billed annually (−50% off) | — | $40/yr −50% |
$8/yr −50% |
$24/yr −50% |
$48/yr −50% |
— | — | — | — |
| Usage | Unlimited* — all self-serve plans | By deployment — annual contract | |||||||
| Best for | Full access to test | Generous limits | Unlimited — fair use | Unlimited — fair use | Unlimited — fair use | First enterprise pilot | Teams scaling AI internally | Regulated companies | Government & banks |
| Models | plugsky-micro only | up to plugsky-plus | up to plugsky-pro | up to plugsky-max | all + plugsky-frontier | up to plugsky-frontier | all models | all + private models | all + custom fine-tuning |
| Deployments | 1 | 2 | 5 | 20 | Unlimited | 1 | Up to 5 | Unlimited | Unlimited + sovereign |
| Seats | 1 | 2 | 5 | 20 | 25 | 5 | 25 | Unlimited | Unlimited |
| API keys | 2 | 5 | 20 | Unlimited | Unlimited | Unlimited | Unlimited | Unlimited | Unlimited |
| Rate limit* | Low req/min | Higher req/min | 120 req/min | 300 req/min | 1,000 req/min | Custom | Custom | Dedicated | Sovereign dedicated |
| Knowledge (RAG) | Minimal docs | Small docs | 1,000 docs | 20,000 docs | 200,000 docs | 5,000 docs | 50,000 docs | Unlimited | Unlimited + private |
| Support | Community | Priority | Priority + onboarding | Email 48h SLA | Slack channel | Named engineer | 24/7 + dedicated CSM | ||
| SLA | — | — | — | — | 99.9% | 99.9% | 99.95% | 99.99% | Sovereign + custom |
| SSO / SAML | — | — | — | — | — | — | ✓ | ✓ | ✓ |
| Audit logs | — | — | — | — | ✓ | Basic | Full | Full + export | Full + SIEM export |
| On-prem / VPC | — | — | — | — | — | — | Optional | Optional | Air-gapped available |
| DPA & Compliance | — | — | — | — | — | DPA included | DPA + SOC 2 | DPA + SOC 2 + HIPAA | FedRAMP + sovereign |
| * Unlimited usage subject to fair-use rate limits. Enterprise packages are annual contracts, billed yearly. Custom configurations available — contact enterprise@plugsky.com for a tailored quote. | |||||||||
Setup, integration, RAG, fine-tuning, and migration support from our engineering team.
Plugsky operates the AI stack end-to-end — monitoring, optimization, model/runtime operations, and 24/7 support.
| Feature | Free $0 |
Hobby $5.6 / mo −30% |
Starter $14 / mo −30% |
Builder $42 / mo −30% |
Scale $84 / mo −30% |
|---|---|---|---|---|---|
| ⚡ Core platform | |||||
| All 31+ AI models (plugsky-micro → plugsky-frontier + Kimi, Qwen, DeepSeek, Llama, Mistral, NVIDIA Nemotron, Gemma) | ✓ | up to plus | up to pro | up to max | all 31+ + plugsky-frontier |
| OpenAI-compatible API | ✓ | ✓ | ✓ | ✓ | ✓ |
| Streaming & function calling | ✓ | ✓ | ✓ | ✓ | ✓ |
| Vision & multimodal | ✓ | ✓ | ✓ | ✓ | ✓ |
| JSON mode & structured outputs | ✓ | ✓ | ✓ | ✓ | ✓ |
| Embeddings (4096-dim, 100+ languages) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Flux model (multi-model fan-out & routing) | — | ✓ | ✓ | ✓ | ✓ |
| Fair-use rate limit (req/min) | Low | Higher | 120 | 300 | 1,000 |
| API keys | 2 | 5 | 20 | Unlimited | Unlimited |
| Usage dashboard & analytics | ✓ | ✓ | ✓ advanced | ✓ custom | ✓ custom |
| 🎮 Playground Beta (in-browser chat) | |||||
| In-browser chat playground with all 31+ models | ✓ | ✓ | ✓ | ✓ | ✓ |
| In-browser file upload & RAG playground | ✓ | ✓ | ✓ | ✓ | ✓ |
| In-browser Flux fan-out playground | — | ✓ | ✓ | ✓ | ✓ |
| Local file storage (IndexedDB, 50 MB) | 10 MB | ✓ | ✓ | ✓ | ✓ |
| Tabbed chat history (in-browser) | — | ✓ | ✓ | ✓ | ✓ |
| Semantic search over your local files | — | ✓ | ✓ | ✓ | ✓ |
| Share chat tabs as a short URL | ✓ | ✓ | ✓ | ✓ | ✓ |
| 🔌 Plugins (function calling, webhooks, custom tools) | |||||
| Plugin sandbox & built-in plugins (calculator, web search, weather, stock, translator, code-run, SQL, image-gen) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Custom webhooks (HTTP, function) | — | ✓ | ✓ | ✓ | ✓ |
| Bring-your-own Python plugin | — | — | ✓ | ✓ | ✓ |
| Plugin streaming (SSE, JSON events) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Per-plugin auth scopes & rate limits | — | — | ✓ | ✓ | ✓ |
| 🤖 Agents (system prompt + tools bundles) | |||||
| Pin & unpin agents to chat (per-tab agents) | — | ✓ | ✓ | ✓ | ✓ |
| OpenAI Assistants-compatible agents | ✓ | ✓ | ✓ | ✓ | ✓ |
| Function calling & tools | ✓ | ✓ | ✓ | ✓ | ✓ |
| Conversation memory (across turns) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Code interpreter (sandboxed Python execution) | — | ✓ | ✓ | ✓ | ✓ |
| File search & vector retrieval | — | ✓ | ✓ | ✓ | ✓ |
| Custom tools (BYO Python) | — | — | ✓ | ✓ | ✓ |
| Agent Studio (create, edit, pin agents) | ✓ | ✓ | ✓ | ✓ | ✓ |
| GPU Share Network (verified operators contribute idle GPUs) | — | — | — | ✓ | ✓ |
| 🛒 Marketplace (ready-made agents, tools, prompt packs) | |||||
| 100+ ready-made agents, tools, prompt packs | — | ✓ | ✓ | ✓ | ✓ |
| Browse, search, filter the marketplace | ✓ | ✓ | ✓ | ✓ | ✓ |
| One-click deploy to your workspace | — | ✓ | ✓ | ✓ | ✓ |
| Submit your own agents/tools to the marketplace | — | — | — | ✓ | ✓ |
| Marketplace earnings (70% revshare to creators) | — | — | — | — | ✓ |
| 📖 Knowledge (RAG) | |||||
| Document upload & chunking (PDF, DOCX, MD, code) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Private collections | Minimal | Small | 1,000 | 20,000 | 200,000 |
| Multilingual embeddings (100+ languages) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Hybrid search (vector + keyword) | — | ✓ | ✓ | ✓ | ✓ |
| Citations & source highlighting | — | ✓ | ✓ | ✓ | ✓ |
| Re-rank & query rewriting | — | — | — | ✓ | ✓ |
| 🔀 Model Fusion (multi-model fan-out) | |||||
| Sequential chains | — | ✓ | ✓ | ✓ | ✓ |
| Parallel fan-out (up to 8 models) | — | — | ✓ | ✓ | ✓ |
| Cost-saver auto-escalation (micro → pro → max) | — | ✓ | ✓ | ✓ | ✓ |
| Majority vote & merge strategies | — | — | ✓ | ✓ | ✓ |
| Custom routing rules | — | — | — | ✓ | ✓ |
| Flux model preset library | — | ✓ | ✓ | ✓ | ✓ |
| 🌍 Integrations & SDKs | |||||
| 58+ client integrations (IDEs, CLIs, desktop, frameworks) | ✓ | ✓ | ✓ | ✓ | ✓ |
| plugsky CLI & Desktop (Jan, Open WebUI, Chatbox, Msty) | ✓ | ✓ | ✓ | ✓ | ✓ |
| IDEs (VS Code, JetBrains, Cursor, Cline, Continue) | ✓ | ✓ | ✓ | ✓ | ✓ |
| CLIs (plugsky CLI, Aider, Goose, OpenCode, Crush) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Frameworks (LangChain, LlamaIndex, Haystack, CrewAI, PydanticAI) | ✓ | ✓ | ✓ | ✓ | ✓ |
| No-code (Flowise, Langflow, Dify, Activepieces, Retool) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Bring-your-own models (custom fine-tunes) | — | — | — | ✓ | ✓ |
| 👥 Team & security | |||||
| Team seats | 1 | 2 | 5 | 20 | 25 |
| Role-based access control (RBAC) | — | — | ✓ | ✓ | ✓ |
| Audit logs (workspace-level) | — | 30 days | 30 days | unlimited | unlimited |
| SSO / SAML | — | — | — | — | ✓ |
| GDPR / PDPL compliance | ✓ | ✓ | ✓ | ✓ | ✓ |
| SOC 2 Type II / ISO 27001 | — | — | — | — | ✓ |
| Uptime SLA | — | — | 99.9% | 99.95% | 99.99% |
| Priority email support | Community | Priority | Priority + onboarding | ||
| 💳 Billing | |||||
| Flat monthly fee (no per-token) | ✓ | ✓ | ✓ | ✓ | ✓ |
| Annual billing (save 50% with launch promo) | — | ✓ | ✓ | ✓ | ✓ |
| No overage fees | ✓ | ✓ | ✓ | ✓ | ✓ |
| Cancel anytime | ✓ | ✓ | ✓ | ✓ | ✓ |
| Discount coupons & promo codes | ✓ | ✓ | ✓ | ✓ | ✓ |
| Affiliate / referral program | ✓ | ✓ | ✓ | ✓ | ✓ |
Need an enterprise or sovereign-cloud plan? Contact our team → — every enterprise contract is priced to your deployment and security requirements.
Data residency, compliance and the case for running inference where you choose.
EngineeringSwitch your base URL, keep your stack — OpenAI-compatible by design.
ProductBuild agents with tools, memory and private RAG on infrastructure you control.
The 20 questions customers ask most before signing up.
Yes. There are no per-token charges on any plan. Every plan is flat-rate, and usage is gated only by the fair-use rate limit for your tier (60, 120, 300, or 1,000 req/min). You can call a million times on a $20 plan and still pay $20.
Change your base_url to https://api.plugsky.com/v1, set your API key, and use model="plugsky-pro" instead of gpt-4o. Your existing code, SDK, and prompts work unchanged. Most teams migrate in 5 minutes.
31 first-party models (plugsky-micro through plugsky-frontier, plus Kimi, Qwen, DeepSeek, Llama, Mistral, NVIDIA Nemotron, Gemma) and the same OpenAI-compatible endpoint also supports partner models. Free, paid, vision, reasoning, code, embeddings — all behind one API.
Yes. Switch from AWS to Azure to Google Cloud to on-prem without rebuilding your infrastructure. That is the core benefit of Plugsky — no vendor lock-in, ever.
Yes — Enterprise includes on-premise and air-gapped deployment for maximum data control. Runs in your own data center, on your own hardware, with no outbound network calls.
Model Fusion lets you call multiple models in parallel and pick the best result. Set model="plugsky-fusion" and Plugsky runs your default chain (e.g. micro → pro → max) — fast and cheap, with auto-escalation on hard prompts. Saves 60-80% on production traffic.
Agent Cloud is our platform for building AI agents with function-calling, memory, tools, and orchestration. OpenAI Assistants-compatible for one-line migration. Free on every plan.
Private RAG over your own documents. Upload PDFs, code, or knowledge bases; Plugsky chunks them, embeds with 4096-dim vectors, and lets you chat against your data in one API call. No separate vector DB needed.
100+ ready-made agents, tools, and prompt packs. Browse in the dashboard, click "Deploy", and they're live in your workspace. Free with every plan.
58+ — every major IDE (VS Code, JetBrains, Cursor, Cline, Continue), CLI (plugsky CLI, Aider, Goose, OpenCode, Crush), desktop (Jan, Open WebUI, Chatbox, Jan, Msty, LM Studio), framework (LangChain, LlamaIndex, Haystack, CrewAI, PydanticAI, Semantic Kernel), and no-code (Flowise, Langflow, Dify, Activepieces, Pipedream, Retool).
ISO 27001, SOC 2 Type II, GDPR, PDPL-compliant, HIPAA-ready. FedRAMP in progress. Enterprise plans include a signed DPA, BAA, and full audit log export.
Self-serve plans: under 5 minutes. Enterprise private cloud: 1-3 days. On-premise: 3-7 days. White-glove migration with a named engineer: included in Enterprise.
Standard SLA 99.9%, Enhanced 99.95%, Custom up to 99.99% with financial penalties for downtime. The /status page is the public source of truth for live uptime.
Yes. Enterprise customers can negotiate custom terms, payment schedules, SLAs, regional deployment, and dedicated support engineers.
You pick the region — UAE-1, UAE-2, KSA, EU, US, or on-prem. Data never leaves the region you select. Full audit log of every API call is stored in the same region.
Yes. The Enterprise plan supports custom model uploads (fine-tuned or open-weights) served from your own infrastructure, behind the same OpenAI-compatible API. White-label option included.
Flat monthly or annual fee. No per-token charges, no per-request charges, no overage fees. You can upgrade, downgrade, or cancel anytime. Annual plans save 20%.
7 days of full access to all 31 models, Agent Cloud, RAG, Model Fusion, marketplace, and all integrations. No credit card required. At the end, pick a plan or your account goes read-only (we never auto-charge).
Verified operators contribute idle GPUs and earn 70% of the revenue Plugsky routes to them. Plugsky handles auth, metering, and routing. Non-sensitive workloads only — every payload is classified before routing.
Free: documentation, community Discord, GitHub issues. Pro: priority email, 24h response. Enterprise: named engineer, 4h SLA, Slack/Teams/phone. SLA: 99.9% uptime or you get credit.
Free to start. OpenAI-compatible. Live in minutes.