# ZeroGPU Router
Cut your OpenClaw agent's inference costs. Route trivial AI tasks — summarize, classify, redact PII, extract JSON, generate follow-ups, short chat — to small/nano models instead of burning frontier-model tokens.
## What it does
Your OpenClaw agent keeps doing the heavy reasoning. Routine tasks get offloaded to ZeroGPU's small models — typically 100–1000× cheaper per call.
- 11 task-specific MCP tools (`zerogpu_summarize`, `zerogpu_classify_iab`, `zerogpu_redact_pii`, …)
- A routing skill that teaches your agent when to call each tool
- Per-call savings logged with model, latency, and a real `savings_usd` figure
- Hosted MCP server at `https://mcp.zerogpu.ai/mcp` — no infra to run
## Install

```
openclaw plugins install clawhub:zerogpu-router
```
## Set up (2 steps)

### 1. Get your API key
Sign in at platform.zerogpu.ai and create a project to grab an API key + project ID.
### 2. Register the hosted MCP server

In your OpenClaw shell:

```
openclaw mcp set zerogpu '{
  "url": "https://mcp.zerogpu.ai/mcp",
  "transport": "streamable-http",
  "headers": {
    "x-api-key": "<your-api-key>",
    "x-project-id": "<your-project-id>"
  }
}'
```
Verify:

```
openclaw mcp show zerogpu --json
```
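If you want to sanity-check the hosted endpoint outside OpenClaw, you can speak the streamable-HTTP MCP transport directly: it is JSON-RPC 2.0 over POST. A minimal sketch follows — the URL and `x-api-key`/`x-project-id` header names come from the config above, while the `Accept` header and `tools/list` method come from the MCP spec (depending on the server, you may need to complete an `initialize` handshake first):

```python
import json

MCP_URL = "https://mcp.zerogpu.ai/mcp"

def build_rpc(method, params=None, req_id=1):
    """Build a JSON-RPC 2.0 request body for a streamable-HTTP MCP server."""
    body = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        body["params"] = params
    return json.dumps(body)

def build_headers(api_key, project_id):
    # Auth header names are taken from the `openclaw mcp set` config above.
    return {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "x-api-key": api_key,
        "x-project-id": project_id,
    }

# Example: ask the server which tools it exposes.
payload = build_rpc("tools/list")
# To actually send it (requires network access and valid credentials):
#   import urllib.request
#   req = urllib.request.Request(MCP_URL, payload.encode(),
#                                build_headers("<your-api-key>", "<your-project-id>"))
#   print(urllib.request.urlopen(req).read())
print(payload)
```

A `tools/list` response should enumerate the 11 tools described below, including each tool's input schema.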
That's it. Your agent will now route trivial tasks through ZeroGPU automatically.
## Try it

Ask your agent:

```
summarize this paragraph: Renewable energy adoption is accelerating globally, driven by falling solar and wind costs.
```
The agent calls `zerogpu_summarize` (running on `t5-small`) instead of the host model and replies with the summary plus a savings line.
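On the wire, that tool invocation is a standard MCP `tools/call` request. A sketch of the request body the agent sends — the method and `params` shape come from the MCP spec, but the argument name (`text`) is an assumption for illustration; check the tool's input schema via `tools/list`:

```python
import json

def build_tool_call(tool, arguments, req_id=2):
    """JSON-RPC 2.0 body for an MCP tools/call request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical argument shape -- the real schema may differ.
payload = build_tool_call("zerogpu_summarize", {
    "text": "Renewable energy adoption is accelerating globally, "
            "driven by falling solar and wind costs.",
})
print(payload)
```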
## The 11 tools you get
| Tool | Workload | Backing model |
|---|---|---|
| `zerogpu_classify_iab` | IAB topic classification | `zlm-v1-iab-classify-edge` |
| `zerogpu_summarize` | TL;DRs, abstracts, meeting summaries | `t5-small` |
| `zerogpu_classify_zero_shot` | Classify against a flat label list | `deberta-v3-small` |
| `zerogpu_extract_entities` | People, places, companies, dates, custom entities | `gliner2-base-v1` |
| `zerogpu_extract_json` | Pull structured fields into grouped JSON | `gliner2-base-v1` |
| `zerogpu_classify_structured` | Multi-axis schema classification | `gliner2-base-v1` |
| `zerogpu_redact_pii` | Mask emails, phones, names, addresses, other PII | `gliner-multi-pii-v1` |
| `zerogpu_extract_pii` | Extract PII grouped by category | `gliner-multi-pii-v1` |
| `zerogpu_generate_followups` | Generate follow-up questions from a passage | `zlm-v1-followup-questions-edge` |
| `zerogpu_chat` | Short small-model chat replies | `LFM2.5-1.2B-Instruct` / `-Thinking` |
| `zerogpu_health` | Verify the ZeroGPU backend | — |
Every tool returns `{ <task fields>, model, usage, savings }`.
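Because every result carries a `savings` field, you can roll up cost numbers yourself. A small sketch, assuming the per-call `savings_usd` figure mentioned above lives inside `savings` (the surrounding field values here are made up for illustration):

```python
def total_savings(results):
    """Sum savings_usd across a list of ZeroGPU tool results.

    Assumes each result follows the shape described above:
    { <task fields>, "model": ..., "usage": ..., "savings": {...} },
    with a "savings_usd" figure inside "savings".
    """
    return sum(r.get("savings", {}).get("savings_usd", 0.0) for r in results)

# Illustrative results -- field values are invented, not real API output.
results = [
    {"summary": "...", "model": "t5-small",
     "usage": {"total_tokens": 120}, "savings": {"savings_usd": 0.0031}},
    {"labels": ["news"], "model": "zlm-v1-iab-classify-edge",
     "usage": {"total_tokens": 40}, "savings": {"savings_usd": 0.0012}},
]
print(f"total saved: ${total_savings(results):.4f}")
```

For aggregate numbers across agents and time ranges, the hosted dashboard below does this for you.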
## Watch your savings
Live dashboard at platform.zerogpu.ai — token usage, latency, per-tool savings, broken down by agent and time range.
## Links
- Website: https://zerogpu.ai
- Dashboard: https://platform.zerogpu.ai
- Source: https://github.com/zerogpu/zerogpu-router
- Full setup guide: https://github.com/zerogpu/zerogpu-router/tree/main/agents/openclaw
- Issues / contact: hello@zerogpu.ai
## License
MIT — see LICENSE.