openclaw-language-boundary
OCAL-style language-boundary safety plugin for OpenClaw.
Status
Stable MVP. Default hardening phase is observe, with progressive recommendations toward guided, strict, and force after clean audit windows.
What it does
- Builds a conservative Action IR before each tool call
- Classifies effect / target / risk without relying on the model
- Applies default policy rules
- Tracks tool failure state
- Writes redacted audit JSONL
- Adds observe-mode checks for outbound messages, installs, subagent spawning, and cron lifecycle
- Supports progressive hardening phases: observe → guided → strict → force
- Can block or request approval when the active hardening phase maps to enforce mode
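To make the Action IR and redacted audit bullets above more concrete, here is a minimal sketch of what one audit record might look like. The `ActionIR` shape, its field names, and the `SENSITIVE_KEY` pattern are assumptions made for illustration only; the plugin's actual schema and redaction rules may differ.

```typescript
// Hypothetical Action IR shape; the real plugin's fields may differ.
type Effect = "read" | "write" | "exec" | "unknown";
type Risk = "low" | "medium" | "high";

interface ActionIR {
  tool: string;
  effect: Effect;
  target: string;
  risk: Risk;
  params: Record<string, unknown>;
}

// Assumed heuristic: parameter names that look like secrets are sensitive.
const SENSITIVE_KEY = /(token|secret|password|api[_-]?key|credential)/i;

// Replace the values of sensitive-looking keys; leave everything else as-is.
function redact(params: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(params)) {
    out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : params[key];
  }
  return out;
}

// Serialize one record as a single JSON line (JSONL).
function toAuditLine(ir: ActionIR): string {
  return JSON.stringify({ ...ir, params: redact(ir.params) });
}

const sampleLine = toAuditLine({
  tool: "exec",
  effect: "exec",
  target: "/path/to/workspace",
  risk: "high",
  params: { command: "ls", apiKey: "abc123" },
});
```

In this sketch the classification fields are filled in by the caller; the point is only that the record is built and redacted before anything is written to the log.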
First-use config
Generate a starter config:
npm run init-config -- --mode=observe
npm run init-config -- --mode=guided
npm run init-config -- --mode=strict
npm run init-config -- --mode=force
Development and release checks:
npm run typecheck
npm test
npm run build
npm run release:check
npm run release:check runs the read-only internal gate: typecheck, tests, build, required docs, example config JSON validation, generated init-config validation, local/private value leak scan, and dist entry verification.
partial-enforce is still accepted as an alias for guided.
The generated config uses placeholders only, e.g. /path/to/workspace; replace them with your own workspace and production data roots.
Example configs are available in examples/:
- minimal-config.json
- observe-mode.json
- partial-enforce.json (guided phase)
- force-mode.json
Progressive hardening
The plugin separates policy mode, hardening phase, and runtime state:
- policy_mode (observe or enforce): whether blocking decisions are active.
- hardening_phase (observe, guided, strict, or force): product-facing safety maturity phase.
- runtime_state (normal, degraded, failure_loop, etc.): current substrate/tool health.
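The relationship between a configured mode string, the hardening phase, and the policy mode can be sketched as follows. The function names are hypothetical, and the mapping of guided, strict, and force to enforce is an assumption inferred from the descriptions in this README rather than a statement of the plugin's actual logic.

```typescript
type HardeningPhase = "observe" | "guided" | "strict" | "force";
type PolicyMode = "observe" | "enforce";

const PHASES: readonly HardeningPhase[] = ["observe", "guided", "strict", "force"];

// "partial-enforce" is accepted as an alias for "guided".
function resolvePhase(mode: string): HardeningPhase {
  const normalized = mode === "partial-enforce" ? "guided" : mode;
  for (const phase of PHASES) {
    if (phase === normalized) return phase;
  }
  // Assumption: reject unknown values rather than guess a phase.
  throw new Error(`Unknown mode: ${mode}`);
}

// Assumption: only the observe phase maps to observe policy mode;
// every later phase turns on blocking decisions.
function policyModeFor(phase: HardeningPhase): PolicyMode {
  return phase === "observe" ? "observe" : "enforce";
}
```

For example, `policyModeFor(resolvePhase("partial-enforce"))` yields `"enforce"` under these assumptions.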
Recommended rollout:
- observe for 7-30 days: record Action IR and audit logs without broad blocking.
- guided after a clean audit window: enforce the vetted high-risk rule set.
- strict after guided stays clean: increase approval coverage for sensitive environments.
- force only by explicit user/admin confirmation: production/compliance mode.
By default autoAdvance is "off". If enabled as "guided-only", the plugin may advance only from clean observe to guided; it never auto-enters strict or force.
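The auto-advance rule can be illustrated with a small sketch. `nextPhase` and the `auditWindowClean` flag are hypothetical names; how the plugin actually decides that an audit window is clean is not shown here.

```typescript
type Phase = "observe" | "guided" | "strict" | "force";
type AutoAdvance = "off" | "guided-only";

// Returns the phase to use next. `auditWindowClean` stands in for whatever
// "clean audit window" check the plugin performs (an assumption).
function nextPhase(
  current: Phase,
  autoAdvance: AutoAdvance,
  auditWindowClean: boolean,
): Phase {
  if (autoAdvance === "guided-only" && current === "observe" && auditWindowClean) {
    return "guided";
  }
  // strict and force are never entered automatically.
  return current;
}
```

With `autoAdvance` set to `"off"`, the function always returns the current phase, so any phase change has to come from an explicit config update.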
Safety defaults
- Unknown tools require approval
- Exec requires approval
- External write requires approval
- Config/credential targets require strong approval
- Hook errors fail closed for high-risk actions
- Audit logs redact sensitive-looking values
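One way to read the defaults above is as an ordered rule list. The sketch below is an assumption-heavy illustration: `PolicyInput`, `Decision`, and the rule ordering are invented for this example, and the fail-open behavior for low-risk hook errors is a guess, since this README only specifies the high-risk case.

```typescript
type Decision = "allow" | "require_approval" | "require_strong_approval" | "block";

// Hypothetical input; the real plugin derives this from its Action IR.
interface PolicyInput {
  knownTool: boolean;
  effect: "read" | "write" | "exec";
  external: boolean;
  targetKind: "workspace" | "config" | "credential" | "other";
}

function defaultDecision(input: PolicyInput): Decision {
  // Config/credential targets require strong approval.
  if (input.targetKind === "config" || input.targetKind === "credential") {
    return "require_strong_approval";
  }
  // Unknown tools, exec, and external writes require approval.
  if (!input.knownTool) return "require_approval";
  if (input.effect === "exec") return "require_approval";
  if (input.effect === "write" && input.external) return "require_approval";
  return "allow";
}

// Hook errors fail closed for high-risk actions.
// Assumption: low-risk actions are allowed through when the hook throws.
function decideWithHook(risk: "low" | "high", hook: () => Decision): Decision {
  try {
    return hook();
  } catch (err) {
    return risk === "high" ? "block" : "allow";
  }
}
```

Checking the strongest requirement first means a config or credential target still gets strong approval even when the tool is unknown.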
Calibration notes
After initial observe-mode calibration, these OpenClaw core tools are classified as low-risk/read-like when used safely:
- process with list / poll / log
- update_plan
- sessions_yield
- subagents with list
Riskier variants stay high risk, e.g. process.kill, process.write, subagents.steer, and exec.
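The calibration result can be pictured as an allowlist keyed by tool and action. This is a sketch only: `LOW_RISK` and `classifyTool` are hypothetical names, and treating everything outside the allowlist (including a listed tool called without an action) as high risk is an assumption in keeping with the conservative defaults.

```typescript
type ToolRisk = "low" | "high";

// Low-risk entries from the calibration notes.
// A null value means the tool is low-risk regardless of action.
const LOW_RISK = new Map<string, readonly string[] | null>([
  ["process", ["list", "poll", "log"]],
  ["update_plan", null],
  ["sessions_yield", null],
  ["subagents", ["list"]],
]);

function classifyTool(tool: string, action?: string): ToolRisk {
  const allowed = LOW_RISK.get(tool);
  // Assumption: anything not explicitly calibrated stays high risk.
  if (allowed === undefined) return "high";
  if (allowed === null) return "low";
  return action !== undefined && allowed.indexOf(action) !== -1 ? "low" : "high";
}
```

Under this sketch, `classifyTool("process", "poll")` is low risk, while `classifyTool("process", "kill")`, `classifyTool("subagents", "steer")`, and `classifyTool("exec")` all remain high risk.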
Next steps
- Keep mode: "observe" as the project default.
- Use partial enforce only after a clean audit window.
- Keep calibration focused on false positives in runtime config, shell diagnostics, workspace artifact cleanup, and external writes.
Mainline closeout
- The runtime reliability mainline is stable and usable.
- The language-boundary mainline is now in calibration / maintenance mode, not active incident response.
- Continue only if future audit logs show new false positives or a new boundary gap.
Reusability requirements
This plugin is intended to be released for other OpenClaw users, not just this machine.
- No hard-coded user paths such as /Users/<name> or machine-specific workspace paths.
- No dependency on a specific agent/session/machine name.
- No dependency on a specific channel, provider account, local credential, cron job, or local extension layout.
- Host-specific paths or providers may appear only as examples or local validation notes, never runtime defaults.
- Path classification must stay config-driven with safe defaults.
- Audit/state paths must be configurable or derived from OpenClaw/home directory.
- Default behavior must remain conservative: observe first, partial enforce only after calibration.