---
title: agent-harness
emoji: 🛡️
colorFrom: blue
colorTo: red
sdk: docker
pinned: false
license: mit
short_description: FORGE Safety - Injection + Circuit Breaker
tags:
  - docker
  - forge
  - agent
  - safety
  - injection
  - rlhf
  - mcp
---

# agent-harness: FORGE Safety Layer

Middleware sitting between agents and the outside world.
Every tool input is scanned before execution.
Every tool result is sanitised before the LLM sees it.
Suspicious agents get circuit-broken and christof gets alerted.

## Safety Layers

| Layer | What it does |
|-------|-------------|
| 1: Pattern scanner | 15 injection regex patterns (instruction override, jailbreak, DAN, end-of-prompt markers) |
| 2: Token scanner | LLM special tokens: `<\|im_start\|>`, `[INST]`, `<<SYS>>`, Phi tokens |
| 3: Bash danger | Destructive shell commands in vault_exec payloads: rm -rf, fork bomb, dd wipe, crontab exfil |
| 4: Tool policy | Per-agent tool allowlist/blocklist loaded from agent-forge |
| 5: Rate limiter | Per-agent per-tool N calls/60s (in-memory, fast) |
| 6: Circuit breaker | 3 high/critical flags in 60s → pause agent → Telegram alert to christof |
| 7: Output sanitiser | Strip injected instructions from tool results before LLM re-ingests them |

## API

```
POST /api/scan/input            Scan tool args before execution
POST /api/scan/output           Sanitise tool result before LLM sees it
POST /api/validate              Pre-flight: circuit + policy + rate (no scan)
GET  /api/flags                 Recent flagged events
GET  /api/circuit/{agent}       Circuit breaker state for agent
POST /api/circuit/{agent}/reset Reset paused agent after review
GET  /api/circuit               All circuit states
GET  /api/rates                 Rate counter state (60s window)
GET  /api/patterns              All active detection patterns
POST /api/test/scan             Test scan (no logging, no RLHF)
GET  /api/stats                 Aggregated stats
GET  /api/health                Health + tripped agents
GET  /mcp/sse                   MCP SSE stream
POST /mcp                       MCP JSON-RPC
```

## HF Secrets

| Secret | Value |
|--------|-------|
| `TRACE_URL` | `https://chris4k-agent-trace.hf.space` |
| `LEARN_URL` | `https://chris4k-agent-learn.hf.space` |
| `RELAY_URL` | `https://chris4k-agent-relay.hf.space` |
| `FORGE_URL` | `https://chris4k-agent-forge.hf.space` |
| `HARNESS_KEY` | random string |

## RLHF Penalties

| Severity | Penalty |
|----------|---------|
| low | -0.5 |
| medium | -1.0 |
| high | -2.0 |
| critical | -4.0 |

## MCP Config

```json
{
  "mcpServers": {
    "harness": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://chris4k-agent-harness.hf.space/mcp/sse"]
    }
  }
}
```
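
## Circuit Breaker Sketch

The circuit breaker rule (3 high/critical flags in 60s pauses the agent) and the RLHF penalty table can be illustrated with a small in-memory sketch. This is not the harness's actual code: the class `CircuitBreaker`, its method names, and the choice to treat a flag exactly 60 seconds old as expired are assumptions made for illustration. Only the threshold, the window length, and the penalty values come from this README.

```python
from collections import defaultdict, deque

# Values taken from the tables in this README; all names are hypothetical.
RLHF_PENALTY = {"low": -0.5, "medium": -1.0, "high": -2.0, "critical": -4.0}
TRIP_SEVERITIES = {"high", "critical"}
TRIP_THRESHOLD = 3       # serious flags needed to trip the breaker
WINDOW_SECONDS = 60.0    # sliding window length


class CircuitBreaker:
    """Pauses an agent after too many high/critical flags inside the window."""

    def __init__(self):
        self._flags = defaultdict(deque)  # agent -> timestamps of serious flags
        self._paused = set()

    def record_flag(self, agent, severity, now):
        """Record one flag at time `now` (seconds); return (penalty, tripped_now)."""
        penalty = RLHF_PENALTY[severity]
        if severity not in TRIP_SEVERITIES:
            return penalty, False
        window = self._flags[agent]
        window.append(now)
        # Drop timestamps that have aged out of the window (assumed exclusive).
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        tripped = len(window) >= TRIP_THRESHOLD and agent not in self._paused
        if tripped:
            # A real service would also send the Telegram alert at this point.
            self._paused.add(agent)
        return penalty, tripped

    def is_paused(self, agent):
        return agent in self._paused

    def reset(self, agent):
        """Counterpart of POST /api/circuit/{agent}/reset: clear state after review."""
        self._paused.discard(agent)
        self._flags.pop(agent, None)
```

Passing `now` explicitly instead of reading the clock inside the class keeps the sketch deterministic. A caller would typically supply `time.monotonic()`.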