ZeroClaw is a lightweight, fast personal AI agent designed for autonomous background operation. Because it runs unattended, you need monitoring even more than with interactive agents — Cognisafe gives you real-time alerting when ZeroClaw is compromised, before any damage is done.
Why this matters
ZeroClaw is built to run in the background, autonomously handling tasks without you watching every move. That autonomy is exactly what makes security non-negotiable — a compromised ZeroClaw agent can act on injected instructions for hours before you notice. Cognisafe catches threats in real time and alerts you the moment something goes wrong.
Silent prompt injection via data sources
ZeroClaw reads emails, RSS feeds, or documents autonomously. Malicious content embedded in those sources injects instructions that redirect your agent without any human in the loop to notice.
PII leakage in automated outputs
ZeroClaw processes personal data from your inbox or files and includes it verbatim in summaries, reports, or API calls — without you being present to catch the leak.
System prompt extraction
An injected instruction tricks ZeroClaw into revealing the system prompt that defines its permissions and behaviour — giving an attacker a map of exactly what your agent can do.
Jailbreak while unattended
A jailbreak succeeds while ZeroClaw is running overnight. Without monitoring, you only discover it the next morning when the damage — sent emails, modified files — is already done.
Quickstart
Cognisafe wraps ZeroClaw's LLM provider client so every call is captured, scored, and — if you have alerts configured — notifies you in real time.
pip install cognisafe
Add this to your ZeroClaw startup or configuration file, before the agent initialises:
import cognisafe
cognisafe.configure(
api_key="csk_...", # from cognisafe.uk/dashboard/settings
project_id="zeroclaw-home", # name this agent deployment
)
cognisafe.patch_openai() # wraps the OpenAI client ZeroClaw uses
# or: cognisafe.patch_anthropic() if using the Anthropic backendSince ZeroClaw runs unattended, configure email or Slack alerts so you're notified immediately when a threat is detected — don't wait until you check the dashboard:
# In your Cognisafe dashboard → Settings → Alerts: # ✓ Enable email alerts → your@email.com # ✓ Enable Slack alerts → your webhook URL # ✓ Alert on: all scorers (content safety, PII, jailbreak, system prompt)
Alerting
ZeroClaw runs in the background while you're away. Cognisafe sends you an alert within seconds of a threat being detected — before the agent has a chance to act on the injected instruction. Configure separate API keys per task so you know exactly which ZeroClaw job was targeted.
# Different keys for different ZeroClaw jobs:
# csk_email_key → "Email Monitor" job
# csk_files_key → "File Organiser" job
# csk_feeds_key → "RSS Summariser" job
# Each gets its own threat feed, usage counter,
# and alert channel in your Cognisafe dashboard.
cognisafe.configure(
api_key="csk_email_key",
project_id="zeroclaw",
agent_name="email-monitor", # shown in the Agent column
)
cognisafe.patch_openai()Get notified within seconds of a threat — critical for unattended agents where every minute of compromised operation matters.
Tag each ZeroClaw job with a unique key and agent name — see whether the attack came via email, files, or feeds.
Every LLM call captured with the full prompt and response — so you can see exactly what ZeroClaw said and did after an injection.
Jailbreak (LLM01), PII (LLM02), content safety (LLM05), and system prompt extraction (LLM07) scored on every call.
Also using OpenClaw?
ZeroClaw and OpenClaw share a similar architecture. The same Cognisafe integration works for both — you can monitor multiple agents from the same dashboard by giving each its own API key and project ID. See the OpenClaw integration guide →
Free tier, no credit card. 1,000 requests per month — enough to monitor an active autonomous agent running throughout the day.