FAQ
Answers to the most common questions about data storage, security, latency, billing, and how Cognisafe works under the hood. Can't find what you need? hello@cognisafe.uk
On the managed cloud plan, data is stored in our PostgreSQL (TimescaleDB) database hosted in the EU (Ireland). On self-hosted plans, data never leaves your infrastructure — you own the database, the proxy, and the API. Nothing is sent to Cognisafe servers.
Retention depends on your plan: Free (7 days), Starter (30 days), Professional (90 days), Business (1 year). On Enterprise plans you can configure a custom retention window. Automatic purge jobs run nightly and are cryptographically logged. You can also trigger a manual purge from the dashboard at any time.
Yes — capturing the full request and response body is how Cognisafe can score safety, detect PII, and give you a complete audit trail. On self-hosted deployments, this data never leaves your network. On managed cloud, all data is encrypted at rest (AES-256-GCM) and in transit (TLS 1.2+). You can configure field-level masking to redact sensitive fields before they reach the database.
Yes. We are a UK-registered company operating under UK GDPR. Data minimisation, right to erasure, and data portability are all supported. EU customers on managed cloud have data stored in EU-region infrastructure only. Self-hosted deployments are wholly within your own jurisdiction. We can provide a Data Processing Agreement (DPA) on request.
No. The proxy is designed to fail open by default — if the Cognisafe API is unreachable when logging a request, the proxy still forwards the call to the upstream LLM and returns the response to your app. Safety scoring jobs are queued in Redis and processed when the worker comes back online. You can configure fail-closed (block mode) on Business and Enterprise plans if your use case requires it.
Safety scoring is fully asynchronous and never sits on the hot path — it queues to Redis and is processed by background workers. The Go reverse proxy itself is designed for minimal overhead; in our internal benchmarks it stays well under typical network jitter (a fraction of a millisecond in steady state). We're working on publishing reproducible benchmarks alongside the proxy source.
On self-hosted deployments you run as many proxy replicas as you need behind a load balancer. On managed cloud, the proxy runs in a multi-region active-active configuration. The proxy holds no state — all state is in PostgreSQL and Redis, which are both operated with high availability configurations.
The proxy core and Python SDK are open source (Apache 2.0). The safety scoring workers, dashboard, and API are source-available — you can read and audit the code but commercial use requires a license. Enterprise customers can request a full source code review under NDA.
Under 5 minutes for the managed cloud plan. Install the SDK (pip install cognisafe or npm install cognisafe), call configure(), call your provider patch (patch_openai(), patch_anthropic(), etc.), and every LLM call is now captured. No infrastructure changes required.
No. Cognisafe intercepts at the SDK level — it wraps your existing OpenAI, Anthropic, Mistral, or Cohere client transparently. Your agent framework (LangGraph, CrewAI, AutoGen, Semantic Kernel, etc.) continues to work exactly as before. The only change is three lines of SDK setup code.
Yes. Any OpenAI-compatible API is supported. Set UPSTREAM_URL to your vLLM, Ollama, NVIDIA NIM, HF TGI, or LM Studio server and Cognisafe proxies all traffic through. No model-specific configuration required.
Yes. Point your LLM client's base_url at the Cognisafe proxy (https://proxy.cognisafe.uk/v1) and include your Cognisafe API key in the X-Cognisafe-Key header. The SDK is optional — it adds agent-level attribution and metadata tagging, but all traffic through the proxy is captured and scored regardless.
One request = one LLM API call proxied through Cognisafe. If your agent makes 5 tool calls that each involve an LLM call, that counts as 5 requests. Streaming responses (SSE) count as one request regardless of how many tokens are streamed.
The proxy returns HTTP 429 (Too Many Requests) on calls that exceed your plan limit. Existing in-flight requests complete normally. You can upgrade your plan instantly from the dashboard. On Enterprise plans there is no hard limit — usage is metered and billed at an agreed overage rate.
Yes. All paid plans include the option to self-host. Your dashboard and billing are managed via cognisafe.uk; the proxy, API, and worker run inside your own infrastructure. Enterprise plans include a dedicated onboarding session and infrastructure review.
Yes — annual billing saves 20% compared to monthly. Switch to annual billing any time from your account settings. The discount is applied immediately on the next billing cycle.
After the proxy forwards an LLM call and returns the response to your app, it asynchronously queues a scoring job. The safety worker picks up the job from Redis and runs PyRIT-based scorers (content safety, PII detection, jailbreak detection, and the wider OWASP LLM Top 10 set) against the request/response pair. Results are written to the database and appear in your dashboard within seconds. Scoring never sits on the hot path of your application.
Yes, on Professional and above. You can define LLM-as-judge scorers (write a prompt template and a severity threshold), regex scorers (pattern matching on request/response text), and keyword list scorers (flag on specific terms). Custom scorers run alongside built-in PyRIT scorers asynchronously.
Yes — set the proxy to block mode. In block mode, the proxy evaluates synchronous scorers (keyword and regex) before forwarding the request. If a scorer fires above a configured severity threshold, the proxy returns a 403 and the call never reaches the LLM. LLM-as-judge scorers always run asynchronously due to their latency — they cannot block in real time.
Talk to us directly — we typically respond within a few hours.