OrcaRouter Releases AI Threat Report 2026 and Makes Its Security Controls Free Amid Rise in Prompt-Injection Attacks

CONTINUUM AI PTE. LTD.
OrcaRouter has published The AI Threat Report 2026 and made its agent Firewall and input/output Guardrails free for every user — same API key, one switch, no code changes. The report argues that AI systems have become the attack surface, with prompt injection now the #1 risk to LLM applications and one that cannot be patched. OrcaRouter's answer is architectural: gateway-level controls that bind to credentials, so any team can enforce them without rewriting their agents.

Prompt injection ranks as the top risk to LLM applications and, the company says, cannot be fully patched. OrcaRouter Security Research has made its agent Firewall and input/output Guardrails available at no cost to all users, attached to an existing API key.

SINGAPORE — June 18, 2026 — OrcaRouter, the OpenAI-compatible LLM gateway, today published The AI Threat Report 2026 and made two of its security controls available at no cost to all users: the agent Firewall and input/output Guardrails. According to the company, the controls can be attached to an API key already in use, without a separate integration or purchase.

The AI Threat Report 2026 — 14 key risks across four threat categories.

The report states that AI systems have themselves become an attack surface, and that most organizations cannot see the attacks directed against them. Telemetry from production LLM applications shows the average successful attack completing in 42 seconds, with 90% of them leaking sensitive data (Pillar Security). Prompt-injection attacks rose 340% year over year (OWASP, Q1 2026). And 13% of organizations have already been breached through an AI model or application — 97% of those lacked basic AI access controls (IBM, 2025).

By OrcaRouter Security Research · June 2026

In June 2025, attackers exfiltrated corporate data from Microsoft 365 Copilot. The victim did nothing wrong — no link clicked, no attachment opened, no prompt approved. They received an email. Their AI assistant later read it, and obeyed the instructions hidden inside. Disclosed by Aim Security as EchoLeak (CVE-2025-32711), the attack gathered sensitive context from mail, files, and chat history and smuggled it out through an auto-loading image URL. Zero clicks.

According to the report, EchoLeak was not an isolated case but an early example of a broader pattern.

A year of escalating, increasingly automated incidents

The report's 2026 incident record spans cases that challenged longstanding assumptions in enterprise security:

•     Chat & Ask AI left roughly 300 million private chat messages from more than 25 million users exposed through a Firebase misconfiguration (404 Media; Malwarebytes, Jan 2026).

•     Sears Home Services exposed 3.7 million AI chat transcripts and call recordings — names, addresses, emails — spanning 2024–2026 (ExpressVPN; Cybernews, Mar 2026).

•     An attacker chained a single CVE (CVE-2026-39987 in the marimo notebook tool) into a live LLM agent that extracted cloud credentials, pulled an SSH key from AWS Secrets Manager, and exfiltrated an entire internal PostgreSQL database in under two minutes (Sysdig; The Hacker News, May 2026).

•     Microsoft and Salesforce both shipped patches for AI-agent data-leak flaws. In CVE-2026-21520, a poisoned SharePoint field steered Copilot into emailing customer data to an attacker — and the data left even after a safety mechanism flagged the attack (Dark Reading).

•     Denial-of-wallet — a hijacked or runaway agent that simply spends — has been observed burning $46,000 a day (Sysdig, “LLMjacking”). No data is stolen. There is only a bill.

Three years of public incidents, research, and regulation — 2023 to 2026.

Why traditional security tools miss these attacks

Traditional security assumes a boundary: trusted inside, untrusted outside, controls at the seam. Language models dissolve that boundary, because a model's input is also its programming. Every email, document, web page, and tool result an agent reads can carry instructions it will follow. There is no reliable, general mechanism by which today's models separate content to process from commands to obey.

That is why prompt injection holds the #1 position in the OWASP Top 10 for LLM Applications — and why, the company argues, it will not be “patched” the way a buffer overflow is. It is described as a structural property of the medium: a web application firewall inspects the request and sees a perfectly valid API call, because the attack is in the words. Per-request checks pass every step of a chained attack, because the damage lives in the sequence — volume, repetition, and spend against time — not in any one call.
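
The point about sequences can be illustrated with a minimal sketch. It assumes a hypothetical per-key monitor (the `WindowMonitor` class, its field names, and its limits are invented for illustration and are not OrcaRouter's implementation) that looks at call volume and spend over a sliding time window rather than at any single request:

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WindowMonitor:
    """Tracks calls for one key over a sliding window (all limits are hypothetical)."""
    window_seconds: float = 60.0
    max_calls: int = 20
    max_spend_usd: float = 5.0
    events: deque = field(default_factory=deque)  # (timestamp, cost_usd) pairs

    def record(self, timestamp: float, cost_usd: float) -> str:
        """Record one call and return 'ok' or the reason the sequence is flagged."""
        self.events.append((timestamp, cost_usd))
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= timestamp - self.window_seconds:
            self.events.popleft()
        if len(self.events) > self.max_calls:
            return "flag: call volume"
        if sum(cost for _, cost in self.events) > self.max_spend_usd:
            return "flag: spend"
        return "ok"


monitor = WindowMonitor(window_seconds=60.0, max_calls=3, max_spend_usd=1.0)
# Each call is small and individually valid; only the sequence is abnormal.
results = [monitor.record(t, 0.10) for t in (0.0, 1.0, 2.0, 3.0)]
print(results)
```

In this sketch the first three calls pass and only the fourth is flagged, even though all four requests are identical. A check that inspects one request at a time has nothing to distinguish them.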

The report concludes that AI security is not a model-training problem. It is an architecture problem — and it is solvable with the same discipline enterprises already apply to every other production system.

The 14 key risks across four threat categories: content plane, action plane, economic, and trust & supply chain.

A gateway-level approach: two planes, six layers

Every attack above succeeds against unscoped authority and fails against scoped, policed, audited authority. Containing them requires controlling two distinct planes:

•     The content plane — what the model reads and writes. This is the job of Guardrails.

•     The action plane — what the agent does: the tools it calls, the networks it reaches, the money it spends. This is the job of the Firewall.

The report notes that the most damaging incidents cross both planes: an injection arrives as content, then executes as an action. OrcaRouter's design places six independent, auditable layers between a request and its execution:

•     Scoped identity — every agent calls through its own key carrying allowed models, an IP allow-list, a hard spend cap, and an expiry. An out-of-scope request dies before any content is read.

•     Input guardrails — injection and jailbreak rules, PII detection and masking, secret blocking, and a semantic LLM-judge that catches what regex cannot.

•     The action firewall — every tool call, MCP dispatch, and network egress is judged against ordered, default-deny policy with six verdicts: allow, audit, deny, sanitize, pending-approval, and cap-cost. A hijacked agent cannot reach a tool, a host, or a spend limit that was not explicitly listed.

•     Output guardrails — the reply is screened on the way out for unsafe output, PII, and secrets, with grounding checks. This is the layer that catches EchoLeak's exfiltration URL before it leaves.

•     Anomaly detection — behavioral baselines flag what static rules can't predict: the same call hammered in a tight window, spend spiking against a learned baseline, a tool-to-tool transition the workspace has never made.

•     Signed audit — every match, verdict, approval, and policy change lands in a tamper-evident trail, correlated by agent run and session, exportable as evidence.

The decisive property is placement. These controls live at the gateway, in the request path, so they bind to credentials rather than application code — enforceable across every team and framework, with no agent rewrites.
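
As a rough illustration of the ordered, default-deny evaluation described for the action firewall, the sketch below walks a rule list and returns the first matching verdict. The six verdict names come from the report; the `Rule` fields, the glob matching, and the example tool and host names are assumptions made for this example, not OrcaRouter's actual policy format:

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatch


class Verdict(Enum):
    ALLOW = "allow"
    AUDIT = "audit"
    DENY = "deny"
    SANITIZE = "sanitize"
    PENDING_APPROVAL = "pending-approval"
    CAP_COST = "cap-cost"


@dataclass(frozen=True)
class Rule:
    tool_pattern: str  # glob matched against the tool name (hypothetical field)
    host_pattern: str  # glob matched against the egress host (hypothetical field)
    verdict: Verdict


def evaluate(rules: list[Rule], tool: str, host: str) -> Verdict:
    """Return the verdict of the first matching rule; deny when nothing matches."""
    for rule in rules:
        if fnmatch(tool, rule.tool_pattern) and fnmatch(host, rule.host_pattern):
            return rule.verdict
    return Verdict.DENY  # default-deny: anything not listed is refused


policy = [
    Rule("search_docs", "docs.internal.example", Verdict.ALLOW),
    Rule("send_email", "*", Verdict.PENDING_APPROVAL),
    Rule("http_get", "*.example.com", Verdict.AUDIT),
]

print(evaluate(policy, "search_docs", "docs.internal.example"))
print(evaluate(policy, "send_email", "mail.example.com"))
print(evaluate(policy, "http_get", "attacker.invalid"))
```

The last call matches no rule, so it falls through to deny. That fall-through is what stops a hijacked agent from reaching a host that was never listed.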

Observed prevalence versus potential business impact, mapped by threat plane.

Evaluation against open red-team benchmarks

The company says Guardrails and Firewall ship with an evaluation harness that scores them against more than 80 open-source red-team corpora, each cited and licensed:

•     HarmBench (MIT; ICML 2024), JailbreakBench (NeurIPS 2024), and AdvBench (Zou et al., 2023) for harmful-behavior and jailbreak robustness;

•     NVIDIA's garak (Apache-2.0), the open LLM vulnerability scanner, for injection and encoding attacks;

•     AgentDojo (NeurIPS 2024) — the agent prompt-injection benchmark the US and UK AI Safety Institutes used in joint red-teaming — to grade the action-plane firewall specifically;

•     TruthfulQA and others for grounding and hallucination.

OrcaRouter integrates open tooling directly: OSV for dependency CVEs and Semgrep for code that transits a prompt.

Aligning with incoming regulation

On August 2, 2026, the bulk of the EU AI Act becomes applicable, and “show me” replaces “tell me” as the regulatory baseline. The same evidentiary instinct is spreading through SOC 2 scopes, cyber-insurance questionnaires, and procurement reviews. OrcaRouter ships 36 compliance framework packs, including OWASP LLM Top 10, NIST AI RMF, ISO/IEC 42001, EU AI Act, SOC 2, HIPAA, PCI DSS, and GDPR, that apply controls within a workspace and generate signed evidence. According to the company, one control layer can produce attestation for all of them at once.

What is being released

OrcaRouter Firewall + Guardrails are now free for every user. The controls attach to an API key already in use and do not require a separate integration.

The company said it made the controls free deliberately, citing the report's finding that restricting AI use without an approved alternative tends to increase unsanctioned, or “shadow,” AI rather than reduce it — and that shadow AI already drives one in five breaches at a $670,000 premium (IBM, 2025). The company argues that the response is as much economic as technical: make the governed path the easiest path. A control that carries an extra cost, requires manual integration, and must be justified to a budget committee is, it says, one that many teams will skip.

Guardrails and a Firewall policy attach to an existing key, and the company recommends a staged rollout: observe (run in audit mode and let real traffic write the baseline), shadow (run the real policy in would-block mode until false positives approach zero), then enforce (flip verdicts live, with human approval reserved for the genuinely irreversible). According to the company, most teams move to enforcement within weeks and keep the controls on.
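
The three stages can be sketched as a small function that decides what happens to a request once the policy has produced a verdict. The stage names follow the rollout described above; the `apply_stage` function, the returned fields, and the choice of which verdicts count as blocking are assumptions made for this example:

```python
from enum import Enum


class Stage(Enum):
    OBSERVE = "observe"  # audit mode: log only, let traffic build a baseline
    SHADOW = "shadow"    # evaluate the real policy, record would-block, never block
    ENFORCE = "enforce"  # apply verdicts live


# Verdicts treated as stopping the request in this sketch (an assumption).
BLOCKING_VERDICTS = {"deny", "pending-approval"}


def apply_stage(stage: Stage, verdict: str) -> dict:
    """Map a policy verdict to the request's fate and its log entry at a stage."""
    blocking = verdict in BLOCKING_VERDICTS
    if stage is Stage.OBSERVE:
        return {"forwarded": True, "logged": "audit"}
    if stage is Stage.SHADOW:
        return {"forwarded": True, "logged": "would-block" if blocking else verdict}
    return {"forwarded": not blocking, "logged": verdict}


for stage in Stage:
    print(stage.value, apply_stage(stage, "deny"))
```

Only the enforce stage stops the request. The shadow stage records what would have been blocked, which is the signal a team would watch for false positives before going live.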

Outlook

The report frames the 2026 threat landscape not as a reason to slow AI adoption but as a guide to managing it. Its central argument is that the documented attacks succeed against unscoped authority and fail against scoped, policed, and audited authority — a property the company says can be implemented at the gateway level.

Availability: The Firewall and Guardrails are available now to all OrcaRouter users. The AI Threat Report 2026 is published on the OrcaRouter documentation site.

About OrcaRouter

OrcaRouter is an OpenAI-compatible LLM gateway from Continuum AI Pte. Ltd. (Singapore), routing across 200+ models with around 40% cost reduction, sub-millisecond routing overhead, and zero token markup. A self-hosted edition, OrcaRouter-Lite, is available under the MIT license.

Media contact: Yi Shi · yi@continuum01.ai

About CONTINUUM AI PTE. LTD.
OrcaRouter, built by Singapore-based Continuum AI Pte. Ltd., has launched Routing DSL, a programmable routing framework positioned as a Fable 5 alternative for developers affected by recent export-control restrictions on Claude Fable 5. Instead of replacing one model with another, Routing DSL orchestrates 200+ models with YAML and CEL expressions — routing by prompt complexity, task type, latency, cost, and safety policy, running models in parallel and merging the best result. Built into OrcaRouter's OpenAI-compatible AI Gateway, it offers roughly 40% lower cost, sub-1ms routing overhead, and no token markup, with early internal evaluations suggesting it can reach Fable 5–level performance at a fraction of the cost. Available now to all OrcaRouter users.