
Arc Gate: LLM proxy that hits P=1.00 R=1.00 F1=1.00 on indirect/roleplay prompt injection (beats OpenAI Moderation and LlamaGuard)


Benchmarked on 40 out-of-distribution prompts: indirect requests, roleplay framings, hypothetical scenarios, and technical phrasings. The stuff that slips past everything else.

Arc Gate: P=1.00, R=1.00, F1=1.00
OpenAI Moderation API: P=1.00, R=0.75, F1=0.86
LlamaGuard 3 8B: P=1.00, R=0.55, F1=0.71

Zero false positives. Zero misses. Blocked prompts average 329ms and never reach your model. Detection overhead is ~350ms on top of your normal upstream latency. Sits in front of any OpenAI-compatible endpoint.
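The reported F1 values are internally consistent with the stated precision/recall pairs; a quick sanity check (F1 is the harmonic mean of P and R):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Recompute F1 from the P/R pairs reported above
print(round(f1(1.00, 1.00), 2))  # Arc Gate          -> 1.0
print(round(f1(1.00, 0.75), 2))  # OpenAI Moderation -> 0.86
print(round(f1(1.00, 0.55), 2))  # LlamaGuard 3 8B   -> 0.71
```

With precision pinned at 1.00 for all three systems, the F1 gap is driven entirely by recall, i.e. how many of the 40 adversarial prompts each detector actually caught.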

