PromptIntent-v2: evidence-based AI-security decisions
Most "AI firewalls" answer a yes/no question with a black box. PromptIntent-v2 answers a 7-way intent question with a deterministic detector and a calibrated ML model — then treats its own output as a claim to be measured, not a verdict to be trusted. It ships today in Shadow mode: it watches, scores, and proves itself before it is ever allowed to act.
Two engines, one decision
Deterministic rules for what we can prove; a calibrated model for what we can only infer.
A prompt is not a file. You cannot hash it and match a signature — the same malicious intent can be phrased ten thousand ways. So PromptIntent-v2 is a hybrid: a fast, explainable deterministic detector runs first, and a 7-class ONNX intent classifier runs alongside it as an independent signal. Neither overrides the other blindly — the two are fused into a single decision with a recorded reason.
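To make the fusion idea concrete, here is a minimal sketch of how two independent signals could be combined into one decision with a recorded reason. It is not the product's actual logic, which is not published. Every name in it (`RuleResult`, `ModelResult`, `Decision`, `fuse`, the `suspicious` flag, and the `min_conf` threshold of 0.9) is a hypothetical chosen for illustration. The one property it tries to reflect is that the model alone never turns a prompt into a threat.

```python
from dataclasses import dataclass

# Hypothetical names throughout: the real fusion logic is not public.
ATTACK_INTENTS = {"PromptInjection", "Jailbreak", "SecretExtraction", "ToolCoercion"}


@dataclass(frozen=True)
class RuleResult:
    matched: bool     # a known attack pattern fired
    suspicious: bool  # assumed "weaker grounds" flag: something looked off, no rule fired


@dataclass(frozen=True)
class ModelResult:
    intent: str        # one of the seven intent classes
    confidence: float  # calibrated probability of that class


@dataclass(frozen=True)
class Decision:
    threat: bool
    reason: str  # recorded alongside every decision


def fuse(rules: RuleResult, model: ModelResult, min_conf: float = 0.9) -> Decision:
    """Combine both signals; the rules anchor, the model can only add signal."""
    model_says_attack = model.intent in ATTACK_INTENTS and model.confidence >= min_conf

    if rules.matched:
        extra = f"; model agrees ({model.intent})" if model_says_attack else ""
        return Decision(True, "rule match" + extra)
    if rules.suspicious and model_says_attack:
        return Decision(True, f"rule grounds + model {model.intent} at {model.confidence:.2f}")
    if model_says_attack:
        # The model alone cannot create a threat; keep the signal for review.
        return Decision(False, f"model-only signal ({model.intent}); no rule grounds")
    return Decision(False, "no rule grounds; model sees no attack")
```

In this sketch a confident model prediction with no rule grounds is logged as a signal rather than raised as a threat, which is one way to read "add signal, never fabricate".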
PromptGuard
Pattern- and rule-based. High precision on known attack shapes, fully explainable, zero model dependency. It anchors the decision — the model can add signal, but can never fabricate a threat the rules didn't see grounds for.
Intent classifier
An INT8-quantized ONNX transformer with a native WordPiece tokenizer, calibrated with temperature scaling into honest probabilities across seven intent classes. It catches novel phrasings the rules miss — and reports how confident it is.
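For readers unfamiliar with temperature scaling, the sketch below shows the general technique: divide the logits by a single scalar T before the softmax. T is normally fitted on held-out data, and a value above 1 softens overconfident outputs without changing which class ranks first. The logits and the value T = 2.0 are made up for illustration and say nothing about the shipped model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for seven classes; not taken from the real model.
logits = [4.0, 1.0, 0.5, 0.2, 0.1, 0.0, -0.5]

raw = softmax_with_temperature(logits, 1.0)         # uncalibrated softmax
calibrated = softmax_with_temperature(logits, 2.0)  # illustrative T, normally fitted
```

Because one scalar rescales every logit equally, the predicted class stays the same; only the reported confidence changes.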
Seven intents, not one alarm
The model classifies what a prompt is trying to do — including "benign" and "ambiguous".
Four classes are attacks (PromptInjection, Jailbreak, SecretExtraction, ToolCoercion); two are safe (Benign, AboutSecurity — a security question is not an attack); one is an explicit Ambiguous class, because a model that is never allowed to say "I'm not sure" will manufacture false confidence. Honest uncertainty is a first-class output.
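The taxonomy above maps naturally onto an enum. The class names come from the text; the `Intent` type, the `group` helper, and the group labels are hypothetical names used only for this sketch.

```python
from enum import Enum


class Intent(Enum):
    PROMPT_INJECTION = "PromptInjection"
    JAILBREAK = "Jailbreak"
    SECRET_EXTRACTION = "SecretExtraction"
    TOOL_COERCION = "ToolCoercion"
    BENIGN = "Benign"
    ABOUT_SECURITY = "AboutSecurity"
    AMBIGUOUS = "Ambiguous"


ATTACK = frozenset({
    Intent.PROMPT_INJECTION,
    Intent.JAILBREAK,
    Intent.SECRET_EXTRACTION,
    Intent.TOOL_COERCION,
})
SAFE = frozenset({Intent.BENIGN, Intent.ABOUT_SECURITY})


def group(intent: Intent) -> str:
    """Map an intent to its group: 'attack', 'safe', or 'ambiguous'."""
    if intent in ATTACK:
        return "attack"
    if intent in SAFE:
        return "safe"
    return "ambiguous"  # uncertainty is its own outcome, not a fallback to 'safe'
```

Keeping Ambiguous out of both sets means downstream code has to handle uncertainty explicitly instead of silently rounding it to safe or hostile.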
Shadow-first, by design
The model does not act. It earns the right to, on evidence.
An unproven model that can block is a liability, not a feature. PromptIntent-v2 ships behind a governed enforcement ladder, and this release is hard-locked to the observe rung, which is Shadow mode.
Evidence, not adjectives
"AI-powered" is not a claim; it's a mood. Corxor treats every PromptIntent decision as a measurable event. On the backend, decisions are scored against labels to produce precision, recall, F1, and false-positive rate. The model becomes eligible to move up the ladder only when it clears an explicit confidence gate: a large sample, near-zero false positives, and fully reversible actions. Until then it stays in Shadow. That gate is visible to operators, not buried in a datasheet.
This is one concrete instance of a larger thesis: in the AI era the unit of risk is a decision, and decisions must be defended, measured, and explained. We wrote about why that demands a new architecture — an operating system for AI — in our engineering series.
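The measurement step described in this section can be sketched in a few lines. The metric formulas are the standard ones; everything else is an assumption made for illustration, including the names `Counts`, `count`, `metrics`, and `eligible_for_promotion` and the thresholds `min_samples=10_000` and `max_fpr=0.001`. The real gate values are not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int  # predicted threat, labeled threat
    fp: int  # predicted threat, labeled safe
    tn: int  # predicted safe, labeled safe
    fn: int  # predicted safe, labeled threat


def count(decisions):
    """Tally (predicted_threat, labeled_threat) boolean pairs into Counts."""
    tp = fp = tn = fn = 0
    for predicted, actual in decisions:
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return Counts(tp, fp, tn, fn)


def metrics(c: Counts) -> dict:
    """Standard precision, recall, F1, and false-positive rate (0.0 when undefined)."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    fpr = c.fp / (c.fp + c.tn) if (c.fp + c.tn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


def eligible_for_promotion(c: Counts, actions_reversible: bool,
                           min_samples: int = 10_000, max_fpr: float = 0.001) -> bool:
    """Hypothetical gate: enough samples, near-zero FPR, reversible actions."""
    total = c.tp + c.fp + c.tn + c.fn
    return total >= min_samples and metrics(c)["fpr"] <= max_fpr and actions_reversible
```

A gate written this way is easy to show to operators: the inputs are counts, the thresholds are explicit, and a failed check points at exactly which condition was not met.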
See it in the product. PromptIntent-v2 ships inside QuickSecure and runs on-device — the model itself is proprietary, but every decision it makes is measured, explained, and governed. That's the difference between a claim and evidence.