Skip to content

Security, Privacy & Safety

Protecting the AI surface and the data that flows through it: prompt-injection defenses, data-egress controls, PII/secret redaction, output validation, model supply-chain risk, and AI-specific incident response.

Why it matters for AI in Detection Engineering

Section titled “Why it matters for AI in Detection Engineering”

Detection engineering AI surfaces consume some of the most sensitive data in the org: raw telemetry, threat intel, incident notes, sometimes user content. They also generate code that runs against production data. That combination of privileged input plus consequential output makes the AI surface a high-value target. Prompt injection isn’t theoretical when the model is reading attacker-controlled log lines or threat reports.

Output safety also matters. An agent that opens a detection PR with a query the SIEM happily executes against the wrong index can produce a real outage.

  • L0: None. Not applicable. No AI surface to secure.
  • L1: Experimental. No prompt-injection or data-egress controls. PII, secrets, customer data, and proprietary detection logic may leave the org via public chatbots.
  • L2: Integrated. Data-classification enforced at the AI surface: PII/secret redaction, allowed-source lists, model-routing based on sensitivity. Outputs validated against schema/policy before action. Model and provider SBOM tracked.
  • L3: Autonomous. Prompt-injection defenses red-teamed on a schedule. Output validators block unsafe tool use and dangerous actions. Model supply chain monitored for regressions and provenance. AI-specific incident-response runbooks exercised.
  • Redaction at the wrong layer. Redacting in the UI but logging the raw prompt server-side. The data leaves where the data is sent, not where the user sees it.
  • Trusted log content. Treating log fields as trusted strings inside a prompt. Attacker-controlled fields (user agents, filenames, command lines) can carry injection payloads.
  • Unbounded tools. Giving an agent a tool that can drop indexes, modify detections in production, or send messages to customers, with no validator and no sandbox.
  • No model-regression watch. A model provider silently updates a model and your eval pass rate drops 20%. Without monitoring, the first signal is a production incident.