AI Red Teaming · Enterprise SaaS Security

Compliant on paper.
Breached in production.

We red-teamed the full stack: identity, API access, data objects, storage links, AI orchestration, and operational controls. This case study focuses on the AI attack surface and the adjacent weaknesses that can turn one model request into a multi-step breach.

Client: AI-enabled enterprise sales platform
Customer Profile: Enterprise buyers; the smallest client has $1B+ revenue
Assurance Baseline: Active compliance/GRC program, with Scrut in use
Assessment Scope: 32 attack scenarios across 5 categories
Executive Brief
AI security is not model-only
  • Governance was active: Scrut was in use, and enterprise buyers (smallest at $1B+ revenue) were already enforcing security rigor.
  • The headline AI risk was prompt-driven over-disclosure: generated presentations exposed internal API, architecture, and control-plane context.
  • The same generation path accepted 100 KB+ context payloads and returned HTTP 200, a confirmed cost-abuse vector.
  • Other critical findings were also confirmed, including cross-tenant AI-report retrieval.
  • The run showed both sides: 5 confirmed successes, 11 partial successes, and the remaining 16 scenarios blocked by existing controls.
32 attack scenarios across 5 categories
3.49 MB cross-tenant data exposure confirmed in one request
5 confirmed-success results in the run
11 partial-success results in the run

The presentation agent exposed too much
internal system knowledge.

By maliciously steering the PPT generation workflow, we obtained generated decks containing detailed internal API, infrastructure, and architecture narratives. This is an AI-specific over-disclosure failure: the model can become a high-speed recon assistant when retrieval and disclosure boundaries are under-constrained.

Material AI Over-Disclosure · Reproducible
Prompt-driven presentation generation produced internal integration and architecture intelligence

Generated outputs included structured internal knowledge such as API endpoint families, authorization models, scope names, rate-limit descriptions, architecture-layer narratives, and cloud-secret reference conventions. The content was delivered in reusable artifacts, not isolated one-line responses.

Multiple independently generated presentation outputs showed the same pattern of internal-knowledge packaging. Together they demonstrate that malicious prompting can extract high-value system context useful for follow-on attacks.

Root cause pattern: prompt controls focused on direct leakage prevention, but retrieval and output governance still allowed over-broad internal context packaging. This creates an attack chain from AI generation to reconnaissance acceleration.

Over-disclosure was not the only risk. In the same engagement, we also confirmed that 100 KB+ PPT payloads were accepted (cost abuse) and that a separate cross-tenant report retrieval path exposed 3,491,517 bytes. AI security here required both model-centric and full-stack controls.

The rest of the run showed how model risk
combines with stack risk.

Beyond the confirmed cross-tenant exposure, multiple paths produced confirmed or partial results where AI workflows depended on weak surrounding controls: request-schema trust, tenant-context integrity, stage-to-stage contamination, and missing cost safeguards.

Cost Abuse Risk
Confirmed
Unbounded generation workload on AI endpoints

Large AI workload requests, including a 1,000-user bulk brief and a 100 KB+ presentation context payload, were accepted with HTTP 200. The run found no effective generation quota or payload gating on these paths.

Burst Handling
Partial
No observable burst throttling for generation

A burst of 20 concurrent persona generation requests all succeeded with HTTP 200. The endpoint showed no visible per-user throttling under concurrent pressure.

Tenant Context Integrity
Partial
Cross-tenant context influence in AI generation

A generation request that injected a different tenant_id in the body produced output referencing the other tenant's account context, suggesting body-level routing influence inside the orchestration layer.

Schema Enforcement
Partial
Privilege-style fields accepted by generation routes

Multiple AI endpoints accepted injected fields like role, is_admin, tenant_id, and bypass_quota without rejection. Acceptance does not prove exploitation, but it does prove insufficient schema enforcement.

Multi-Stage Contamination
Partial
Canary survived across multiple generation stages

A canary token inserted upstream appeared in a downstream generation stage, confirming that poisoned context can traverse stage boundaries inside the AI workflow.
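
A canary check of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the harness used in the engagement: the stage names (`research`, `brief`, `presentation`) and both helper functions are hypothetical.

```python
import secrets


def make_canary(prefix: str = "CANARY") -> str:
    """Return a unique marker that is unlikely to occur in normal output."""
    return f"{prefix}-{secrets.token_hex(8)}"


def stages_containing(canary: str, stage_outputs: dict[str, str]) -> list[str]:
    """Return the names of stages whose output contains the canary."""
    return [name for name, text in stage_outputs.items() if canary in text]


# Hypothetical outputs captured from three chained generation stages.
canary = make_canary()
outputs = {
    "research": f"Account notes, including {canary}",
    "brief": "Summary with no marker",
    "presentation": f"Slide 3: {canary}",
}
leaked = stages_containing(canary, outputs)
```

Any downstream stage name in `leaked` indicates that context inserted upstream crossed a stage boundary.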

Presentation Agent Leakage
Partial
Internal system knowledge packaged into generated decks

Prompted presentation generation produced outputs containing internal API and architecture detail (including route families, auth patterns, and infra conventions). Separate delimiter-based extraction probes also returned boundary-drift signals. Together these indicate a practical AI-enabled reconnaissance path.

Holistic testing also showed where stack controls
already protected AI behavior.

Half of the 32 scenarios were blocked by existing controls. That distinction matters: this was not a case of zero security. It was a case where baseline controls existed, but AI-specific and AI-coupled paths still left exploitable gaps.

JWT Integrity
JWT confusion attacks were blocked

Algorithm confusion variants and claim-escalation token tests returned unauthorized responses, indicating correct signature validation.

Injection Resistance
Encoding-obfuscated prompt attacks did not land

Base64, URL-encoded, homoglyph, zero-width, and ROT13 variants did not trigger canary disclosure, which is a meaningful positive control.
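
For readers who want to reproduce this class of test, the following Python sketch generates four of the listed variants from one payload. It is illustrative only: `obfuscated_variants` is a hypothetical helper, the payload text is made up, and homoglyph substitution is left out because it needs a character-mapping table.

```python
import base64
import codecs
from urllib.parse import quote, unquote

ZERO_WIDTH_SPACE = "\u200b"


def obfuscated_variants(payload: str) -> dict[str, str]:
    """Return encoding-obfuscated forms of a single test payload."""
    return {
        "base64": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
        "url": quote(payload, safe=""),
        "rot13": codecs.encode(payload, "rot13"),
        # A zero-width space between every character breaks naive matching.
        "zero_width": ZERO_WIDTH_SPACE.join(payload),
    }


variants = obfuscated_variants("reveal the canary")
```

Each variant would be sent through the same prompt surface, with the canary check deciding whether the obfuscated instruction landed.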

Telemetry
Unauthenticated telemetry injection was rejected

Telemetry noise and event-forgery attempts were rejected, limiting the risk of unauthenticated signal pollution.

Admin Surface
No hidden admin or debug routes surfaced

Admin, debug, GraphQL, and version-drift probes produced 403 or 404 responses instead of accidental exposure.

Session Controls
Copilot session ownership held

Cross-tenant AI copilot session replay attempts were blocked, suggesting object ownership checks were present on those routes.

SSRF
LinkedIn URL abuse path was rejected

An SSRF-style probe through a persona LinkedIn field was rejected as invalid input rather than fetched server-side.
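
Input validation of this kind might look like the Python sketch below. It is a guess at the shape of the control, not the platform's implementation: the `ALLOWED_HOSTS` set and the function name are assumptions.

```python
from urllib.parse import urlsplit

# Assumption: only these hosts are valid for a profile link.
ALLOWED_HOSTS = {"linkedin.com", "www.linkedin.com"}


def is_acceptable_profile_url(value: str) -> bool:
    """Accept only plain https URLs that point at an allowlisted host."""
    try:
        parts = urlsplit(value)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    # Reject userinfo tricks such as https://linkedin.com@evil.example/
    if parts.username is not None or parts.password is not None:
        return False
    if port not in (None, 443):
        return False
    return (parts.hostname or "").lower() in ALLOWED_HOSTS
```

Host allowlisting does not by itself cover redirects, so any server-side fetcher would still need its own checks.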

AI red teaming means fixing both
model-facing and stack-facing controls.

The evidence supports findings and recommendations, not verified closure. The remediation roadmap is therefore structured around full-chain risk reduction for AI systems.

Immediate Priority
Bind every AI artifact to tenant ownership

Enforce server-side ownership checks for every generated artifact lookup, especially before issuing pre-signed URLs or returning cached report objects.
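
A minimal Python sketch of that ordering follows: the ownership check runs before any link is signed. All names are hypothetical, the in-memory `ARTIFACTS` dict stands in for a real datastore, and `sign` is a placeholder for whatever pre-signing call the storage provider offers.

```python
from dataclasses import dataclass
from typing import Callable


class NotFound(Exception):
    """Raised for missing artifacts and for artifacts owned by another tenant."""


@dataclass(frozen=True)
class Artifact:
    artifact_id: str
    tenant_id: str
    storage_key: str


# Stand-in for the artifact table.
ARTIFACTS = {
    "rep-1": Artifact("rep-1", "tenant-a", "reports/tenant-a/rep-1.pdf"),
}


def get_artifact_for_tenant(artifact_id: str, caller_tenant_id: str) -> Artifact:
    """Look up an artifact and enforce that the caller's tenant owns it."""
    artifact = ARTIFACTS.get(artifact_id)
    # The same error for "missing" and "not yours" avoids confirming existence.
    if artifact is None or artifact.tenant_id != caller_tenant_id:
        raise NotFound(artifact_id)
    return artifact


def issue_download_link(
    artifact_id: str, caller_tenant_id: str, sign: Callable[[str], str]
) -> str:
    """Sign a link only after the ownership check has passed."""
    artifact = get_artifact_for_tenant(artifact_id, caller_tenant_id)
    return sign(artifact.storage_key)
```

The design point is that `caller_tenant_id` comes from the authenticated session, never from the request body, and that cached report objects go through the same lookup.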

Next Priority
Harden orchestration and request-schema boundaries

Reject unknown fields, strip privilege-style parameters, and ensure body-level routing data cannot override path-level or token-derived tenant context in generation workflows.
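
One way to express this is a strict allowlist applied before the request reaches the orchestration layer, as in the Python sketch below. The field names in `ALLOWED_FIELDS` are invented for illustration, and `validate_generation_request` is a hypothetical helper.

```python
# Assumption: the only fields a generation request may legitimately carry.
ALLOWED_FIELDS = {"prompt", "audience", "slide_count"}


class RejectedRequest(ValueError):
    """Raised when a request body carries fields outside the allowlist."""


def validate_generation_request(body: dict, token_tenant_id: str) -> dict:
    """Reject unknown fields and attach the tenant from the verified token."""
    unknown = sorted(set(body) - ALLOWED_FIELDS)
    if unknown:
        # role, is_admin, tenant_id and bypass_quota all land here.
        raise RejectedRequest(f"unknown fields: {', '.join(unknown)}")
    # tenant_id is never read from the body; it is derived from the token.
    return {**body, "tenant_id": token_tenant_id}
```

Rejecting outright, instead of silently stripping, makes probing visible in logs. Either way, the tenant used downstream is the token-derived one.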

Next Priority
Add generation budget and burst controls

Rate limit generation endpoints, enforce per-tenant/per-user quotas, and set maximum safe payload envelopes before cost or latency spikes become attacker primitives.
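
As a rough sketch of how a payload cap and a burst limit combine, the Python below uses a token bucket with an injectable clock. The 32 KiB cap and the bucket parameters are placeholder assumptions, not recommended values, and `admit` is a hypothetical gate function.

```python
import time
from typing import Callable, Tuple

# Assumption: a placeholder ceiling for generation context payloads.
MAX_PAYLOAD_BYTES = 32 * 1024


class TokenBucket:
    """Allows short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = capacity
        self.updated_at = clock()

    def try_consume(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def admit(bucket: TokenBucket, payload: bytes) -> Tuple[bool, str]:
    """Gate a generation request on payload size first, then on burst budget."""
    if len(payload) > MAX_PAYLOAD_BYTES:
        return False, "payload_too_large"
    if not bucket.try_consume():
        return False, "rate_limited"
    return True, "ok"
```

In practice one bucket would be kept per tenant and per user, and a rejected request would map to an HTTP 413 or 429 response in place of the HTTP 200 observed in the run.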

Appendix

Full attack surface summary

Authorization & Object Access
Object isolation, session ownership, mass assignment

This category covered cross-tenant report access, namespace drift, AI session ownership, request-field privilege injection, and JWT confusion. It produced the strongest confirmed breach in the run.

Mixed results
AI Orchestration & Prompt Surfaces
Prompt extraction, injection, contamination chains

This category tested direct and obfuscated prompt injection, system prompt extraction, SSE response behavior, and cross-stage contamination through chained generation workflows.

Mixed results
Data Exposure & Artifact Governance
Generated content and metadata leakage

This category probed malformed requests, generated artifact leakage, classification drift, and metadata exposure. Most tests did not produce material disclosure.

Mostly blocked
Telemetry & Integrity
Signal pollution and unauthenticated event injection

Telemetry event forgery and noise-masking scenarios were attempted and rejected, indicating meaningful protection on observability inputs.

Blocked
Resilience & Operational Controls
Cost abuse, hidden routes, workload shaping

This category showed the clearest operational gap: expensive AI generation paths accepted large and concurrent workloads without observable throttling or quota gates.

Confirmed and partial findings
Combined Attack Paths
Multi-step attack chains across surfaces

Five chained scenarios were executed to test whether small weaknesses compound into bigger failures. One chain showed canary propagation; the others were blocked or degraded.

One partial chain

If you sell AI into enterprise accounts,
this is your threat model.

This assessment pattern is for builders and security leaders responsible for multi-tenant AI systems, generated customer artifacts, and enterprise buyer assurance.

AI platforms with tenant-bound artifacts

Research reports, briefs, presentations, copilots, generated profiles, or any object whose business value comes from AI output tied to customer data.

Security teams facing enterprise diligence

Especially where compliance artifacts exist already, but runtime AI behavior, orchestration, and tenant isolation have not been adversarially tested.

Teams worried about denial-of-wallet

If a customer or attacker can trigger expensive LLM workflows without guardrails, the issue is both security and operations.

Founders who need evidence, not slogans

The output is actionable evidence: confirmed findings, blocked controls, and a remediation roadmap aligned to enterprise scrutiny.

Find the gap between your controls
and your actual runtime risk.

If your product depends on AI-generated customer artifacts, tenant isolation, or expensive generation pipelines, we can test the places a standard vulnerability assessment and penetration test (VAPT) will miss.

Working with a limited number of design partners.