From Capability Boundaries to Production Security: Rethinking LLM Application Safety
Why LLM application security must move to the infrastructure layer
By Andrew Zheng •
Why LLM application security must move to the infrastructure layer



Jan 26, 2026
Andrew Zheng
In 2026, a joint research team from Tsinghua University and Infron AI published a landmark paper at NDSS 2026 titled, 'Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries.' This study delivered a sobering message to the industry: the security risks of the LLM application ecosystem are far more systemic than most people realize.
The study analyzed metadata from more than 800,000 LLM applications across four major platforms, including GPTs Store, Coze, AgentBuilder, and Poe, while conducting in-depth evaluations on 199 representative apps. One result stands out: 89.45% of mainstream applications exhibit some form of capability-boundary abuse risk.
In other words, most LLM applications in the wild can be pushed beyond their intended scope, misused for unintended purposes, or become unreliable under adversarial interaction.
This is not a theoretical concern. It is already happening at scale.

LLM application development is fundamentally different from traditional software engineering.
In conventional software, developers mainly implement capabilities. In LLM applications, developers increasingly act as capability constrainters: they must define what the application is allowed to do, what it must not do, and under what conditions.
This shift creates a new class of security risks that go beyond classic “jailbreak” narratives. The NDSS paper categorizes them into three major types:
Carefully crafted or malicious inputs can cause an application to degrade in performance on its core task. The paper documents real-world cases in which LLM-based systems in sensitive domains can be steered away from their intended purpose. In controlled experiments, boundary stress tests caused 23.94% to 35.59% performance degradation across six open-source LLMs.
Applications can be abused to perform tasks far beyond their intended scope. In the evaluated dataset, 72.36% (144 out of 199) applications were able to perform more than 15 different categories of tasks, and 17 applications could execute clearly malicious tasks without any adversarial prompt engineering. This implies that enterprise or platform-hosted LLM apps can be weaponized at near-zero marginal cost.
In some cases, attackers can simultaneously bypass both the application-level constraints and the base model’s safety guardrails, enabling near-arbitrary task execution. The study shows that application platforms can significantly lower the barrier for such attacks compared to directly attacking base models.
The paper introduces the formal concept of an LLM application capability space and identifies ambiguous or poorly defined boundaries as the root cause of these risks.
Several ecosystem-level observations stand out:
A large portion of applications lack explicit or enforceable capability constraints in their prompts.
There is a clear positive correlation between prompt design quality and application robustness.
The “super-developer” phenomenon—where a small number of developers publish thousands of apps—amplifies the spread of low-quality, weakly constrained applications.
In short: today’s LLM app ecosystem largely relies on best-effort, non-enforceable guardrails.
The NDSS research does not argue that LLM applications are inherently unsafe. It argues something more important:
Security in LLM apps must be treated as a first-class, enforceable system property—not a prompt-writing best practice.
This is exactly where infrastructure-level governance becomes necessary.
At Infron AI, we took these findings as a blueprint for how LLM application security must evolve in real-world production systems.
Infron AI positions the intelligent gateway as a policy enforcement point for AI workloads. Instead of relying on every application developer to perfectly define and maintain constraints, the gateway makes security:
Centralized
Enforceable
Observable
Auditable
Inspired by the “capability space” model in the research, Infron AI focuses on making boundaries explicit and enforceable:
Automatic constraint injection: Strengthens and standardizes capability constraints before requests reach upstream models.
Real-time boundary monitoring: Continuously inspects outputs for signs of capability drift or unintended task expansion.
Task-category isolation: Keeps applications within predefined functional scopes.
This shifts security from “best-effort prompt discipline” to system-level policy enforcement.
In parallel, Infron AI applies data governance controls that are increasingly required in regulated environments:
Zero Data Retention (ZDR): Requests and responses are deleted after transit through the gateway.
End-to-end encryption: TLS 1.3 for data in transit and AES-256 for any transient storage.
Fine-grained data routing policies: Enterprises can control which data may be sent to which models or providers.
This makes data protection independent of individual upstream provider policies.
The NDSS paper introduces the LLMApp-Eval framework and prompt-quality dimensions such as TScore, PScore, CaScore, and CoScore.
Building on this idea, Infron AI operationalizes similar concepts into:
Application security scoring (AppScore): Evaluate constraint completeness and prompt structure.
Adversarial input detection: Identify known jailbreak and prompt-injection patterns.
Behavioral anomaly analysis: Inspect responses for signs of downgrade, upgrade, or boundary violations.
Infron AI operates across 60+ model providers. This enables:
Risk isolation and automatic failover when a provider exhibits security or reliability issues.
Model allowlisting based on internal security standards.
Tool and plugin access control for capabilities such as web browsing or image generation.
Security Metric | Traditional Approaches | Infron AI Intelligent Gateway |
|---|---|---|
Capability Upgrade Risk | 89.45% of applications affected | Reduced by ~95% via automatic capability constraint enforcement |
Malicious Task Execution | 17 out of 199 applications execute directly | Zero-tolerance policy with real-time blocking |
Data Leakage Risk | Depends on individual provider policies | Zero Data Retention (ZDR) |
Cross-Platform Security Consistency | Highly inconsistent across platforms | Unified security enforcement standard |
Security Response Time | Minutes to hours | Millisecond-level inline protection |
For enterprises, this approach translates into concrete value:
Compliance readiness: Supports regulated industries and audit processes.
Risk reduction: Reduces the probability and blast radius of misuse and data exposure.
Developer productivity: Single API, OpenAI-compatible, with security enforced transparently.
Operational resilience: 99.9% SLA positioning with cross-provider failover.
The NDSS 2026 research makes one thing clear:
The LLM app ecosystem cannot rely on informal guardrails and prompt discipline alone.
Security must be systemic, enforceable, and continuously monitored. By translating capability-boundary research into infrastructure-level enforcement, Infron AI aims to make LLM applications not only more powerful—but also reliably safe to deploy at enterprise scale.
The future of AI is not defined by model intelligence alone. It is defined by whether AI systems can be trusted in real-world production environments.
Trust is not a matter of intention. It is built on:
Clearly defined and enforceable capability boundaries
Policies that are executed by infrastructure, not merely described in documentation
Controls that are measurable, auditable, and continuously monitored
And system architectures that treat security as a first-class design requirement, not an afterthought
As LLM applications move from experimental demos into mission-critical systems, these requirements are no longer optional. They are prerequisites for operating AI at scale. This is the shift the industry must make. And this is precisely the layer Infron AI is building for: making safe, controllable, and auditable AI the default operating model of production systems, rather than a best-effort aspiration.
Reference: Tsinghua University & Infron AI. "Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries." In Proceedings of the NDSS Symposium 2026.
Infron is a next-generation AI infrastructure platform that gives enterprises a single, unified way to access AI models, with built-in intelligent routing, cost optimization, and reliability guarantees. With one API, companies can connect to more than 100 AI model providers worldwide, reducing AI costs, simplifying vendor management, and improving system reliability.
Ready to simplify your AI infrastructure? Contact the Infron team
In 2026, a joint research team from Tsinghua University and Infron AI published a landmark paper at NDSS 2026 titled, 'Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries.' This study delivered a sobering message to the industry: the security risks of the LLM application ecosystem are far more systemic than most people realize.
The study analyzed metadata from more than 800,000 LLM applications across four major platforms, including GPTs Store, Coze, AgentBuilder, and Poe, while conducting in-depth evaluations on 199 representative apps. One result stands out: 89.45% of mainstream applications exhibit some form of capability-boundary abuse risk.
In other words, most LLM applications in the wild can be pushed beyond their intended scope, misused for unintended purposes, or become unreliable under adversarial interaction.
This is not a theoretical concern. It is already happening at scale.

LLM application development is fundamentally different from traditional software engineering.
In conventional software, developers mainly implement capabilities. In LLM applications, developers increasingly act as capability constrainters: they must define what the application is allowed to do, what it must not do, and under what conditions.
This shift creates a new class of security risks that go beyond classic “jailbreak” narratives. The NDSS paper categorizes them into three major types:
Carefully crafted or malicious inputs can cause an application to degrade in performance on its core task. The paper documents real-world cases in which LLM-based systems in sensitive domains can be steered away from their intended purpose. In controlled experiments, boundary stress tests caused 23.94% to 35.59% performance degradation across six open-source LLMs.
Applications can be abused to perform tasks far beyond their intended scope. In the evaluated dataset, 72.36% (144 out of 199) applications were able to perform more than 15 different categories of tasks, and 17 applications could execute clearly malicious tasks without any adversarial prompt engineering. This implies that enterprise or platform-hosted LLM apps can be weaponized at near-zero marginal cost.
In some cases, attackers can simultaneously bypass both the application-level constraints and the base model’s safety guardrails, enabling near-arbitrary task execution. The study shows that application platforms can significantly lower the barrier for such attacks compared to directly attacking base models.
The paper introduces the formal concept of an LLM application capability space and identifies ambiguous or poorly defined boundaries as the root cause of these risks.
Several ecosystem-level observations stand out:
A large portion of applications lack explicit or enforceable capability constraints in their prompts.
There is a clear positive correlation between prompt design quality and application robustness.
The “super-developer” phenomenon—where a small number of developers publish thousands of apps—amplifies the spread of low-quality, weakly constrained applications.
In short: today’s LLM app ecosystem largely relies on best-effort, non-enforceable guardrails.
The NDSS research does not argue that LLM applications are inherently unsafe. It argues something more important:
Security in LLM apps must be treated as a first-class, enforceable system property—not a prompt-writing best practice.
This is exactly where infrastructure-level governance becomes necessary.
At Infron AI, we took these findings as a blueprint for how LLM application security must evolve in real-world production systems.
Infron AI positions the intelligent gateway as a policy enforcement point for AI workloads. Instead of relying on every application developer to perfectly define and maintain constraints, the gateway makes security:
Centralized
Enforceable
Observable
Auditable
Inspired by the “capability space” model in the research, Infron AI focuses on making boundaries explicit and enforceable:
Automatic constraint injection: Strengthens and standardizes capability constraints before requests reach upstream models.
Real-time boundary monitoring: Continuously inspects outputs for signs of capability drift or unintended task expansion.
Task-category isolation: Keeps applications within predefined functional scopes.
This shifts security from “best-effort prompt discipline” to system-level policy enforcement.
In parallel, Infron AI applies data governance controls that are increasingly required in regulated environments:
Zero Data Retention (ZDR): Requests and responses are deleted after transit through the gateway.
End-to-end encryption: TLS 1.3 for data in transit and AES-256 for any transient storage.
Fine-grained data routing policies: Enterprises can control which data may be sent to which models or providers.
This makes data protection independent of individual upstream provider policies.
The NDSS paper introduces the LLMApp-Eval framework and prompt-quality dimensions such as TScore, PScore, CaScore, and CoScore.
Building on this idea, Infron AI operationalizes similar concepts into:
Application security scoring (AppScore): Evaluate constraint completeness and prompt structure.
Adversarial input detection: Identify known jailbreak and prompt-injection patterns.
Behavioral anomaly analysis: Inspect responses for signs of downgrade, upgrade, or boundary violations.
Infron AI operates across 60+ model providers. This enables:
Risk isolation and automatic failover when a provider exhibits security or reliability issues.
Model allowlisting based on internal security standards.
Tool and plugin access control for capabilities such as web browsing or image generation.
Security Metric | Traditional Approaches | Infron AI Intelligent Gateway |
|---|---|---|
Capability Upgrade Risk | 89.45% of applications affected | Reduced by ~95% via automatic capability constraint enforcement |
Malicious Task Execution | 17 out of 199 applications execute directly | Zero-tolerance policy with real-time blocking |
Data Leakage Risk | Depends on individual provider policies | Zero Data Retention (ZDR) |
Cross-Platform Security Consistency | Highly inconsistent across platforms | Unified security enforcement standard |
Security Response Time | Minutes to hours | Millisecond-level inline protection |
For enterprises, this approach translates into concrete value:
Compliance readiness: Supports regulated industries and audit processes.
Risk reduction: Reduces the probability and blast radius of misuse and data exposure.
Developer productivity: Single API, OpenAI-compatible, with security enforced transparently.
Operational resilience: 99.9% SLA positioning with cross-provider failover.
The NDSS 2026 research makes one thing clear:
The LLM app ecosystem cannot rely on informal guardrails and prompt discipline alone.
Security must be systemic, enforceable, and continuously monitored. By translating capability-boundary research into infrastructure-level enforcement, Infron AI aims to make LLM applications not only more powerful—but also reliably safe to deploy at enterprise scale.
The future of AI is not defined by model intelligence alone. It is defined by whether AI systems can be trusted in real-world production environments.
Trust is not a matter of intention. It is built on:
Clearly defined and enforceable capability boundaries
Policies that are executed by infrastructure, not merely described in documentation
Controls that are measurable, auditable, and continuously monitored
And system architectures that treat security as a first-class design requirement, not an afterthought
As LLM applications move from experimental demos into mission-critical systems, these requirements are no longer optional. They are prerequisites for operating AI at scale. This is the shift the industry must make. And this is precisely the layer Infron AI is building for: making safe, controllable, and auditable AI the default operating model of production systems, rather than a best-effort aspiration.
Reference: Tsinghua University & Infron AI. "Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries." In Proceedings of the NDSS Symposium 2026.
Infron is a next-generation AI infrastructure platform that gives enterprises a single, unified way to access AI models, with built-in intelligent routing, cost optimization, and reliability guarantees. With one API, companies can connect to more than 100 AI model providers worldwide, reducing AI costs, simplifying vendor management, and improving system reliability.
Ready to simplify your AI infrastructure? Contact the Infron team
Why LLM application security must move to the infrastructure layer
By Andrew Zheng •

A Technical Roadmap for R&D Teams

A Technical Roadmap for R&D Teams

Infron's multi-provider security architecture

Infron's multi-provider security architecture

Roleplay Model Comparison Guide

Roleplay Model Comparison Guide
Seamlessly integrate Infron with just a few lines of code and unlock unlimited AI power.

Seamlessly integrate Infron with just a few lines of code and unlock unlimited AI power.

Seamlessly integrate Infron with just a few lines of code and unlock unlimited AI power.
