AI Gateway Commercial vs Open Source: How to Choose the Right Control Plane

Posted on May 15, 2026 by steefjan1970

The AI gateway commercial vs. open-source decision is one that most organizations reach not by planning but by accident. One team has already integrated directly with Azure OpenAI. Another is using LiteLLM to wrap a few models. A third wants to use the enterprise API management platform you already have. Suddenly, you need to make a choice, and the conversation gets complicated fast.

This companion post to my APIM for AI Workloads series takes a step back from the Azure API Management specifics and addresses the question that comes before all of it: which gateway should you be using in the first place? The series covers APIM in depth because it’s the right answer for the Microsoft ecosystem. But it’s not the only answer, and for some organizations it’s not the right one.

Here is how to think through the decision properly.

Why the AI Gateway Commercial vs Open Source Choice Matters More Than You Think

Most API gateway decisions are relatively low-stakes. If you pick the wrong one, you migrate. But the AI gateway decision carries more weight for two reasons.

First, the gateway sits in the critical path of every AI interaction in your organization. Its policy language, authentication model, and observability hooks become embedded in the way your teams build AI-powered applications. Switching later is not impossible, but it is disruptive.

Second, the governance patterns you establish now (how you handle token limits, cross-charging, PII, and compliance logging) are much harder to retrofit than to design in from the start. The Team Rockstars IT AI Gateway whitepaper, published this month, makes this point well: organizations that set up audit logging via an AI gateway from day one build a direct compliance advantage under the EU AI Act. Those who add it later risk complex and costly rework.

So the choice deserves deliberate thought, not a default.

The Commercial Options for AI Gateway

Commercial AI gateways offer a faster path to production and offload operational complexity to the vendor. The main options in the market today are:

Azure API Management is the right choice if you are already in the Microsoft ecosystem. Its AI-specific policy extensions for token limits, token metrics, semantic caching, and load balancing across PTU and PAYG backends are mature and tightly integrated with Azure Monitor and Application Insights. The series covers this in depth from Part 1 onwards.

Kong Konnect is a strong option for organizations that already use Kong for API management and want to extend it into AI. Its plugin ecosystem covers rate limiting, authentication, and observability, with AI-specific plugins growing quickly.

Portkey is purpose-built as an AI gateway with a lightweight footprint and fast time-to-value. It supports a broad range of model providers, has built-in semantic caching and observability, and is a practical option for teams that want AI governance without the overhead of a full enterprise API management platform.

Apigee (Google Cloud) is the natural choice for GCP-centric organizations. Like APIM in the Microsoft world, its AI gateway capabilities are deepening with each release as Google embeds Gemini and Vertex AI integrations.

The common advantages across all commercial options are faster deployment, built-in compliance features, vendor support contracts, and operational burden offloaded to the vendor. The common risks are licensing costs, proprietary policy languages that create switching friction, and dependency on the vendor’s roadmap.

The Open Source Options for AI Gateway

Open-source gateways offer maximum control and no licensing costs, but they require your organization to own what the vendor would otherwise handle.

LiteLLM is the most widely adopted open source AI gateway today. It provides a unified API across more than 100 model providers, with built-in rate limiting, spend tracking, and a proxy server that is straightforward to self-host. The community is active, and the feature velocity is high. The supply chain risk is real, though: a 2025 attack targeting LiteLLM and Trivy demonstrated that even widely used security-adjacent tools can become attack vectors. If you run LiteLLM in production, you own the patching cadence.

Agent Gateway (agentgateway), originally created by Solo.io and now a Linux Foundation project, is purpose-built for MCP and agentic traffic. If your primary use case is governing tool calls from AI agents rather than managing completion API traffic, it is worth evaluating alongside the broader options.

One API provides a unified, OpenAI-compatible interface across multiple providers and is widely used by organizations seeking provider-agnostic routing without vendor lock-in.

HelixML focuses on self-hosted deployments with strong data-sovereignty properties, making it relevant for organizations where data-residency requirements rule out SaaS-based gateway options.

AI Gateway Commercial vs Open Source: Five Decision Factors

Diagram 1: Commercial vs open source AI gateway decision factors. Neither option wins across the board: the right choice depends on your compliance posture, internal capability, and how much operational complexity you want to own.

Five factors consistently determine which direction is right for a given organization:

Time to value. In my experience, commercial gateways can be production-ready in days to weeks. Open source deployments typically take weeks to months to reach production quality, depending on how much custom policy logic you need to build. If you have an urgent compliance or cost control problem to solve, commercial is the pragmatic choice.

Compliance and data residency. For Dutch and European organizations operating under the GDPR (AVG in the Netherlands), NIS2, and the EU AI Act, commercial gateways offer contractual guarantees: data processing agreements, certified regions, and SLAs with defined incident response times. Open source can meet the same requirements, but you are responsible for demonstrating compliance yourself rather than relying on a vendor certification.

Internal platform capability. Open source is not free. The licensing cost is zero, but the operational cost is not: your team owns what the vendor would otherwise handle. Measured against a yardstick such as the CNCF’s platform engineering maturity model, organizations without a dedicated platform engineering team that can credibly own the gateway long-term should not choose open source. The operational gap will become visible at the worst possible moment.

Flexibility and lock-in risk. Open source wins on long-term flexibility. Proprietary policy languages in commercial gateways create switching friction that grows over time as you invest in custom policies. If multi-cloud strategy and provider-agnosticism are strategic priorities, design your gateway layer with that in mind from the start. Even if you begin with a commercial option, you can apply the strangler fig pattern to abstract away proprietary dependencies over time.

Supply chain risk. This factor is underweighted in most evaluations. The 2025 supply chain attack targeting LiteLLM and Trivy demonstrated that open source security tooling itself can become an attack vector. Commercial vendors have contractual obligations around vulnerability disclosure and patching. With open source, that obligation falls to your team.

A Decision Framework for AI Gateway Commercial vs Open Source

Diagram 2: Decision flowchart for choosing between commercial and open source AI gateways. Compliance requirements, internal capability, and cloud ecosystem fit are the three most decisive factors.

The flowchart above works through the most decisive questions in order. A few practical observations from applying it:

Regulated industries almost always land in commercial. Healthcare, financial services, and insurance organizations operating under Dutch or European regulation have compliance requirements that are significantly easier to satisfy with contractual vendor guarantees than with self-operated open source tooling. At my company, the AVG and healthcare-specific data processing requirements made APIM the clear choice.

The hybrid pattern is underused. Many organizations run a commercial gateway in production for governed workloads, while developer teams use LiteLLM or a lightweight open source option in lower environments for experimentation. This gives you the compliance and operational properties you need in production while keeping the innovation surface open. It is more work to maintain two gateway patterns, but the tradeoff is often worth it.

Design for replaceability regardless of what you choose. The Team Rockstars whitepaper frames this well: choose your first gateway deliberately, but design for replacement. Use open standards, abstract your policy logic where possible, and avoid deep coupling to proprietary features without open-source equivalents. The gateway landscape is evolving fast enough that what is the right choice today may not be in two years.

Where This Fits in the APIM for AI Workloads Series

The rest of the series goes deep on Azure API Management specifically: the token metric policy, load balancing and circuit breaking, semantic caching, and the MCP gateway for agentic workloads. If you have landed on APIM as your gateway of choice, or if you are in a Microsoft-centric organization where it is the natural fit, the series covers the production patterns you need.

Part 1: Why your AI APIs need a gateway.
Part 2: Authentication and authorization.
Part 3: Token limit policy.
Part 4: Token metric policy and cross-charging.
Part 5: Load balancing and circuit breaking.
Part 6 (coming June 3): Semantic caching.
Part 7 (coming June 10): APIM as MCP gateway for agentic AI workloads.

Azure API Management for AI: Securing Your AI APIs with Authentication and Authorization

Posted on May 5, 2026 by steefjan1970

Part 2 of 7 in the “APIM for AI Workloads” series

In Part 1 of this series, I made the case for why Azure API Management for AI workloads is the right control plane for governing AI traffic across an organization. This post gets practical: how do you actually secure access to your AI backends with APIM without creating a credential-management nightmare?

Security is where many AI projects cut corners, and understandably so. When you’re moving fast to prove value with a new model, authentication feels like overhead. But AI endpoints are expensive, and an unsecured Azure OpenAI endpoint is a real risk: anyone with the URL and key can start consuming tokens at your cost. At scale, that’s a significant financial and compliance exposure.

APIM addresses this with a three-layer security model. Let’s walk through each layer.

Azure API Management for AI Security: A Three-Layer Model

The authentication and authorization pattern in APIM is deliberately layered. Each layer answers a different question and operates independently, so a failure at any layer stops the request before it reaches the AI backend.

Diagram 1: Three-layer auth in APIM for AI workloads. Layer 1 identifies the caller via subscription key. JWT validation in Layer 2 then determines what they’re permitted to do. Finally, Layer 3 authenticates APIM itself to the AI backend via Managed Identity.

The three layers are:

Subscription keys to identify and track API consumers.
JWT validation to enforce fine-grained access control based on claims.
Managed Identity to authenticate APIM to Azure OpenAI without storing credentials.

Each layer has a distinct role. Confusing them is a common mistake, so it’s worth being explicit about what each one does and does not do.

Layer 1: Subscription Keys

Subscription keys are APIM’s mechanism for identifying API consumers. When you create an API product in APIM and require a subscription, callers must include their key in the Ocp-Apim-Subscription-Key header. APIM validates the key, maps it to a subscriber, and lets the request proceed.

This is important for AI workloads specifically because subscription keys enable per-consumer token tracking. When you combine subscription key validation with the Token Metric policy we’ll cover in Part 4, you get usage data broken down by subscriber, which is the foundation of any internal cross-charging model.
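As a preview of how that per-subscriber breakdown can be wired up, here is a minimal sketch of a token metric policy that tags each data point with the caller's subscription. It assumes the azure-openai-emit-token-metric policy and an APIM instance already connected to Application Insights; the namespace value is a placeholder, not something prescribed by this series.

```xml
<!-- Sketch only: emits token usage metrics with the subscription as a dimension. -->
<!-- The namespace "AzureOpenAI" is an illustrative placeholder. -->
<azure-openai-emit-token-metric namespace="AzureOpenAI">
    <!-- "Subscription ID" is a built-in dimension, so no value expression is needed -->
    <dimension name="Subscription ID" />
    <dimension name="API ID" />
</azure-openai-emit-token-metric>
```

Part 4 goes into the details; the point here is only that the subscription key from Layer 1 is what makes the per-consumer dimension possible.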

Subscription keys answer the question: Who is calling? They don’t answer what the caller is allowed to do. For that, you need JWT validation.

Layer 2: JWT Validation and Claims-Based Authorization

The validate-jwt policy is where you enforce what a caller is permitted to do. It validates the JWT in the Authorization header against your identity provider, and can inspect any claim in the token to make authorization decisions.

For Azure OpenAI specifically, this is where you control which teams or applications can access which model deployments. A team working on an internal chatbot should not be able to call a GPT-4o deployment reserved for a production workload. JWT claims let you enforce that boundary at the gateway layer, with no changes required in the calling application.

A typical policy checks the token signature against your Azure AD tenant’s OpenID Connect configuration, then validates that a required scope or role claim is present.
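A policy of that shape might look like the following sketch. The tenant ID, audience, and role name are placeholders I have made up for illustration; substitute the values from your own app registration.

```xml
<!-- Sketch only: {tenant-id}, {api-app-id} and the role value are hypothetical placeholders. -->
<validate-jwt header-name="Authorization"
              failed-validation-httpcode="401"
              failed-validation-error-message="Unauthorized. Access token is missing or invalid."
              require-scheme="Bearer">
    <!-- Signing keys and issuer metadata are fetched from the tenant's OpenID Connect configuration -->
    <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
    <audiences>
        <audience>api://{api-app-id}</audience>
    </audiences>
    <required-claims>
        <!-- The request is rejected unless the token carries this app role -->
        <claim name="roles" match="any">
            <value>AI.Chat.Invoke</value>
        </claim>
    </required-claims>
</validate-jwt>
```

Using a roles claim here is one option; a scope claim (scp) works the same way for delegated, user-facing flows.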

The failed-validation-httpcode="401" attribute ensures unauthenticated callers get a clean rejection before they ever reach the backend. You can also use failed-validation-error-message to return a specific error message, which helps consumers debug auth failures without exposing internal details.

For multi-provider setups where you’re routing to non-Azure backends like Mistral or Cohere, the same JWT policy applies. The claims model is provider-agnostic, which is one of the advantages of centralizing auth in APIM rather than handling it per-backend.

Layer 3: Managed Identity for Backend Authentication

Managed Identity is the most important security improvement you can make when setting up Azure API Management for AI. It replaces the pattern of storing an Azure OpenAI API key in APIM’s named values with a system-assigned or user-assigned Managed Identity that APIM uses to authenticate directly to Azure OpenAI via Azure AD.

Diagram 2: API key authentication (left) vs. Managed Identity (right). The key difference is that Managed Identity requires no stored credentials anywhere in your configuration.

The practical difference is significant. With API key authentication, you have a long-lived secret that needs to be stored, rotated, and kept out of source control. With Managed Identity, there is no secret. APIM requests a short-lived token from Azure AD at runtime, and Azure AD issues it based on the APIM instance’s identity. Nothing is stored. Nothing can leak.

The configuration is a single policy element in the inbound section: <authentication-managed-identity resource="https://cognitiveservices.azure.com"/>. APIM handles the rest, automatically fetching and refreshing the token.

On the Azure OpenAI side, you grant the APIM instance’s Managed Identity the Cognitive Services User role on the Azure OpenAI resource. That role is sufficient for inference calls; if you want tighter least privilege, the narrower Cognitive Services OpenAI User role is worth considering. You can scope it further to specific deployments if needed.
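Since the identity can be system-assigned or user-assigned, the policy element comes in two variants. The sketch below shows both as alternatives (use one, not both); the client-id value is a placeholder for the client ID of a user-assigned identity attached to your APIM instance.

```xml
<!-- Variant A: system-assigned identity, no client-id needed -->
<authentication-managed-identity resource="https://cognitiveservices.azure.com" />

<!-- Variant B: user-assigned identity; the client-id below is a placeholder -->
<authentication-managed-identity resource="https://cognitiveservices.azure.com"
                                 client-id="00000000-0000-0000-0000-000000000000" />
```

A user-assigned identity can be convenient when several APIM instances (for example, per environment) should share one role assignment on the Azure OpenAI resource.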

For organizations in regulated industries, such as healthcare, financial services, and government, Managed Identity is not optional. It satisfies Zero Trust authentication requirements and produces a full audit trail in Azure Monitor, tied to the APIM instance identity rather than a shared key.

Azure API Management for AI: Putting the Three Layers Together

In a production setup, all three layers run sequentially within the inbound policy pipeline. A request arrives with a subscription key and a JWT. APIM validates the key first (fast, no external call), then validates the JWT against Azure AD, then forwards the request to Azure OpenAI using its Managed Identity token. The AI backend never sees the caller’s JWT, and APIM never stores an API key.
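Put together, the inbound section for such an API could look roughly like the sketch below. It assumes the API or product is configured to require a subscription, so the subscription key check happens before these policies run; the tenant ID and audience are placeholders.

```xml
<policies>
    <inbound>
        <base />
        <!-- Layer 1: the subscription key is checked by APIM itself when the product requires a subscription -->

        <!-- Layer 2: validate the caller's JWT ({tenant-id} and {api-app-id} are placeholders) -->
        <validate-jwt header-name="Authorization" failed-validation-httpcode="401" require-scheme="Bearer">
            <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
            <audiences>
                <audience>api://{api-app-id}</audience>
            </audiences>
        </validate-jwt>

        <!-- Layer 3: replace the caller's Authorization header with APIM's own Managed Identity token -->
        <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
```

The order matters: validate-jwt has to run before authentication-managed-identity, because the latter overwrites the Authorization header that the former inspects.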

The result is a clean separation of concerns:

The calling application manages its own JWT (issued by Azure AD based on its own identity or the user’s identity).
APIM enforces the authorization policy without the backend needing to know anything about it.
The AI backend trusts only APIM’s Managed Identity, not arbitrary callers.

This is the architecture you want before you go to production with any AI workload that touches sensitive data or incurs meaningful cost.

What’s Next in This Series

Part 3 covers the Token Limit policy: how to enforce tokens-per-minute limits per consumer, configure throttling behavior, and handle the differences between the azure-openai-token-limit and llm-token-limit policy variants.

Part 3: Token Limit policy — enforcing tokens-per-minute limits per consumer.
Part 4: Token Metric policy — emitting usage data for observability and cross-charging.
Part 5: Load balancing and circuit breaking across PTU and PAYG backends.
Part 6: Semantic caching — reducing token consumption with similarity-based response reuse.
Part 7: APIM as an MCP gateway for agentic AI workloads.

Cloud Perspectives

Steef-Jan Wiggers
