Azure API Management for AI: Securing Your AI APIs with Authentication and Authorization

Posted on May 5, 2026 by steefjan1970

Part 2 of 7 in the “APIM for AI Workloads” series

In Part 1 of this series, I made the case for why Azure API Management for AI workloads is the right control plane for governing AI traffic across an organization. This post gets practical: how do you actually secure access to your AI backends with APIM without creating a credential-management nightmare?

Security is where many AI projects cut corners, and understandably so. When you’re moving fast to prove value with a new model, authentication feels like overhead. But AI endpoints are expensive, and an unsecured Azure OpenAI endpoint is a real risk: anyone with the URL and key can start consuming tokens at your cost. At scale, that’s a significant financial and compliance exposure.

APIM addresses this with a three-layer security model. Let’s walk through each layer.

Azure API Management for AI Security: A Three-Layer Model

The authentication and authorization pattern in APIM is deliberately layered. Each layer answers a different question and operates independently, so a failure at any layer stops the request before it reaches the AI backend.

Azure API Management for AI three-layer authentication flow showing subscription key, JWT validation and Managed Identity policy pipeline — *Diagram 1: Three-layer auth in APIM for AI workloads.* Layer 1 identifies the caller via subscription key. JWT validation in Layer 2 then determines what they’re permitted to do. Finally, Layer 3 authenticates APIM itself to the AI backend via Managed Identity.

The three layers are:

Subscription keys to identify and track API consumers.
JWT validation to enforce fine-grained access control based on claims.
Managed Identity to authenticate APIM to Azure OpenAI without storing credentials.

Each layer has a distinct role. Confusing them is a common mistake, so it’s worth being explicit about what each one does and does not do.

Layer 1: Subscription Keys

Subscription keys are APIM’s mechanism for identifying API consumers. When you create an API product in APIM and require a subscription, callers must include their key in the Ocp-Apim-Subscription-Key header. APIM validates the key, maps it to a subscriber, and lets the request proceed.

This is important for AI workloads specifically because subscription keys enable per-consumer token tracking. When you combine subscription key validation with the Token Metric policy we’ll cover in Part 4, you get usage data broken down by subscriber, which is the foundation of any internal cross-charging model.

Subscription keys answer the question: Who is calling? They don’t answer what the caller is allowed to do. For that, you need JWT validation.

Layer 2: JWT Validation and Claims-Based Authorization

The validate-jwt policy is where you enforce what a caller is permitted to do. It validates the JWT token in the Authorization header against your identity provider, and can inspect any claim in the token to make authorization decisions.

For Azure OpenAI specifically, this is where you control which teams or applications can access which model deployments. A team working on an internal chatbot should not be able to call a GPT-4o deployment reserved for a production workload. JWT claims let you enforce that boundary at the gateway layer, with no changes required in the calling application.

A typical policy checks the token signature against your Azure AD tenant’s OpenID Connect configuration, then validates that a required scope or role claim is present:

The failed-validation-httpcode=”401″ attribute ensures unauthenticated callers get a clean rejection before they ever reach the backend. You can also use failed-validation-error-message to return a specific error message, which helps consumers debug auth failures without exposing internal details.

For multi-provider setups where you’re routing to non-Azure backends like Mistral or Cohere, the same JWT policy applies. The claims model is provider-agnostic, which is one of the advantages of centralizing auth in APIM rather than handling it per-backend.

Layer 3: Managed Identity for Backend Authentication

Managed Identity is the most important security improvement you can make when setting up Azure API Management for AI. It replaces the pattern of storing an Azure OpenAI API key in APIM’s named values with a system-assigned or user-assigned Managed Identity that APIM uses to authenticate directly to Azure OpenAI via Azure AD.

Azure API Management for AI comparing API key authentication risks versus Managed Identity benefits for Azure OpenAI backend access — *Diagram 2: API key authentication (left) vs. Managed Identity (right). The key difference is that Managed Identity requires no stored credentials anywhere in your configuration.*

The practical difference is significant. With API key authentication, you have a long-lived secret that needs to be stored, rotated, and kept out of source control. With Managed Identity, there is no secret. APIM requests a short-lived token from Azure AD at runtime, and Azure AD issues it based on the APIM instance’s identity. Nothing is stored. Nothing can leak.

The configuration is a single policy element in the inbound section: <authentication-managed-identity resource=”https://cognitiveservices.azure.com”/>. APIM handles the rest, automatically fetching and refreshing the token.

On the Azure OpenAI side, you grant the APIM instance’s Managed Identity the Cognitive Services User role on the Azure OpenAI resource. That’s the minimum required permission. You can scope it further to specific deployments if needed.

For organizations in regulated industries, such as healthcare, financial services, and government, Managed Identity is not optional. It satisfies Zero Trust authentication requirements and produces a full audit trail in Azure Monitor, tied to the APIM instance identity rather than a shared key.

Azure API Management for AI: Putting the Three Layers Together

In a production setup, all three layers run sequentially within the inbound policy pipeline. A request arrives with a subscription key and a JWT. APIM validates the key first (fast, no external call), then validates the JWT against Azure AD, then forwards the request to Azure OpenAI using its Managed Identity token. The AI backend never sees the caller’s JWT, and APIM never stores an API key.

The result is a clean separation of concerns:

The calling application manages its own JWT (issued by Azure AD based on its own identity or the user’s identity).
APIM enforces the authorization policy without the backend needing to know anything about it.
The AI backend trusts only APIM’s Managed Identity, not arbitrary callers.

This is the architecture you want before you go to production with any AI workload that touches sensitive data or incurs meaningful cost.

What’s Next in This Series

Part 3 covers the Token Limit policy: how to enforce tokens-per-minute limits per consumer, configure throttling behavior, and handle the differences between the azure-openai-token-limit and llm-token-limit policy variants.

Part 3: Token Limit policy — enforcing tokens-per-minute limits per consumer.
Part 4: Token Metric policy — emitting usage data for observability and cross-charging.
Part 5: Load balancing and circuit breaking across PTU and PAYG backends.
Part 6: Semantic caching — reducing token consumption with similarity-based response reuse.
Part 7: APIM as an MCP gateway for agentic AI workloads.

My Azure Security Journey so far

Posted on December 26, 2022 by steefjan1970

I like to travel, explore and admire new environments. Similarly, in my day-to-day job, I want to explore new technologies, look at architectural challenges with the solutions I design, and help engineers.

Exploring is my second nature; it’s my curiosity and desire to learn – experience new things. With Cloud Computing, many developments happen daily, including new services, updates, and learnings. I like that, and with my role at InfoQ, I can cover these developments through news stories. Moreover, in my day job, I deal with cloud computing daily, specifically Microsoft Azure and integrating systems through Integration Services.

Exams

An area that got my attention this year was governance and security. I wrote two blogs this year – a blog on secret management in the cloud and one titled a high-level view of governance. In addition, I started exploring resources from Microsoft on Governance and Security on their learning platform. And recently, I planned to prepare for some certifications for that matter with:

Exam SC-900: Microsoft Security, Compliance, and Identity Fundamentals
Exam AZ-500: Microsoft Azure Security Technologies
Exam SC-100: Microsoft Cybersecurity Architect

I passed the first, and the other two are scheduled for Q1 in 2023.

The goal of preparing for the exams is learning more about security, as its an important aspect when designing integration solutions in Azure.

Screenshot showing security design areas.

Source: https://learn.microsoft.com/en-us/azure/architecture/framework/security/overview

Another good source is the well-architected framework: Security Pillar.

New Items

The dominant three public cloud providers, Microsoft, AWS, and Google, provide services and guidance on security on their platforms. As a cloud editor at InfoQ, I sometimes cover stories on their products, open-source initiatives, and architecture. Here’s a list of security and governance-related news items I wrote in 2022:

Source: https://github.com/ine-labs/AzureGoat#module-1

Books

Next to writing news items, my day-to-day job, traveling, and sometimes running, I read books. The security-related books I read and am reading are:

Mastering Azure Security: Keeping your Microsoft Azure workloads safe, 2nd Edition by Mustafa Toroman and Tom Janetscheck. The book introduces different areas, from identity to network security. In addition, there’s a GitHub repo containing code.
And a book Manning Publications pointed out to me called Azure Security, a Manning Early Access Program (MEAP) book by Bojan Magusic.

Another one I might get is a recent book published by APress titled: Azure Security For Critical Workloads: Implementing Modern Security Controls for Authentication, Authorization, and Auditing by Sagar Lad.

Microsoft Valuable Professional Security

Another thing I recently learned is that there is a new award category within the MVP program: Azure Security. The focus for this area lies on contributions in:

Cloud Security in general on Azure, think about Microsoft Azure services like Key Vault, Firewall, Policy, and concepts like Zero Trust Model and Defense in Depth.
Identity & Access, including management, hence Azure Active Directory (AAD) or, in general, Microsoft Entra.
Security Information and Event Management (SIEM) & Extended Detection and Response (XDR) – think about Microsoft’s product Sentinel.

Lastly, I am looking forward to 2023, which will bring me new challenges, destinations to travel to, and hopefully, success in passing the exams I have lined up for myself.

Secret Management in the Cloud

Posted on January 22, 2022 by steefjan1970

I have been using Azure Key Vault for secret management for the last two or three years in my projects or advice my peers, client, and colleagues I work with to do so. Azure Key Vault is a service that provides storing and managing secrets with policies and the ability to access them using .NET code. Moreover, it is not just .NET yet also a service principal that can access it to get a secret for establishing a connection or a pipeline. The secrets can be API keys, connection strings, credentials, certificates, etc. I like to discuss a secret management use case in this blog post and dive into its details.

Use case Key Vault and D365 FO Business Events

In a recent project regarding unlocking data from a Dynamics 365 Finance and Operations (FO) instance, I leveraged the concept of Business Events, where a Logic App subscribes to a specific event published on a custom Event Grid Topic. Let me further explain the scenario and where Key Vault comes into play. Below you see a diagram of integration between D365 FO and third party system. The latter receives data from D365 based upon a specific business event.

Within D365 FO, you can define a destination for a business event. As shown in the diagram, the destination is an Event Grid Topic. When following the Microsoft documentation of Business Events and Event Grid, you will notice that a Key Vault is required to keep the access key of the Event Grid Topic as a secret. Furthermore, you will need to create a so-called App registration in

Azure Active Directory. Azure App registrations are a simple and effective way to configure authentication and authorization workflows for many client types. In this case, a client identifying D365 – allowing access to the Key Vault instance to extract the access key for the custom Event Grid Topic.

Once the app registration is in place, the next step is to add it to the access policies in the Key Vault instance. The registration represents D365, and it needs access to the Key Vault to extract the access key for the Azure Event Grid topic. The app registration only requires the Get and List secret permissions to retrieve the Key Vault secrets.

The endpoint configuration is the next step when the app registration and policy are in place, the custom Event Grid topic is available, and its access key is a secret in Key Vault. The screenshot below shows the configuration of an actual endpoint (destination) for the events – the custom Event Grid topic.

For configuring the endpoint (destination), you need to provide a name. So first, the endpoint type is filled in by default, followed by the endpoint URL (destination endpoint – Event Grid topic URL) and then the details for the Key Vault. These details are the client id of the app registration, its secret, the DNS name of the Key Vault instance, and key vault secret name – which has the secret, i.e., access key to the custom Event Grid topic. And finally, you can press Ok for the creation of the endpoint. You can subsequently attach the endpoint to the necessary business event and activate it when the endpoint is created.

Once the endpoint is active and a specific business event is attached to the endpoint, the event will end up with the subscriber – Logic App. An example of a business event is shown below:

{
  “BusinessEventId”: “PurchaseOrderConfirmedBusinessEvent”,
  “ControlNumber”: 5637365024,
  “EventId”: “9D42A382-12E8-48F6-9BB2-29A1G4E39773”,
  “EventTime”: “/Date(1642759229000)/”,
  “LegalEntity”: “fnl1”,
  “MajorVersion”: 0,
  “MinorVersion”: 0,
  “PurchaseJournal”: “PO1-002342-11”,
  “PurchaseOrderDate”: “/Date(1642723200000)/”,
  “PurchaseOrderNumber”: “PO1-002342”,
  “PurchaseType”: “Purch”,
  “TransactionCurrencyAmount”: 1553.46,
  “TransactionCurrencyCode”: “EUR”,
  “VendorAccount”: “IFF1095”
}

The Logic App can use the details to retrieve more information (through OData calls) about the purchase order in this case. And as shown in the diagram, send the enriched json to a service bus queue to handover the another Logic App to transform it into an XML to be sent to an application Basware (provider of software for financial processes, purchase to pay, and financial management).

Managing Key Vault

To properly set up the process around Key Vault and secrets, the administrator (Azure Ops) is responsible for creating the app registration. The administrator will make the app registration and manage the Key Vault. Moreover, the person is also the one in my view that does the endpoint configuration. Therefore, the integration developer will only need to connect the Logic App to the Event Grid topic. Similarly, the SFTP connection requiring credentials or certificates can also leverage the Key Vault and require the same administrator.

The diagram below shows what the administrator can do regarding the app registration and managing the Key Vault instance. Also, the authentication process is shown from the application side – in our case, creating the endpoint from D365. Finally, D365 will use the app registration to authenticate against Azure AD to retrieve a token necessary to access the key vault secret.

I like to point here regarding this scenario that business events might need to be set up again when a database refresh is done. Note that when the endpoint configuration fails, you can see an error like:

Unable to get secret from Key Vault DNS: <dns of the key vault instance> Secret name: <name secret>

In that case, either the app registration client id or secret is wrong, or worse, the app registration is expired (the error messages will not tell you that!). An app registration expires (the max is two years). Hence, be aware that the events when the app registration is expired will not reach the Event Grid topic, and errors will occur on the D365 side. Therefore, I recommend monitoring the expiration for the app registration, and also, the secrets can have an expiry date – so keep an eye on that too!

Other Cloud Public Cloud Providers

Interestingly, Azure is not the only public cloud platform with a secret certificate and key management service. For example, AWS actually has three services – AWS Secrets Management, AWS Certificate Manager, and AWS CloudHSM. With AWS Secrets Manager, users can manage access to secrets using a fine-grained set of policies, control the lifecycle of secrets, and secure and audit secrets centrally. Furthermore, this is a managed service with a pay-as-you-go model available in most AWS regions. Sound familiar? Azure Key Vault is similar, right? Almost, yet Key Vault has most of the capabilities found in the three earlier mentioned AWS Services.

What about the Google Cloud Platform? Well, on GCP, you will find Secret Manager, which also enables users to store and manage secrets, including policies and rotation. Furthermore, the service offers management of certificates. And lastly, the public cloud has a separate service for key management with Key Management Service (KMS).

Cloud Perspectives

Steef-Jan Wiggers

Category Archives: Security