In the previous post we deployed a working Microsoft Foundry Citadel Platform on Azure Sweden Central, a Governance Hub built on Azure API Management and an Agent Spoke built on Azure AI Foundry. We validated the setup with a raw chat completion call through the APIM gateway. That proved the plumbing works. This post takes the next step: connecting a real tool-calling agent to the Microsoft Foundry Citadel Platform on Azure, using the Open-Meteo weather API as a tool, and showing that every LLM call flows through the hub’s governance layer.
The agent is built with the standard Azure OpenAI SDK pointed directly at the Citadel APIM gateway. It uses a custom function tool that calls the Open-Meteo API to retrieve real current weather data for any location. The governance hub intercepts all traffic: content safety policies fire, token usage is tracked, and telemetry flows into Application Insights. This is the Microsoft Foundry Citadel Platform doing what it is designed to do.
What We Build
The flow looks like this:

Two LLM calls flow through APIM per agent run: the tool decision call and the synthesis call. Both are governed, appear in Application Insights, and contribute to Cosmos DB usage tracking.
Why Open-Meteo and Why the Standard OpenAI SDK
The original plan was to use the Azure AI Foundry Agent Service SDK with Bing Search grounding. Two blockers emerged:
Bing Search SKU eligibility: The Grounding with Bing Search resource (G1 SKU) requires Pay-As-You-Go or EA subscriptions and is not available on MVP or MSDN subscriptions.
AI Foundry Agent Service routing: The azure-ai-projects SDK routes LLM calls through the AI Foundry project’s internal endpoint (aif-tggi2gmkw22w4.openai.azure.com) rather than through APIM, bypassing the governance layer. In addition, even after adding APIM as a connected resource in the AI Foundry portal, the Agent Service does not honor it for model routing in the current preview version.
The solution, therefore, is to use the standard OpenAI Python SDK pointed directly at the APIM gateway endpoint. This guarantees that all traffic flows through the hub; consequently, the tool-calling loop is implemented explicitly in Python, and the governance telemetry is fully captured in Application Insights.
Open-Meteo is a free, open-source weather API; therefore, it requires no API key and returns structured JSON weather data. Additionally, it serves as a clean stand-in for any external API your agents might call in production.
Prerequisites
From the previous post you should have:
- Hub deployed in
rg-ai-hub-gateway-devwith APIM gateway URLhttps://apim-wpvlimv4ngkns.azure-api.netand subscription key - Spoke deployed in
rg-ai-spoke-devwith App Configappcs-tggi2gmkw22w4containingAPIM_GATEWAY_URLandAPIM_SUBSCRIPTION_KEY - Your principal ID with
App Configuration Data Readerrole on the spoke App Config
For this post you additionally need Python 3.11 or later installed locally.
Step 1 — Set Up the Python Environment
mkdir citadel-agent && cd citadel-agentpython -m venv .venv# Windows.venv\Scripts\activatepip install openaipip install azure-appconfigurationpip install azure-identitypip install requests
Step 2 — Read Configuration from App Config
Create config.py using Set-Content to avoid BOM issues on Windows:
$lines = @( "from azure.appconfiguration import AzureAppConfigurationClient", "from azure.identity import DefaultAzureCredential", "", "APP_CONFIG_ENDPOINT = 'https://appcs-tggi2gmkw22w4.azconfig.io'", "LABEL = 'ai-lz'", "", "def get_config() -> dict:", " credential = DefaultAzureCredential()", " client = AzureAppConfigurationClient(", " base_url=APP_CONFIG_ENDPOINT,", " credential=credential", " )", " keys = [", " 'AI_FOUNDRY_PROJECT_ENDPOINT',", " 'CHAT_DEPLOYMENT_NAME',", " 'APIM_GATEWAY_URL',", " 'APIM_SUBSCRIPTION_KEY',", " ]", " config = {}", " for key in keys:", " setting = client.get_configuration_setting(key=key, label=LABEL)", " config[key] = setting.value", " return config", "", "if __name__ == '__main__':", " cfg = get_config()", " for k, v in cfg.items():", " print(f'{k}: {v[:30]}...')")[System.IO.File]::WriteAllLines("$PWD\config.py", $lines, [System.Text.UTF8Encoding]::new($false))
Test it:
python config.py
All four keys should return truncated values. If you get a 403, wait 2–5 minutes for role assignment propagation and retry.
Pitfall: Always Use WriteAllLines for Python Files on Windows
Out-File -Encoding utf8NoBOM and @"..."@ | Out-File both add a BOM on some Windows PowerShell versions, causing Python to throw SyntaxError: Non-UTF-8 code starting with '\xff'. Use [System.IO.File]::WriteAllLines with [System.Text.UTF8Encoding]::new($false) to write files without BOM.
Step 3 — Define the Weather Tool
Create tools.py:
$lines = @( "import json", "import requests", "", "def get_weather(location: str) -> str:", " try:", " geo = requests.get(", " 'https://geocoding-api.open-meteo.com/v1/search',", " params={'name': location, 'count': 1, 'language': 'en', 'format': 'json'},", " timeout=10", " )", " geo.raise_for_status()", " geo_data = geo.json()", " if not geo_data.get('results'):", " return json.dumps({'error': f'Location not found: {location}'})", " r = geo_data['results'][0]", " weather = requests.get(", " 'https://api.open-meteo.com/v1/forecast',", " params={'latitude': r['latitude'], 'longitude': r['longitude'], 'current_weather': True, 'wind_speed_unit': 'kmh', 'timezone': 'auto'},", " timeout=10", " )", " weather.raise_for_status()", " c = weather.json()['current_weather']", " codes = {0:'Clear sky',1:'Mainly clear',2:'Partly cloudy',3:'Overcast',45:'Foggy',61:'Slight rain',63:'Moderate rain',65:'Heavy rain',71:'Slight snow',80:'Showers',95:'Thunderstorm'}", " return json.dumps({'location': f'{r[chr(110)+(chr(97)+chr(109)+chr(101))]}, {r.get(chr(99)+chr(111)+chr(117)+chr(110)+chr(116)+chr(114)+chr(121),chr(32))}', 'temperature_celsius': c['temperature'], 'wind_speed_kmh': c['windspeed'], 'wind_direction_degrees': c['winddirection'], 'condition': codes.get(c['weathercode'],'Unknown'), 'is_day': bool(c['is_day'])})", " except Exception as e:", " return json.dumps({'error': str(e)})", "", "WEATHER_TOOL_DEFINITION = {", " 'type': 'function',", " 'function': {", " 'name': 'get_weather',", " 'description': 'Get current weather for a location. Returns temperature in Celsius, wind speed, condition.',", " 'parameters': {", " 'type': 'object',", " 'properties': {'location': {'type': 'string', 'description': 'City name e.g. Stockholm'}},", " 'required': ['location']", " }", " }", "}")[System.IO.File]::WriteAllLines("$PWD\tools.py", $lines, [System.Text.UTF8Encoding]::new($false))
Test it:
python -c "from tools import get_weather; print(get_weather('Stockholm'))"
Step 4 — Create the Agent
Create agent.py using the standard openai SDK pointed directly at the APIM gateway:
$lines = @( "import json", "from openai import AzureOpenAI", "from config import get_config", "from tools import get_weather, WEATHER_TOOL_DEFINITION", "", "def run_agent(user_question: str) -> str:", " cfg = get_config()", "", " # Strip /openai suffix - AzureOpenAI SDK adds it automatically", " apim_base = cfg['APIM_GATEWAY_URL'].rstrip('/').replace('/openai', '')", "", " client = AzureOpenAI(", " azure_endpoint=apim_base,", " api_key=cfg['APIM_SUBSCRIPTION_KEY'],", " api_version='2024-02-01',", " )", "", " messages = [{'role': 'user', 'content': user_question}]", " print(f'Sending request via APIM: {apim_base}')", "", " # First LLM call - agent decides whether to use the tool", " response = client.chat.completions.create(", " model=cfg['CHAT_DEPLOYMENT_NAME'],", " messages=messages,", " tools=[WEATHER_TOOL_DEFINITION],", " tool_choice='auto',", " )", "", " msg = response.choices[0].message", " messages.append(msg)", "", " # Handle tool calls if the agent decided to use get_weather", " if msg.tool_calls:", " for tool_call in msg.tool_calls:", " args = json.loads(tool_call.function.arguments)", " print(f' -> Tool call: get_weather({args})')", " result = get_weather(**args)", " print(f' -> Tool result: {result}')", " messages.append({", " 'role': 'tool',", " 'tool_call_id': tool_call.id,", " 'content': result,", " })", "", " # Second LLM call - synthesise grounded response", " response = client.chat.completions.create(", " model=cfg['CHAT_DEPLOYMENT_NAME'],", " messages=messages,", " )", " return response.choices[0].message.content", "", " return msg.content", "", "if __name__ == '__main__':", " question = 'What is the weather like in Stockholm right now?'", " print(f'Question: {question}')", " answer = run_agent(question)", " print(f'Answer: {answer}')")[System.IO.File]::WriteAllLines("$PWD\agent.py", $lines, [System.Text.UTF8Encoding]::new($false))
Run it:
python agent.py
A successful run looks like this:

Pitfall: APIM Endpoint Format
The AzureOpenAI SDK constructs the full path as {azure_endpoint}/openai/deployments/{model}/chat/completions. If your APIM_GATEWAY_URL in App Config contains /openai at the end, strip it before passing to the client; otherwise, the SDK builds a doubled path (/openai/openai/...) that returns a 500 from APIM. The line apim_base = cfg['APIM_GATEWAY_URL'].rstrip('/').replace('/openai', '') handles this automatically.
After running the agent, check Application Insights in the hub:
az monitor app-insights query ` --app <Your APIM instance Name> ` --resource-group rg-ai-hub-gateway-dev ` --analytics-query "requests | where timestamp > ago(10m) | project timestamp, name, resultCode, duration | order by timestamp desc" ` --output table
Pitfall: CLI vs Portal Ingestion Lag
The CLI query hits the Log Analytics store; however, it has a 5–10-minute ingestion lag. In contrast, the Azure Portal Application Insights blade uses a live metrics path and shows results immediately. Therefore, if the CLI returns an empty response, it’s a good idea to check the portal directly, go to the APIM instance → Performance to view requests in real time.
What Governed Traffic Looks Like in the Portal
The Application Insights Performance blade shows two operation types per agent run:
azure-openai-service-api:rev=1 - ChatCompletions_Create— the APIM policy-matched operation, showing the governed calls with content safety appliedPOST /openai/openai/deployments/chat/chat/completions— the raw endpoint calls
Each agent run generates two successful requests (tool decision + synthesis), both with response code 200 and latency around 900ms–1.2s for gpt-4o. Failed attempts from earlier endpoint format issues show as 500s and are clearly distinguishable.

The Azure AI Foundry Agent Service SDK — What We Learned
For completeness, here is a summary of what we discovered when attempting to use the azure-ai-projects SDK before switching to the standard OpenAI SDK:
| Issue | Detail |
|---|---|
FunctionTool import path | Must import from azure.ai.agents.models, not azure.ai.projects.models |
create_thread does not exist | Use create_thread_and_process_run instead |
list_messages does not exist | Use client.agents.messages.list(thread_id=...) |
MessageRole.ASSISTANT does not exist | Use the string "assistant" directly |
enable_auto_function_calls(toolset=...) fails | Parameter is tools=, not toolset= |
| Function not found error | Call client.agents.enable_auto_function_calls(tools=toolset) before create_agent |
| Agent traffic bypasses APIM | AI Foundry Agent Service uses its own endpoint resolution — use standard OpenAI SDK pointed at APIM instead |
The Agent Service SDK is in active beta development (azure-ai-agents==1.2.0b6 at the time of writing). Expect these APIs to stabilise and the APIM routing issue to be addressed in future versions.
Pitfalls Summary
| Pitfall | Fix |
|---|---|
| Grounding with Bing Search G1 SKU not eligible | Requires Pay-As-You-Go or EA subscription |
Bing.Search.v7 CLI creation fails | Resource type moved to Microsoft.Bing/accounts |
| BOM in Python files on Windows | Use [System.IO.File]::WriteAllLines with UTF8Encoding($false) |
APIM endpoint doubles /openai path | Strip /openai from URL before passing to AzureOpenAI client |
| App Config 403 on first run | Wait 2–5 minutes for role assignment propagation |
| CLI Application Insights query empty | 5–10 minute ingestion lag — check portal Performance blade instead |
| AI Foundry Agent Service bypasses APIM | Use standard openai SDK pointed directly at APIM gateway |
What the Full Citadel Loop Delivers
With the agent running through APIM, every LLM call in the tool-calling loop is governed:
Content Safety — both the user question and the synthesised response pass through Azure AI Content Safety policies configured in APIM.
Token tracking — each of the two LLM calls contributes to the token usage log in Cosmos DB, giving you per-call cost attribution by APIM subscription key. The Cosmos DB ai-usage-container in the hub captures a structured document for each LLM call, including the model version, token counts, gateway region, request IP, APIM subscription name, backend routing, and timestamp. In production, the productName field maps to the APIM subscription key. Aggregating documents by this field gives you direct FinOps reporting per AI initiative.

Latency observability — Application Insights captures the duration of every call, making it easy to identify slow tool calls or model latency spikes.
Audit trail — every request is logged with timestamp, operation name, response code, and duration. For a healthcare or financial services context, this is your compliance evidence.
What’s Next
This post wires a tool-calling agent to the Citadel hub using the standard OpenAI SDK. The natural next steps:
Azure AI Foundry Agent Service routing — as the SDK matures, the azure-ai-projects client will likely gain proper APIM gateway support. Watch the azure-ai-agents release notes for updates on connection-based routing.
Conversation persistence — store conversation history in the Cosmos DB conversations container already deployed in the spoke. The App Config key CONVERSATIONS_DATABASE_CONTAINER points to it.
Network isolation — re-enable networkIsolation=true in the spoke parameters to route all traffic through private endpoints.
Multiple tools — extend the agent with additional function tools (document lookup, product catalog, claims system) using the same pattern. Each tool call flows through APIM and is governed identically.
Conclusion
Connecting a real tool-calling agent to the Microsoft Foundry Citadel Platform on Azure requires three components: the standard OpenAI SDK configured to point to the APIM gateway, a function tool with a JSON schema definition, and an explicit tool-call-handling loop. Everything else, governance, content safety, token tracking, and cost attribution, is handled by the Citadel hub automatically.
The path to get here involved navigating several SDK beta rough edges and discovering that the AI Foundry Agent Service bypasses APIM in its current preview form. These are expected friction points with a platform in active development. The governance architecture underneath is sound, the APIM policies work, and the Application Insights telemetry confirms it.
Two LLM calls. Both governed. Both visible. That is what the Citadel hub delivers.