Conversation
There was a problem hiding this comment.
Pull request overview
Introduces a full Python “Weather Agent” sample using the Microsoft 365 Agents SDK, demonstrating Azure OpenAI-driven tool calling (OpenWeatherMap + datetime) plus OpenTelemetry/Aspire observability in an aiohttp host.
Changes:
- Added tool modules for current weather, forecast, and current date/time responses.
- Added an aiohttp app entrypoint wiring the agent, auth middleware, and telemetry.
- Added a telemetry package providing OpenTelemetry configuration, middleware, metrics, and A365 baggage propagation.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| test_samples/weather-agent-framework/tools/weather_tools.py | Implements current weather + 5-day forecast tool functions using OpenWeatherMap (pyowm). |
| test_samples/weather-agent-framework/tools/datetime_tools.py | Adds a simple datetime tool for “current time” queries. |
| test_samples/weather-agent-framework/tools/init.py | Exposes tool functions for import. |
| test_samples/weather-agent-framework/telemetry/agent_otel_extensions.py | Adds OpenTelemetry setup, exporters, auto-instrumentation, and aiohttp tracing middleware/health routes. |
| test_samples/weather-agent-framework/telemetry/agent_metrics.py | Adds custom counters/histograms and span helpers used by the agent. |
| test_samples/weather-agent-framework/telemetry/a365_otel_wrapper.py | Adds A365-style baggage propagation wrapper for operations. |
| test_samples/weather-agent-framework/telemetry/init.py | Public telemetry package surface for app/agent imports. |
| test_samples/weather-agent-framework/requirements.txt | Adds dependencies for the sample (SDK, OpenAI/Azure, OTel, pyowm). |
| test_samples/weather-agent-framework/app.py | Creates the aiohttp app, config, middleware, and message endpoints. |
| test_samples/weather-agent-framework/agents/weather_agent.py | Implements WeatherAgent (Azure OpenAI orchestration, tool calls, conversation history). |
| test_samples/weather-agent-framework/agents/init.py | Exposes WeatherAgent for imports. |
| test_samples/weather-agent-framework/README.md | Documents setup, configuration, running, and Aspire dashboard usage. |
| test_samples/weather-agent-framework/.env.example | Provides environment variable template for OpenAI, weather API, auth, and telemetry. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for i, (date, weathers) in enumerate(list(daily_forecasts.items())[:5]): | ||
| # Get temperature range for the day | ||
| temps = [w.temperature('fahrenheit').get('temp', 0) for w in weathers] | ||
| temp_min = min(temps) if temps else 'N/A' | ||
| temp_max = max(temps) if temps else 'N/A' | ||
|
|
||
| # Get most common weather condition | ||
| statuses = [w.detailed_status for w in weathers] | ||
| status = max(set(statuses), key=statuses.count) if statuses else 'Unknown' | ||
|
|
||
| result_lines.append( | ||
| f"Day {i+1} ({date}):\n" | ||
| f" Low: {temp_min:.1f}°F / High: {temp_max:.1f}°F\n" | ||
| f" Conditions: {status.capitalize()}" | ||
| ) |
There was a problem hiding this comment.
temp_min / temp_max can become the string "N/A" when temps is empty, but the f-string formats them with :.1f, which will raise TypeError. Consider keeping these values numeric (e.g., None) and branching formatting per case, or formatting as strings without numeric format specifiers when data is missing.
| conditions, humidity, and wind speed. | ||
|
|
||
| Raises: | ||
| ValueError: If the location cannot be found or API key is invalid. |
There was a problem hiding this comment.
The docstring says this function raises ValueError, but the implementation catches all exceptions and returns an error string instead. Either update the docstring to match the “returns error string” behavior, or narrow the exception handling and re-raise a ValueError so callers can handle failures reliably.
| conditions, humidity, and wind speed. | |
| Raises: | |
| ValueError: If the location cannot be found or API key is invalid. | |
| conditions, humidity, and wind speed. If the location cannot be found or the | |
| API key is invalid, a human-readable error message string is returned instead | |
| of raising an exception. |
| except Exception as e: | ||
| error_msg = f"Unable to retrieve weather for {location}, {state}: {str(e)}" | ||
| print(error_msg) | ||
| return error_msg |
There was a problem hiding this comment.
The docstring says this function raises ValueError, but the implementation catches all exceptions and returns an error string instead. Either update the docstring to match the “returns error string” behavior, or narrow the exception handling and re-raise a ValueError so callers can handle failures reliably.
| def configure_opentelemetry( | ||
| app_name: str = "A365.AgentFramework", | ||
| environment: str = "development", | ||
| otlp_endpoint: Optional[str] = None, | ||
| azure_monitor_connection_string: Optional[str] = None, | ||
| ) -> None: |
There was a problem hiding this comment.
configure_opentelemetry() accepts app_name, but the OpenTelemetry Resource is always created with SERVICE_NAME: _SOURCE_NAME, meaning the provided service name never takes effect in traces/metrics/logs. Use app_name for SERVICE_NAME (and keep _SOURCE_NAME as the tracer/meter source name) so the sample’s AGENT_NAME shows up correctly in dashboards.
| resource = Resource.create( | ||
| { | ||
| SERVICE_NAME: _SOURCE_NAME, | ||
| SERVICE_VERSION: "1.0.0", | ||
| SERVICE_INSTANCE_ID: socket.gethostname(), | ||
| "deployment.environment": environment, | ||
| "service.namespace": _SERVICE_NAMESPACE, | ||
| } | ||
| ) |
There was a problem hiding this comment.
configure_opentelemetry() accepts app_name, but the OpenTelemetry Resource is always created with SERVICE_NAME: _SOURCE_NAME, meaning the provided service name never takes effect in traces/metrics/logs. Use app_name for SERVICE_NAME (and keep _SOURCE_NAME as the tracer/meter source name) so the sample’s AGENT_NAME shows up correctly in dashboards.
| span.set_status(StatusCode.ERROR, str(ex)) | ||
| span.record_exception(ex) |
There was a problem hiding this comment.
Same Span.set_status(...) issue as in agent_otel_extensions.py: OpenTelemetry expects a Status object, not a StatusCode enum. This affects success/error status reporting in multiple helpers and can break observability or raise runtime errors.
| self.openai_client = AzureOpenAI( | ||
| azure_endpoint=endpoint, | ||
| azure_ad_token_provider=DefaultAzureCredential(), |
There was a problem hiding this comment.
azure_ad_token_provider is expected to be a token provider callable, but DefaultAzureCredential() is a credential instance. This is likely to fail when the OpenAI client tries to call it. Wrap DefaultAzureCredential().get_token(...) in a function compatible with the OpenAI SDK’s token-provider contract (or switch to the SDK’s recommended Azure AD auth pattern for AzureOpenAI).
| self.openai_client = AzureOpenAI( | |
| azure_endpoint=endpoint, | |
| azure_ad_token_provider=DefaultAzureCredential(), | |
| credential = DefaultAzureCredential() | |
| def azure_ad_token_provider() -> str: | |
| # Obtain an access token for Azure Cognitive Services | |
| token = credential.get_token("https://cognitiveservices.azure.com/.default") | |
| return token.token | |
| self.openai_client = AzureOpenAI( | |
| azure_endpoint=endpoint, | |
| azure_ad_token_provider=azure_ad_token_provider, |
| except Exception as e: | ||
| print(f"Error: {e}") | ||
| traceback.print_exc() | ||
| await context.send_activity(f"Sorry, I encountered an error: {str(e)}") |
There was a problem hiding this comment.
Returning the raw exception message to end users can leak sensitive details (configuration, internal endpoints, stack hints). Consider sending a generic failure message to the user and logging the exception details server-side instead.
| await context.send_activity(f"Sorry, I encountered an error: {str(e)}") | |
| await context.send_activity( | |
| "Sorry, I ran into a problem while processing your request. Please try again later." | |
| ) |
| # Provides AgentFrameworkInstrumentor used in telemetry/agent_otel_extensions.py | ||
| microsoft-agents-a365-runtime==0.2.1.dev33 | ||
| microsoft-agents-a365-observability-core==0.2.1.dev33 | ||
| microsoft-agents-a365-observability-extensions-agent-framework>=0.2.1.dev32 |
There was a problem hiding this comment.
Mixing pinned prerelease (.dev) versions with a >= constraint can make installs non-reproducible and harder to debug. Consider pinning all three to a consistent compatible version range (or documenting why these exact prerelease builds are required for the sample).
| microsoft-agents-a365-observability-extensions-agent-framework>=0.2.1.dev32 | |
| microsoft-agents-a365-observability-extensions-agent-framework==0.2.1.dev33 |
|
|
||
| ## Prerequisites | ||
|
|
||
| - **Python 3.11+** (3.10+ supported) |
There was a problem hiding this comment.
This line is internally inconsistent (3.11+ vs 3.10+). Clarify the minimum supported Python version (and, if relevant, which features/dependencies require 3.11 vs remain compatible with 3.10).
| - **Python 3.11+** (3.10+ supported) | |
| - **Python 3.10+** (3.11+ recommended) |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| resource = Resource.create( | ||
| { | ||
| SERVICE_NAME: _SOURCE_NAME, | ||
| SERVICE_VERSION: "1.0.0", | ||
| SERVICE_INSTANCE_ID: socket.gethostname(), | ||
| "deployment.environment": environment, | ||
| "service.namespace": _SERVICE_NAMESPACE, | ||
| } | ||
| ) | ||
|
|
||
| _configure_tracing(resource, app_name, otlp_endpoint, azure_monitor_connection_string) | ||
| _configure_metrics(resource, otlp_endpoint, azure_monitor_connection_string) | ||
| _configure_logging(resource, otlp_endpoint) | ||
| instrument_libraries() |
There was a problem hiding this comment.
configure_opentelemetry() accepts app_name, but the Resource sets SERVICE_NAME to the constant _SOURCE_NAME and the app_name value is never applied. This makes the parameter misleading and prevents callers from controlling the displayed service name in dashboards; consider using app_name for SERVICE_NAME (or remove the parameter).
| def get_weather_forecast_for_location( | ||
| location: Annotated[str, Field(description="The city name")], | ||
| state: Annotated[str, Field(description="The US state name or empty string for international cities")] | ||
| ) -> str: |
There was a problem hiding this comment.
Same contract mismatch as the current-weather tool: state is intended to be optional, but the function signature requires it. Consider defaulting state to an empty string to align with the JSON schema/tooling and avoid accidental TypeError from callers.
| except Exception as ex: | ||
| success = False | ||
| span.set_status(StatusCode.ERROR, str(ex)) | ||
| span.record_exception(ex) | ||
| raise |
There was a problem hiding this comment.
Same issue here: span.set_status(StatusCode.ERROR, ...) likely needs a Status(StatusCode.ERROR, ...) object (depending on the OTel version). Passing StatusCode directly can break span status reporting and may throw at runtime.
| def get_current_weather_for_location( | ||
| location: Annotated[str, Field(description="The city name")], | ||
| state: Annotated[str, Field(description="The US state name or empty string for international cities")] | ||
| ) -> str: |
There was a problem hiding this comment.
state is described as optional ("empty string for international cities") and the tool schema in WeatherAgent does not require it, but the function signature requires the argument with no default. Giving state a default of "" would better match the intended contract and avoid call-site errors if invoked without state.
| except Exception as e: | ||
| error_msg = f"Unable to retrieve forecast for {location}, {state}: {str(e)}" | ||
| print(error_msg) | ||
| return error_msg |
There was a problem hiding this comment.
This broad except returns an error string, which contradicts the docstring’s Raises: ValueError and makes failure handling non-typed. Consider catching specific exceptions (e.g., not-found vs auth) and either raising or returning a structured result consistently.
| Raises: | ||
| ValueError: If the location cannot be found or API key is invalid. | ||
| """ |
There was a problem hiding this comment.
Docstring states this function raises ValueError, but the implementation catches all exceptions and returns an error string instead. Align the documented contract with actual behavior (either raise on failure or document/standardize the string error return).
| Raises: | ||
| ValueError: If the location cannot be found or API key is invalid. | ||
| """ |
There was a problem hiding this comment.
Docstring says this function raises ValueError, but the implementation catches all exceptions and returns an error string instead. Please align the documented error contract with the implementation (raise vs. return error text).
| response = self.openai_client.chat.completions.create( | ||
| model=self._deployment, | ||
| messages=messages, | ||
| tools=_TOOLS, | ||
| tool_choice="auto", | ||
| temperature=0.2, | ||
| ) |
There was a problem hiding this comment.
This async handler makes synchronous network calls (self.openai_client.chat.completions.create(...)). In an aiohttp app this will block the event loop under load. Prefer using AsyncAzureOpenAI (or the async client variant for your OpenAI SDK version), or offload the sync call via asyncio.to_thread/an executor.
| if function_name == "get_current_weather_for_location": | ||
| result = get_current_weather_for_location( | ||
| location=function_args.get("location", ""), | ||
| state=function_args.get("state", ""), | ||
| ) | ||
| elif function_name == "get_weather_forecast_for_location": | ||
| result = get_weather_forecast_for_location( | ||
| location=function_args.get("location", ""), | ||
| state=function_args.get("state", ""), | ||
| ) | ||
| elif function_name == "get_date_time": |
There was a problem hiding this comment.
The tool calls here (get_current_weather_for_location, get_weather_forecast_for_location) perform synchronous I/O and are invoked from an async flow. This will block the event loop during external HTTP requests. Consider making these tools async (using an async HTTP client) or running them in a thread via asyncio.to_thread when called from the agent.
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST=".*" | ||
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE=".*" | ||
|
|
||
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST=".*" | ||
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE=".*" |
There was a problem hiding this comment.
Using OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_*=".*" will cause OpenTelemetry to capture and export all HTTP request and response headers (including Authorization, cookies, and API keys) into telemetry. This can leak access tokens and other secrets into backends like the Aspire dashboard or Application Insights, especially when combined with options that enable sensitive data logging. Restrict header capture to a small, non-sensitive allowlist or rely on the default configuration that excludes sensitive headers instead of using a catch‑all pattern.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Enable otel logs on AgentFramework SDK. Required for auto instrumentation | ||
| ENABLE_OTEL=true | ||
| ENABLE_SENSITIVE_DATA=true |
There was a problem hiding this comment.
ENABLE_SENSITIVE_DATA=true is a risky default for a sample because it encourages exporting potentially sensitive payloads/metadata to telemetry backends. Consider defaulting this to false (and documenting when it’s safe to enable).
| resource = Resource.create( | ||
| { | ||
| SERVICE_NAME: _SOURCE_NAME, | ||
| SERVICE_VERSION: "1.0.0", | ||
| SERVICE_INSTANCE_ID: socket.gethostname(), |
There was a problem hiding this comment.
configure_opentelemetry() accepts app_name, but the resource is created with SERVICE_NAME set to the constant _SOURCE_NAME. This ignores the caller-provided service name and can confuse dashboards; set SERVICE_NAME from app_name (and keep _SOURCE_NAME for tracer/meter naming if desired).
| # Create aiohttp app with tracing middleware. | ||
| # Equivalent to AddAspNetCoreInstrumentation() + health-check filter in C#. | ||
| app = Application(middlewares=[create_aiohttp_tracing_middleware(), jwt_authorization_middleware]) | ||
|
|
There was a problem hiding this comment.
configure_opentelemetry() enables AioHttpServerInstrumentor() and app.py also adds a custom tracing middleware that creates spans per request. This will likely produce duplicate/nested server spans for each inbound request. Prefer either the aiohttp-server instrumentor (configure its excluded URLs) or the custom middleware, but not both.
| # Global token cache for Agent 365 Observability exporter | ||
| _agentic_token_cache = {} |
There was a problem hiding this comment.
The global _agentic_token_cache dict is unbounded and has no eviction/TTL, so a long-running process that sees many (tenant_id, agent_id) pairs can grow memory indefinitely. Consider using a bounded/TTL cache (or clearing entries when tokens expire).
| #TODO: fix tenant id resolution. | ||
| tenant_id = ( | ||
| getattr(conversation, "tenant_id", None) | ||
| or getattr(recipient, "tenant_id", None) | ||
| or _EMPTY_GUID |
There was a problem hiding this comment.
Tenant ID resolution is marked TODO, but the SDK already provides activity.get_agentic_tenant_id() which resolves tenant IDs for agentic requests (recipient vs conversation). Prefer the helper to avoid caching tokens under the wrong tenant key.
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST=".*" | ||
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE=".*" | ||
|
|
||
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST=".*" | ||
| OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE=".*" |
There was a problem hiding this comment.
The default OTEL header capture settings here (".*" for request/response headers) can export secrets (Authorization, cookies, etc.) to your collector/Azure Monitor. Consider restricting captured headers to a small allowlist (or removing these lines from the example).
| self.openai_client = AzureOpenAI( | ||
| azure_endpoint=endpoint, | ||
| azure_ad_token_provider=DefaultAzureCredential(), | ||
| api_version=api_version, | ||
| ) |
There was a problem hiding this comment.
azure_ad_token_provider is being set to a DefaultAzureCredential() instance, but the OpenAI Python SDK expects a token provider callable (typically created via azure.identity.get_bearer_token_provider). Passing the credential object directly will fail at runtime when the client tries to call it to obtain a token. Create a bearer token provider for the https://cognitiveservices.azure.com/.default scope (or use api_key auth only) and pass that callable instead.
| func() | ||
| span.set_status(StatusCode.OK) | ||
| except Exception as ex: |
There was a problem hiding this comment.
Span.set_status(...) is being called with StatusCode rather than a Status instance (as expected by the OpenTelemetry Python API). This can break status setting at runtime; construct a Status object when setting span status.
| # Agent ID — use recipient identity (the agent/bot that received the turn) | ||
| recipient = getattr(activity, "recipient", None) | ||
| agent_id = str(getattr(recipient, "id", None) or _EMPTY_GUID) | ||
|
|
There was a problem hiding this comment.
agent_id is derived from activity.recipient.id, but AccessTokenProviderBase.get_agentic_application_token(tenant_id, agent_app_instance_id) expects an agent app instance id. For agentic requests the SDK exposes this as activity.get_agentic_instance_id() / recipient.agentic_app_id; using those will make observability token caching work reliably.
| span.set_status(StatusCode.OK) | ||
| return response | ||
| except Exception as ex: | ||
| span.set_status(StatusCode.ERROR, str(ex)) | ||
| span.record_exception(ex) |
There was a problem hiding this comment.
span.set_status(StatusCode.OK) / span.set_status(StatusCode.ERROR, ...) uses StatusCode directly, but OpenTelemetry expects a Status object. Construct a Status instance when setting span status to avoid runtime issues.
This pull request introduces a complete Python sample for a Weather Agent built with the Microsoft 365 Agents SDK. The sample demonstrates how to integrate Azure OpenAI, OpenWeatherMap, and OpenTelemetry (including Aspire dashboard support) in an aiohttp-based agent. Major changes include the addition of agent logic, supporting tools, telemetry integration, configuration files, and comprehensive documentation.
Agent Implementation and Application Setup
WeatherAgentclass inagents/weather_agent.py, implementing message handling, Azure OpenAI orchestration, tool-calling logic for weather and date/time lookups, and conversation history management.app.pyas the aiohttp entry point, wiring together the agent, Microsoft 365 Agents SDK, authentication, and OpenTelemetry setup, including health endpoints and middleware.agents/__init__.pyto expose theWeatherAgentfor import.Telemetry and Observability Integration
telemetrypackage (telemetry/__init__.py) with helpers for OpenTelemetry configuration, metrics, tracing, and integration with the A365 observability stack, enabling out-of-the-box telemetry export to Aspire or Azure Monitor.Configuration and Dependencies
.env.examplewith sample environment variables for Azure OpenAI, OpenWeatherMap, M365 authentication, and OpenTelemetry/Aspire dashboard integration.requirements.txtlisting all necessary dependencies, including Microsoft Agents SDK components, OpenAI, Azure Identity, aiohttp, OpenTelemetry, and weather API clients.Documentation
README.mdwith setup instructions, prerequisites, configuration guidance, project structure, and further reading for users to get started with the sample agent.