Skip to content

fix(dynamic-instrumentation): bind FastAPI sub-dependencies, Starlette/aiohttp & functools.partial handlers, and surface non-bindable line breakpoints#787

Open
syed-ahsan-ishtiaque wants to merge 4 commits into
aws-observability:mainfrom
syed-ahsan-ishtiaque:fix/di-framework-dispatch-and-partial
Open

fix(dynamic-instrumentation): bind FastAPI sub-dependencies, Starlette/aiohttp & functools.partial handlers, and surface non-bindable line breakpoints#787
syed-ahsan-ishtiaque wants to merge 4 commits into
aws-observability:mainfrom
syed-ahsan-ishtiaque:fix/di-framework-dispatch-and-partial

Conversation

@syed-ahsan-ishtiaque

@syed-ahsan-ishtiaque syed-ahsan-ishtiaque commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Summary

Four independent Dynamic Instrumentation correctness fixes in the Python distro's debugger. Each closes a silent failure: a breakpoint that reported READY (or a status that looked healthy) but never actually fired, because the wrapper/patcher never reached the live callable — or, in the last case, a target that can never bind yet never surfaced an error. They are bundled here because they touch the same subsystem (_function_wrapper.py dispatch + the manager/status-reporter status path); each is additive and leaves the existing Flask/Django/FastAPI patch paths intact.

1. FastAPI Depends() sub-dependency handlers silently never fire

A function used as a FastAPI sub-dependency (Depends(get_thing)) is captured at import time inside the route's pre-built Dependant tree (dependant.dependencies[*].call). The per-request path invokes those call references, not the module-level name, so a function-level breakpoint on a sub-dependency reported READY but never fired (line-level worked).

Fix: _rebind_dependant walks the Dependant tree and _rebind_route rebinds every matching call (and the route's own endpoint/dependant.call) to the instrumented wrapper. Matching is identity-then-name+module (_matches/_is_identity), mirroring the existing patchers.

2. Route handlers of unsupported frameworks (Starlette, aiohttp) silently never fire

_patch_framework_references dispatched to only Flask, Django, and FastAPI. A handler on any other framework was never rebound, so a function-level breakpoint on it silently never fired — reproduced on Starlette and aiohttp.

  • _patch_starlette_routes — a pure-Starlette app builds route.app = request_response(endpoint) at import time and the per-request path calls route.app; the original handler lives in a closure that setattr/endpoint-rebind cannot reach. The patcher rebuilds route.app from the wrapper.
    • Detection is structural (_starlette_app_wraps_endpoint inspects the closure cells for the captured endpoint, handling both the async direct-capture and the sync functools.partial(run_in_threadpool, endpoint) forms) rather than matching on an internal function name, so a future Starlette rename won't silently misclassify routes.
    • FastAPI subclasses Starlette and is handled by case Add repository owners and component level owners. #1, so this patcher explicitly skips FastAPI instances (no double-patch).
    • Routes with per-route middleware are left untouched (rebuilding route.app would drop the middleware) — a deliberate, logged safe no-op. Mount/sub-routers are descended (_walk, depth-capped).
  • _patch_aiohttp_routes — matches on the public route.handler and rebinds the backing _handler (no public setter), warning if the internal attribute is ever absent.

3. functools.partial (any callable whose runtime name ≠ configured name) silently never fires

The wrapper computed its _active_functions lookup key from the target's runtime __qualname__/__name__. A functools.partial has neither, so the key became "<module>.<anonymous>" and missed the set the manager registered under the configured key "<module>.<function_name>". The fix threads the configured function_name into the wrapper and keys off it (falling back to runtime introspection only when not provided), aligning the wrapper's lookup with the manager's registration key.

4. Line breakpoint on a target with no code object reported READY instead of ERROR

A line-level breakpoint whose target resolves to a callable with no __code__ (a callable-class instance, or a functools.partial) cannot arm the line engine. Previously this was skipped silently and the config still reported READY — indistinguishable from "applied, waiting for traffic", though it could never fire.

Fix: the manager now reports a non-silent LINE_NOT_EXECUTABLE error for the line-level config(s) and records them as failed. The per-config READY report in apply_configuration and the periodic status sweep both skip already-failed configs, so the ERROR is not overwritten by a later READY.

Testing

  • Unit: new test_function_wrapper_starlette.py and test_function_wrapper_aiohttp.py (identity match, name+module fallback, the FastAPI-skip guard, the middleware safe-skip, structural-detection contract); new functools.partial test in test_function_wrapper.py; FastAPI sub-dependency rebind tests in test_function_wrapper_fastapi.py; codeless-line tests in test_instrumentation_manager.py (unit, plus an end-to-end apply_configuration test asserting READY does not overwrite ERROR) and test_status_reporter_logic.py (periodic sweep suppresses READY for a failed config). All are regression guards (verified failing without the fix). Updated pre-existing wrapper tests that keyed breakpoint sets by the runtime __qualname__ to use the configured key the manager actually registers (module.function_name).
  • Contract: new di-starlette app + starlette_test.py prove a Starlette route handler fires end-to-end; di-fastapi gains a functools.partial target + test.
  • Result: full debugger unit suite 579 passed, 25 skipped, 0 failed. The framework fixes were additionally verified end-to-end against real Starlette / aiohttp / partial targets. black/isort/flake8 clean; pylint 10.00/10 on the source.

No change to the existing Flask/Django/FastAPI-route patch paths (their tests are unaffected).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@syed-ahsan-ishtiaque syed-ahsan-ishtiaque requested a review from a team as a code owner June 18, 2026 06:27
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Claude Code Review

Reviewing PR…

  • Read prior review comments to dedupe
  • Read full PR diff
  • Inspect critical files (wrapper, patchers, tests)
  • Post findings

View job run

…handlers (Starlette/aiohttp) and functools.partial targets

Two related silent-failure fixes in the DI function wrapper. In both, a
function-level breakpoint reported READY but never fired because the
wrapper/patcher never reached the live callable.

Unsupported-framework route handlers (Starlette, aiohttp)
  _patch_framework_references dispatched to only Flask/Django/FastAPI, so a
  route handler on any other web framework was never rebound and the breakpoint
  silently never fired. Adds two additive patchers:
    * _patch_starlette_routes: pure-Starlette apps build
      `route.app = request_response(endpoint)` at import time (the per-request
      path invokes route.app, not route.endpoint), so the handler is captured in
      a closure that setattr/endpoint-rebind cannot reach. The patcher rebuilds
      route.app from the wrapper. FastAPI subclasses Starlette and is handled by
      _patch_fastapi_routes, so this patcher explicitly SKIPS FastAPI instances
      (no double-patch). Routes carrying per-route middleware are left untouched
      (rebuilding route.app would drop the middleware) -- a safe no-op.
    * _patch_aiohttp_routes: rebinds the stored handler on aiohttp routes.
  Both match by identity then name+module (mirroring the existing patchers) and
  run in their own try/except in the dispatcher so a failure is isolated.

functools.partial (and other callables whose runtime name != configured name)
  The wrapper computed its _active_functions lookup key from the target's runtime
  __qualname__/__name__. A functools.partial has neither, so the key resolved to
  "<module>.<anonymous>" and missed the set the manager registered under the
  CONFIGURED key "<module>.<function_name>" -> the wrapper silently never fired.
  The fix threads the configured function_name into the wrapper and keys off it
  (falling back to runtime introspection only when not provided), aligning the
  wrapper's lookup with the manager's registration key. Generalizes to any
  callable whose runtime name differs from the configured name.

Tests
  * Unit: new test_function_wrapper_starlette.py (5) and
    test_function_wrapper_aiohttp.py (4) cover identity match, name+module
    fallback, the FastAPI-skip guard, the middleware safe-skip, and no-op safety;
    new partial regression test in test_function_wrapper.py. All verified failing
    without the fix.
  * Updated existing wrapper tests that registered breakpoint sets under the
    runtime __qualname__ to use the configured key the manager actually uses
    (module.function_name) -- these previously only coincidentally matched.
  * Contract: new di-starlette app + starlette_test.py prove a Starlette route
    handler fires end-to-end; di-fastapi gains a functools.partial target + test.
  Full debugger unit suite: 557 passed, 43 skipped. black/isort/flake8 clean;
  pylint 10.00/10 on the source.

No change to the Flask/Django/FastAPI patch paths, so existing behavior is preserved.
@syed-ahsan-ishtiaque syed-ahsan-ishtiaque force-pushed the fix/di-framework-dispatch-and-partial branch from 903db23 to af3b772 Compare June 18, 2026 06:31
@syed-ahsan-ishtiaque syed-ahsan-ishtiaque added the skip changelog doesn't need a CHANGELOG entry label Jun 18, 2026
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Claude finished @syed-ahsan-ishtiaque's task in 15m 45s —— View job


Claude Code Review

  • Read prior review comments to dedupe
  • Read full PR diff
  • Inspect critical files (wrapper, patchers, tests)
  • Post findings

The two fixes are well-scoped and the regression-guard tests are genuine. A few substantive concerns I'd ask the author to address before merge — none are blockers but a couple have non-trivial silent-failure risk that's worth pinning.


1. Starlette is_plain_request_response check is fragile (medium concern)

_function_wrapper.py:1424-1434 detects "plain request_response closure" by:

is_plain_request_response = (
    getattr(current_app, "__module__", None) == "starlette.routing"
    and getattr(current_app, "__qualname__", "").startswith("request_response")
)

This depends on the inner closure inside starlette.routing.request_response keeping that exact __qualname__ prefix. Starlette has historically changed this — request_response has been refactored a few times across versions, and at one point the inner function was renamed to app. If Starlette renames or restructures the inner function (or wraps the result in a class instance), every plain Starlette route would be incorrectly classified as "non-plain" and silently skipped — re-introducing exactly the silent-never-fires bug this PR is fixing, just for a future Starlette release.

Two suggestions, in order of preference:

(a) Explicit version pin + a unit test that asserts is_plain_request_response is True for request_response(handler) against the actually-installed Starlette. The existing happy-path test (test_patch_starlette_route_rebuilds_app_and_fires) demonstrates rebuild works, but it does not assert the qualname-detection branch is the one taken — if is_plain_request_response ever returned False, that test would still pass via the early assertIs(route.endpoint, wrapper) check (since you set route.endpoint = new_func before the current_app check)... wait, actually the order is if not is_plain_request_response: continue before the endpoint = new_func write, so the test would fail. Good. But the test silently becomes a tautology if the structural assumption changes — consider making the detection logic more robust per (b).

(b) Detect by structural test instead of name — e.g., probe whether current_app is a function defined in starlette.routing whose closure references the original endpoint:

is_plain_request_response = (
    callable(current_app)
    and getattr(current_app, "__module__", None) == "starlette.routing"
    and any(c.cell_contents is original_func for c in (getattr(current_app, "__closure__", None) or ()))
)

That checks the actual property you care about (the closure-captured original handler) rather than a string prefix.

(c) Always log at WARN (not DEBUG) when this branch skips a matched route. The PR comment says it's "a deliberate safe no-op" but in practice, it's also the failure mode if the Starlette internals change — and DEBUG logs are off in production, so the silent-never-fires is invisible to operators.

logger.warning(
    "Starlette route '%s' has a non-plain app (likely per-route middleware or a new "
    "Starlette internal); DI will not instrument this handler. If you did not configure "
    "per-route middleware, please file an issue.",
    getattr(route, "path", "?"),
)

Fix this →


2. aiohttp _handler private-attribute hopping is similarly fragile

_function_wrapper.py:1522-1524:

if hasattr(route, "_handler"):
    route._handler = new_func

_handler is private. Across aiohttp 3.x minor versions the route classes have varied — ResourceRoute, SystemRoute, plain RouteDef — and not all consistently expose _handler. The hasattr guard means a future aiohttp release can rename this and the patch silently no-ops for every route, with no warning. Same recommendation: log at WARN when the public route.handler property matches but _handler is missing, so the regression is observable instead of silent.

Also, the comment on line 1494 says "accessed via the public route.handler property" but you actually write to _handler directly. If aiohttp ever provides a setter on the public property, prefer it. As-is, the asymmetry (read public, write private) is worth a one-line WHY comment.


3. Snapshot metadata now uses configured name instead of runtime qualname (subtle behavior change)

In _create_sync_wrapper / _create_async_wrapper, qualified_name is used for both _active_functions lookup AND passed to _build_snapshot as snapshot metadata. The fix is correct for the lookup, but it also changes the snapshot's qualified_name field for any case where the configured name differs from runtime introspection.

Concretely, this shouldn't affect Flask/Django/FastAPI route handlers (configured name == __qualname__ for module-level functions), but for class methods configured as "ClassName.method", __qualname__ would already match. So the only behavior change is that partials and other anonymous callables now report the configured name in the snapshot instead of "<anonymous>" — which is a fix, not a regression. Worth mentioning in the PR description; downstream consumers grepping for "<anonymous>" (unlikely, but possible) would see the change.


4. Minor: mid-file import functools in di_fastapi_server.py

contract-tests/images/applications/di-fastapi/di_fastapi_server.py:108import functools is placed mid-file with # noqa: E402. Move it to the top of the file with the other imports; the # noqa is a smell that's avoidable here.


5. Test: test_patch_starlette_route_with_middleware_is_skipped doesn't assert WHY

The test confirms route.app is unchanged but doesn't directly verify the skip is via the qualname check (vs. the route not being matched at all). A stronger assertion would be that _matches(route.endpoint) returns True (so the route IS matched) but the rebuild is skipped. This pins the contract: matched-but-middleware = safe skip. As written, if the matching logic changes, the test would still pass.


Otherwise, the architecture (per-framework patchers in their own try/except in the dispatcher, identity-then-name+module fallback mirroring Flask/FastAPI/Django, FastAPI explicit-skip in the Starlette patcher to prevent double-patching, depth limit on Mount recursion) is consistent with the existing patchers and the test coverage including the genuine end-to-end ASGI drive in test_patch_starlette_route_rebuilds_app_and_fires is solid.
| Branch: fix/di-framework-dispatch-and-partial

…per review

Addresses automated review feedback on the framework-dispatch patchers.

* Starlette plain-app detection is now structural instead of name-based.
  Previously it recognized the "plain request_response closure" by checking
  current_app.__qualname__.startswith("request_response"); a future Starlette
  rename would misclassify every plain route as middleware-wrapped and silently
  skip it, re-introducing the never-fires bug. New _starlette_app_wraps_endpoint
  helper inspects route.app's closure cells for the route's live function
  (captured directly for async endpoints, inside a functools.partial for sync
  endpoints wrapped via run_in_threadpool). Accepts either the matched endpoint
  or original_func to keep the name+module fallback path working. The skip log
  is upgraded from debug to warning so a future internals change is observable
  rather than silent.

* aiohttp: warn (instead of silently no-op'ing) when a route matches by the
  public route.handler property but has no writable _handler attribute, so a
  future aiohttp rename surfaces. Added a comment explaining the read-public /
  write-private asymmetry and reconciled the docstring.

* di-fastapi contract app: moved `import functools` to the top import block and
  dropped the `# noqa: E402`.

* Tests: new test_starlette_app_wraps_endpoint_detection covers async/sync/
  middleware/non-closure inputs; the middleware-skip test now asserts the route
  IS matched (pinning "matched-but-middleware -> safe skip" rather than passing
  via no-match).

Full debugger unit suite: 558 passed, 43 skipped. black/isort/flake8 clean;
pylint 10.00/10. Async and sync Starlette handlers re-verified firing in-process.
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Claude finished @syed-ahsan-ishtiaque's task in 7m 43s —— View job


Claude Code Review

  • Read prior review comments to dedupe
  • Read full PR diff
  • Inspect critical files (wrapper, patchers, tests)
  • Post findings

The follow-up commit (48e319f) substantively addresses every concern raised in the prior review round. No new substantive findings to add.

Verified addressed from the prior review

  1. Starlette detection is now structural, not name-based_starlette_app_wraps_endpoint (_function_wrapper.py:1458-1483) walks route.app.__closure__ cells and recognises both async (endpoint captured directly) and sync (endpoint inside a functools.partial for run_in_threadpool). A future Starlette rename of the inner closure won't silently classify every plain route as "non-plain". The new test_starlette_app_wraps_endpoint_detection pins this contract independent of any closure function name.
  2. Skip path now WARNs, not DEBUGs_function_wrapper.py:1438-1443 (Starlette) and :1571-1575 (aiohttp _handler missing). If the structural detection ever rejects a real plain route (Starlette internal change) or aiohttp renames _handler, the regression becomes observable instead of silent.
  3. aiohttp public/private asymmetry documented_function_wrapper.py:1560-1565 explicitly explains why the match reads route.handler (public) but the rebind writes _handler (no public setter), with a forward-compat WARN.
  4. test_patch_starlette_route_with_middleware_is_skipped now pins matched-but-skipped — asserts route.endpoint is handler AND _starlette_app_wraps_endpoint(route.app, handler) is False (the test file at lines 918-922 in the diff), so the test exercises the safe-skip path rather than passing because the route simply didn't match.
  5. import functools moved to top of di_fastapi_server.py — line 14, no # noqa: E402 for it.

Observations (not blocking, not requiring changes)

  • The remaining mid-file from mock_di_api import … # noqa: E402 is structural (the mock API must be configured before set_breakpoint_configs(...) runs at import time) and is consistent with di-starlette/di_starlette_server.py — fine as-is.
  • mock_di_api.py in di-starlette/ is a verbatim copy of di-fastapi/mock_di_api.py save for the docstring still saying "alongside the Flask app". Cosmetic; no behavioral concern.
  • Snapshot metadata qualified_name change for partials (configured name vs. "<anonymous>") is a fix, not a regression — already noted in the prior review.

Architecture (per-framework patchers in their own try/except in the dispatcher, identity-then-name+module fallback mirroring Flask/FastAPI/Django, FastAPI explicit-skip in the Starlette patcher, depth limit on Mount recursion, structural plain-app detection, end-to-end ASGI drive in the happy-path test) remains consistent with the existing patchers. Approving in principle — nothing further from me.
| Branch: fix/di-framework-dispatch-and-partial

…rgets without a code object

A line-level breakpoint whose target resolves to a callable with no __code__
object (for example a callable-class instance or a functools.partial used as
the handler) cannot arm the line engine. Previously this was skipped silently
and the configuration still reported READY, which is indistinguishable from
"applied, waiting for traffic" — the breakpoint could never fire.

The manager now reports a non-silent LINE_NOT_EXECUTABLE error for the
line-level configuration(s) in that case and records them as failed. The
per-config READY report in apply_configuration and the periodic status sweep
both skip configurations already marked failed, so the ERROR status is not
subsequently overwritten by a misleading READY.

Tests: added a unit test for the codeless-target branch, an end-to-end test
through apply_configuration asserting READY does not overwrite the ERROR, and a
status-reporter test asserting the periodic sweep suppresses READY for a failed
config.
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Claude finished @syed-ahsan-ishtiaque's task in 6m 31s —— View job


Claude Code Review

  • Read prior review comments to dedupe
  • Inspect new commit (597b77c) since last review
  • Post findings

The new commit (597b77c) is a solid follow-up that closes a separate silent-failure mode adjacent to this PR's primary fix: a line-level breakpoint whose target resolves to a callable with no __code__ (callable-class instance, functools.partial) used to report READY indefinitely. It now reports LINE_NOT_EXECUTABLE ERROR and the failed config is recorded so subsequent READY reports cannot overwrite it.

Verified correctness

  • Branch placement (instrumentation_manager.py:454-476) — the new elif self._engine and bp_set.line_numbers and bp_set.code_object is None: correctly sits after the success path and the no-engine path; only line-level configs (line_num > 0) are marked failed, so a co-located function-level (line 0) breakpoint is unaffected. Matches the PR description.
  • Correct identifier flow_failed_configs keys by config_id; BreakpointState.location_hash is set to the same config_id (:490); the status reporter's filter (_status_reporter.py:153, 197) extracts dict keys, so the match is identifier-correct regardless of the enum vs. string value type stored.
  • READY suppression covers all pathsapply_configuration skips READY for failed configs in both the update path (:815-819) and the new-function path (:829-834); the status reporter's initial sweep also skips them (:197). The periodic sweep is naturally unaffected because the READY block there is gated by is_initial_report.
  • Stale cleanup (:766-769) drops _failed_configs entries when a config disappears from the incoming list; the update path additionally clears entries before re-applying (:794-795), so a user fixing the misconfiguration recovers cleanly.
  • No log spam — after the first ERROR, subsequent polls hit _has_changed → False (the configs haven't changed) and short-circuit before re-applying, so ERROR is emitted once per misconfiguration rather than per poll cycle.

Tests

The three new tests cover the right contract surface:

  • test_line_bp_on_codeless_target_reports_line_not_executable — unit-level: confirms the engine is never armed, the ERROR is reported with LINE_NOT_EXECUTABLE, and _failed_configs is populated.
  • test_apply_configurations_does_not_overwrite_error_with_ready — end-to-end: confirms READY is not subsequently emitted for the failed config (the original silent-failure mode).
  • test_initial_report_skips_ready_for_failed_config — pins the status-reporter side of the contract.

Minor (non-blocking) observation

InstrumentationManager.__init__ annotates self._failed_configs: Dict[str, str] = {} (:75) but the values stored are ErrorCause enum instances, both in the new path (:464) and the existing exception path (:850). This is a pre-existing annotation inaccuracy, not introduced here, and has no runtime effect since the reporter only consumes keys. Worth a one-line follow-up tightening Dict[str, str]Dict[str, ErrorCause] whenever this file is next touched.

Architecture remains consistent (per-framework patchers in their own try/except, structural detection for Starlette plain-app, WARN logs on observability-critical skips, identity-then-name+module fallback). Nothing further from me.

Branch: fix/di-framework-dispatch-and-partial |

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Claude finished @syed-ahsan-ishtiaque's task in 50s —— View job


Claude Code Review

  • Read prior review comments to dedupe
  • Read full PR diff
  • Inspect new commit (302dc20)
  • Post findings

The new commit (302dc20) is a one-line annotation tightening of _failed_configs from Dict[str, str]Dict[str, ErrorCause] — directly addresses the non-blocking observation flagged in the prior review round. ErrorCause is already imported (instrumentation_manager.py:27), the runtime behavior is unchanged, and the new annotation now matches what's actually stored at the two assignment sites (:464, :850).

Nothing further from me. Approving in principle — all prior review feedback (Starlette structural detection, WARN logs, aiohttp asymmetry comment, middleware-skip test pinning, import functools at the top of di_fastapi_server.py, line-breakpoint-on-codeless-target ERROR reporting, and now this annotation tightening) has been addressed across 48e319f, 597b77c, and 302dc20.
| Branch: fix/di-framework-dispatch-and-partial

@syed-ahsan-ishtiaque syed-ahsan-ishtiaque changed the title fix(dynamic-instrumentation): instrument Starlette/aiohttp route handlers and functools.partial targets fix(dynamic-instrumentation): bind FastAPI sub-dependencies, Starlette/aiohttp & functools.partial handlers, and surface non-bindable line breakpoints Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip changelog doesn't need a CHANGELOG entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant