docs: correct dynamic batching visualization and timings #153

Open
Lazytoucan wants to merge 1 commit into triton-inference-server:main from Lazytoucan:docs/fix_dynamic_batching_readme
Lazytoucan commented Jun 27, 2026

Description

This PR corrects the visualization and explanation of the Dynamic Batching behavior in the tutorial.

Previously, the diagrams and text incorrectly implied that the batcher waits for the full delay timeout even after max_batch_size has been reached. I have updated the diagrams to show that inference execution triggers earlier: immediately when the queue reaches max_batch_size.

The text and the specific timing examples in the README have been adjusted to match the new diagrams.
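
To make the corrected rule concrete, here is a minimal Python sketch of the dispatch logic described above. It is a simplified model and not Triton's actual scheduler: the function name `dispatch_times` and its parameters are hypothetical, time units are arbitrary, and model execution time and instance availability are ignored. In a real model configuration the delay corresponds to `max_queue_delay_microseconds` under `dynamic_batching`.

```python
def dispatch_times(arrivals, max_batch_size, max_queue_delay):
    """Return (dispatch_time, batch) pairs for a simplified dynamic batcher.

    Hypothetical helper for illustration only. A batch is dispatched as
    soon as it holds max_batch_size requests, or when its oldest request
    has waited max_queue_delay, whichever happens first.
    """
    batches = []
    queue = []  # arrival times of requests waiting in the current batch
    for t in sorted(arrivals):
        # The oldest request's deadline passed before this arrival:
        # the pending batch was already dispatched at that deadline.
        if queue and t > queue[0] + max_queue_delay:
            batches.append((queue[0] + max_queue_delay, queue))
            queue = []
        queue.append(t)
        # Batch is full: dispatch immediately, without waiting for the delay.
        if len(queue) == max_batch_size:
            batches.append((t, queue))
            queue = []
    # Whatever is left is dispatched when its oldest request times out.
    if queue:
        batches.append((queue[0] + max_queue_delay, queue))
    return batches


# Four requests fill the batch at t=3, so it runs at t=3 rather than t=5.
# The lone request at t=10 waits the full delay and runs at t=15.
print(dispatch_times([0, 1, 2, 3, 10], max_batch_size=4, max_queue_delay=5))
```

Under these assumptions, the first batch starts at the arrival of the fourth request, which is the behavior the updated diagrams show; only an under-filled batch waits for the full delay.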

Changes made

  • Replaced dynamic batching diagram PNGs with corrected versions.
  • Fixed specific time references in Conceptual_Guide/Part_2-improving_resource_utilization/README.md.

Visual Changes

1. Dynamic Batching scheme (dynamic_batching.png)
dynamic_batching

2. Multi-Instance scheme (multi_instance.png)
multi_instance

Type of change

  • Bug fix
  • New feature
  • Documentation update (Images & Text)

Updated diagram PNGs to accurately reflect dynamic batching
behavior. Inference triggers immediately upon reaching
'max_batch_size', rather than waiting for the timeout.
Adjusted launch times in the README to match the corrected
visual flow.