#dynamic-scaling

Artificial intelligence
from InfoQ
4 days ago

NVIDIA Dynamo Planner Brings SLO-Driven Automation to Multi-Node LLM Inference

Automated resource planning and SLO-based dynamic scaling optimize GPU allocation for disaggregated LLM inference on Azure Kubernetes Service (AKS), improving throughput and operational efficiency.