Autoscaler

Overview

Use Vast.ai's autoscaling system for a serverless solution that automates the provisioning of GPU workers to match the dynamic computational needs of your workloads. This system ensures efficient and cost-effective scaling for AI inference and other GPU computing tasks.

Key features

Dynamic scaling: Automatically scale your AI inference up or down based on customizable performance metrics.
Global GPU fleet: Leverage Vast's global fleet of powerful, affordable GPUs for your computational needs.
Fast cold start times: Minimize cold start times with a reserve pool of workers that can spin up in seconds.
Metrics and debugging: Access ample metrics and debugging tools for your serverless usage, including logs and Jupyter/SSH access.
Performance exploration: Perform in-depth performance exploration to optimize your autoscaling based on performance and price metrics.
Custom worker types: Define custom worker types through CLI search filters and create commands, supporting multiple worker types per endpoint.
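
To make the ideas of dynamic scaling and a reserve pool more concrete, here is a minimal, generic sketch of how a metric-driven autoscaler might pick a worker count. It is an illustration only and does not describe Vast.ai's actual algorithm, API, or settings: the `ScalingPolicy` class, its fields, and the `target_worker_count` function are all hypothetical names invented for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All fields are hypothetical illustration parameters, not Vast.ai settings.
    target_utilization: float = 0.8  # fraction of per-worker capacity to aim for
    reserve_workers: int = 1         # spare workers kept ready to reduce cold starts
    min_workers: int = 0             # lower bound on the total worker count
    max_workers: int = 16            # upper bound on the total worker count


def target_worker_count(load: float, per_worker_capacity: float,
                        policy: ScalingPolicy) -> int:
    """Return how many workers to run for the observed load (e.g. requests/s)."""
    if per_worker_capacity <= 0:
        raise ValueError("per_worker_capacity must be positive")
    # Capacity each worker should carry if kept at the target utilization.
    usable = per_worker_capacity * policy.target_utilization
    # Workers needed to serve the current load, rounded up.
    needed = math.ceil(load / usable) if load > 0 else 0
    # Add the reserve pool, then clamp to the configured bounds.
    desired = needed + policy.reserve_workers
    return max(policy.min_workers, min(policy.max_workers, desired))
```

Under these assumptions, calling `target_worker_count(90, 25, ScalingPolicy())` sizes the pool for 90 requests/s with workers that each handle 25 requests/s, plus one reserve worker. A real system would typically also smooth the metric over time to avoid scaling up and down on short spikes.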