Autoscaler
Overview
Vast.ai’s Autoscaler is a serverless system that automatically provisions GPU workers to match the changing computational needs of your workloads. It provides efficient, cost-effective scaling for AI inference and other GPU computing tasks.
- Dynamic Scaling: Automatically scale your AI inference up or down based on customizable performance metrics.
- Global GPU Fleet: Leverage Vast’s global fleet of powerful, affordable GPUs for your computational needs.
- Fast Cold-Start Times: Minimize cold-start times with a reserve pool of workers that can spin up in seconds.
- Metrics and Debugging: Access metrics and debugging tools for your serverless usage, including logs and Jupyter/SSH access.
- Performance Exploration: Explore performance in depth to tune your autoscaling against performance and price metrics.
- Custom Worker Types: Define custom worker types through CLI search filters and create commands, supporting multiple worker types per endpoint.
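
To make the idea of load-based scaling with a reserve pool more concrete, here is a minimal Python sketch. It is not Vast.ai's actual algorithm or API: the `ScalingPolicy` class, its fields, and the `desired_workers` function are hypothetical names invented for illustration. The sketch assumes load is measured in a single unit (for example, requests per second) and that every worker has the same capacity.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All fields are hypothetical illustration parameters, not Vast.ai settings.
    per_worker_capacity: float  # load one worker can serve (e.g. requests/s)
    target_utilization: float   # fraction of capacity to aim for, in (0, 1]
    reserve_workers: int        # extra warm workers kept to absorb spikes
    min_workers: int = 0
    max_workers: int = 100


def desired_workers(current_load: float, policy: ScalingPolicy) -> int:
    """Return how many workers to keep provisioned for the observed load."""
    # Capacity each worker contributes when run at the target utilization.
    usable = policy.per_worker_capacity * policy.target_utilization
    # Workers needed to serve the load itself, rounded up.
    active = math.ceil(current_load / usable) if current_load > 0 else 0
    # Add the reserve pool, then clamp to the configured bounds.
    total = active + policy.reserve_workers
    return max(policy.min_workers, min(policy.max_workers, total))


if __name__ == "__main__":
    policy = ScalingPolicy(per_worker_capacity=10.0,
                           target_utilization=0.8,
                           reserve_workers=2)
    for load in (0, 40, 41):
        print(load, desired_workers(load, policy))
```

In this sketch the reserve pool is simply a fixed number of extra workers added on top of what the current load requires, which is one simple way to trade idle cost against cold-start latency.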