Vast Serverless is an AI infrastructure platform that lets you run compute-intensive workloads without managing GPUs. You pay for execution rather than for GPU rental time. It is best suited to bursty workloads such as on-demand inference, batch jobs, and other usage patterns with variable or unpredictable demand. You interact with Vast Serverless through a Python SDK. In addition to the standard benefits of serverless infrastructure, Vast Serverless reduces cost further by benchmarking your workload and using the most cost-efficient GPUs in Vast’s marketplace. This enables better, more cost-effective scaling, but each newly created endpoint needs an evaluation period during which its workload is benchmarked against different GPU classes.
Unique Features
- Benchmark-driven scaling: Automatic identification and recruitment of the best price-performance GPU to scale your unique workload.
- One endpoint, mixed hardware: Automatically leverage Vast’s wide fleet of GPUs (from consumer-grade to the highest-end GPUs) to serve your needs, with a minimum of overhead.
- Fine-grained control and transparency: Precise configurability and observability give you detailed control over your infrastructure.
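
To make the idea of benchmark-driven scaling more concrete, the sketch below ranks GPU classes by measured throughput per dollar. It is only an illustration of the general price-performance idea, not Vast's actual selection logic or SDK: the `BenchmarkResult` type, the `rank_by_price_performance` function, the GPU labels, and every number are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark measurement for a GPU class (all fields hypothetical)."""
    gpu_class: str               # placeholder label, not a real marketplace name
    requests_per_second: float   # throughput measured for the workload
    dollars_per_hour: float      # price of running on this GPU class

    @property
    def requests_per_dollar(self) -> float:
        # Requests served in one hour divided by the cost of that hour.
        return self.requests_per_second * 3600 / self.dollars_per_hour


def rank_by_price_performance(results):
    """Return results sorted from best to worst requests-per-dollar."""
    return sorted(results, key=lambda r: r.requests_per_dollar, reverse=True)


# Made-up numbers, purely for illustration.
results = [
    BenchmarkResult("gpu-a", requests_per_second=10.0, dollars_per_hour=0.50),
    BenchmarkResult("gpu-b", requests_per_second=30.0, dollars_per_hour=2.00),
    BenchmarkResult("gpu-c", requests_per_second=18.0, dollars_per_hour=0.75),
]

ranked = rank_by_price_performance(results)
for r in ranked:
    print(f"{r.gpu_class}: {r.requests_per_dollar:.0f} requests per dollar")
```

In this made-up data the fastest GPU class (`gpu-b`) ranks last, because its higher price outweighs its higher throughput. That is why the ranking depends on benchmarking the specific workload rather than on raw GPU speed.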