Vast Serverless relies on benchmark testing to determine the most cost-effective GPU when scaling up (which workers to recruit), routing requests (which workers have available capacity), and scaling down (which workers to release). This benchmark is part of the PyWorker configuration within the SDK and is an integral component of how Vast Serverless operates.
How Benchmark Testing Works
When a new Workergroup is created, the serverless engine enters a learning phase. During this phase, it recruits a variety of machine types from those specified in search_params. Each new worker runs the user-configured benchmark and reports its measured performance to the serverless engine.
As traffic scales up and down, the serverless engine builds an application-specific understanding of cost vs. performance, which it then uses to make informed decisions about future worker recruitment and release.
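To make the cost vs. performance idea concrete, the following is a minimal Python sketch of how benchmark reports could be turned into a ranking of machine types. It is an illustration only, not the serverless engine's actual algorithm or any part of the Vast SDK: the BenchmarkReport class, its fields, the perf-per-dollar metric, and the machine labels and prices are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkReport:
    """Hypothetical record of one worker's benchmark result (not a Vast SDK type)."""
    machine_type: str        # illustrative label for a machine or GPU type
    perf: float              # assumed benchmark throughput, in workload units per second
    dollars_per_hour: float  # assumed rental price of the worker


def perf_per_dollar(report: BenchmarkReport) -> float:
    """Workload units completed per dollar spent on this worker."""
    return report.perf * 3600.0 / report.dollars_per_hour


def rank_by_cost_effectiveness(reports):
    """Average perf-per-dollar for each machine type, most cost-effective first."""
    totals = {}
    for r in reports:
        total, count = totals.get(r.machine_type, (0.0, 0))
        totals[r.machine_type] = (total + perf_per_dollar(r), count + 1)
    averages = {m: total / count for m, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up reports from three workers during a learning phase.
reports = [
    BenchmarkReport("gpu-a", perf=100.0, dollars_per_hour=0.50),
    BenchmarkReport("gpu-a", perf=120.0, dollars_per_hour=0.50),
    BenchmarkReport("gpu-b", perf=300.0, dollars_per_hour=2.00),
]
ranking = rank_by_cost_effectiveness(reports)
print(ranking)
```

In this made-up data, the slower but cheaper "gpu-a" ranks above the faster "gpu-b" because it completes more work per dollar. That is the kind of trade-off an application-specific benchmark is meant to surface.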