# Worker List
The /get_endpoint_workers/ and /get_autogroup_workers/ endpoints return a list of GPU instances under an {{endpoint}} and a {{worker group}}, respectively.

## https://run.vast.ai/get_endpoint_workers/

### Inputs

- **id** (int): The id value of the endpoint.
- **api_key** (string): The Vast API key associated with the account that controls the endpoint. The API key can alternatively be provided in the request header as a Bearer token.

```json
{
    "id": 123,
    "api_key": "$API_KEY"
}
```

### Outputs

For each GPU instance in the endpoint, the following fields are returned:

- **cur_load** (float): Current load (as defined by the {{pyworker}}) that the GPU instance is receiving per second.
- **cur_load_rolling_avg** (float): Rolling average of cur_load.
- **cur_perf** (float): The most recent or current operational performance level of the instance (as defined by the PyWorker). For example, a text generation model measures this in tokens generated per second.
- **disk_usage** (float): Storage used by the instance (in GB).
- **dlperf** (float): Measured DLPerf of the instance. DLPerf is explained here.
- **id** (int): Instance id.
- **loaded_at** (float): Unix epoch time at which the instance finished loading.
- **measured_perf** (float): Benchmarked performance (tokens/s). Set to dlperf if the instance has not been benchmarked.
- **perf** (float): measured_perf × reliability.
- **reliability** (float): Uptime of the instance, ranging from 0 to 1.
- **reqs_working** (int): Number of active requests currently being processed by the instance.
- **status** (string): Current status of the worker.

```json
{
    "cur_load": 150,
    "cur_load_rolling_avg": 50,
    "cur_perf": 80,
    "disk_usage": 30,
    "dlperf": 105.87206734930771,
    "id": 123456,
    "loaded_at": 1724275993.997,
    "measured_perf": 105.87206734930771,
    "perf": 100.5784639818423245,
    "reliability": 0.95,
    "reqs_working": 2,
    "status": "running"
}
```

### Example

Run the following bash command in a terminal to list the endpoint's workers:

```bash
curl https://run.vast.ai/get_endpoint_workers/ \
  -X POST \
  -d '{"id": 123, "api_key": "API_KEY_HERE"}' \
  -H 'Content-Type: application/json'
```

## https://run.vast.ai/get_autogroup_workers/

### Inputs

- **id** (int): The id value of the worker group.
- **api_key** (string): The Vast API key associated with the account that controls the endpoint. The API key can alternatively be provided in the request header as a Bearer token.

```json
{
    "id": 1001,
    "api_key": "$API_KEY"
}
```

### Outputs

For each GPU instance in the worker group, the response contains the same fields described above for /get_endpoint_workers/, for example:

```json
{
    "cur_load": 150,
    "cur_load_rolling_avg": 50,
    "cur_perf": 80,
    "disk_usage": 30,
    "dlperf": 105.87206734930771,
    "id": 123456,
    "loaded_at": 1724275993.997,
    "measured_perf": 105.87206734930771,
    "perf": 100.5784639818423245,
    "reliability": 0.95,
    "reqs_working": 2,
    "status": "running"
}
```

### Example

Run the following bash command in a terminal to list the worker group's workers:

```bash
curl https://run.vast.ai/get_autogroup_workers/ \
  -X POST \
  -d '{"id": 1001, "api_key": "API_KEY_HERE"}' \
  -H 'Content-Type: application/json'
```
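As a sketch of how a client might consume one of these responses, the snippet below parses the sample worker record shown above, picks the least-loaded running worker, and checks the documented relationship perf = measured_perf × reliability. The `least_loaded_running` helper is hypothetical, not part of the Vast API; only the field names come from the response schema.

```python
import json

# One worker record, matching the sample response above.
WORKER_JSON = """
{
    "cur_load": 150,
    "cur_load_rolling_avg": 50,
    "cur_perf": 80,
    "disk_usage": 30,
    "dlperf": 105.87206734930771,
    "id": 123456,
    "loaded_at": 1724275993.997,
    "measured_perf": 105.87206734930771,
    "perf": 100.5784639818423245,
    "reliability": 0.95,
    "reqs_working": 2,
    "status": "running"
}
"""

def least_loaded_running(workers):
    """Hypothetical helper: running workers sorted by current load."""
    running = [w for w in workers if w["status"] == "running"]
    return sorted(running, key=lambda w: w["cur_load"])

workers = [json.loads(WORKER_JSON)]
best = least_loaded_running(workers)[0]

# perf is measured_perf discounted by reliability (uptime in [0, 1]).
assert abs(best["perf"] - best["measured_perf"] * best["reliability"]) < 1e-6
print(best["id"])  # → 123456
```

In a real client, `workers` would be the list returned by POSTing to one of the two endpoints; the rest of the logic is unchanged.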