The /get_endpoint_workers/ and /get_autogroup_workers/ endpoints return a list of GPU instances under an Endpoint and a Worker Group, respectively.
https://run.vast.ai/get_endpoint_workers/
Inputs
id (int): The id value of the Endpoint.
api_key (string): The Vast API key associated with the account that controls the Endpoint.
api_key can alternatively be provided in the request header as a bearer token.
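As a minimal sketch of how these inputs might be packaged, the Python snippet below builds a request with id and api_key in a JSON body. The POST method, the Content-Type header, and the helper name build_endpoint_workers_request are assumptions for illustration; only the URL and the two field names come from this page.

```python
import json
import urllib.request

ENDPOINT_WORKERS_URL = "https://run.vast.ai/get_endpoint_workers/"


def build_endpoint_workers_request(endpoint_id: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying id and api_key in the JSON body.

    Assumption: the route accepts a POST with a JSON payload; this page lists
    the input fields but not the HTTP method.
    """
    payload = json.dumps({"id": endpoint_id, "api_key": api_key}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_WORKERS_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the id and key must be replaced with real values for the account that controls the Endpoint.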
Outputs
For each GPU instance in the Endpoint, the following will be returned:
cur_load (float): Current load (as defined by the PyWorker) the GPU instance is receiving per second.
cur_load_rolling_avg (float): Rolling average of cur_load.
cur_perf (float): The most recent or current operational performance level of the instance (as defined by the PyWorker). For example, for a text generation model the units are tokens generated per second.
disk_usage (float): Storage used by the instance (in GB).
dlperf (float): Measured DLPerf of the instance. DLPerf is explained here.
id (int): Instance ID.
loaded_at (float): Unix epoch time the instance finished loading.
measured_perf (float): Benchmarked performance (tokens/s). Set to DLPerf if the instance is not benchmarked.
perf (float): measured_perf * reliability.
reliability (float): Uptime of the instance, ranges 0-1.
reqs_working (int): Number of active requests currently being processed by the instance.
status (string): Current status of the worker.
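The relation perf = measured_perf * reliability can be checked on a response once it is parsed. The sketch below assumes the response body is a JSON array with one object per GPU instance; the sample values and the helper names are invented for illustration and are not real API output.

```python
import json
import math

# Hypothetical response body: the field names come from the Outputs list,
# but every value here is invented for illustration.
SAMPLE_BODY = """
[
  {"id": 101, "measured_perf": 200.0, "reliability": 0.5, "perf": 100.0},
  {"id": 102, "measured_perf": 120.0, "reliability": 0.75, "perf": 90.0}
]
"""


def expected_perf(worker: dict) -> float:
    """Recompute perf as measured_perf * reliability."""
    return worker["measured_perf"] * worker["reliability"]


def perf_is_consistent(worker: dict) -> bool:
    """Compare the reported perf with the recomputed value, tolerating float error."""
    return math.isclose(worker["perf"], expected_perf(worker), rel_tol=1e-9)


def rank_by_perf(workers: list) -> list:
    """Return instance ids ordered from highest to lowest reported perf."""
    ordered = sorted(workers, key=lambda w: w["perf"], reverse=True)
    return [w["id"] for w in ordered]


workers = json.loads(SAMPLE_BODY)
```

Ranking by perf rather than measured_perf favors instances that are both fast and reliable, since an instance with low uptime is discounted.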
Example
Run the following Bash command in a terminal to receive Endpoint workers.
https://run.vast.ai/get_autogroup_workers/
Inputs
id (int): The id value of the Worker Group.
api_key (string): The Vast API key associated with the account that controls the Endpoint.
api_key can alternatively be provided in the request header as a bearer token.
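Where the key is sent as a bearer token instead of in the body, the request might look like the sketch below. It assumes a POST with a JSON body and the conventional "Authorization: Bearer <key>" header form; this page only says the key can go in the request header as a bearer token, so the exact header name is an assumption, as is the helper name build_autogroup_workers_request.

```python
import json
import urllib.request

AUTOGROUP_WORKERS_URL = "https://run.vast.ai/get_autogroup_workers/"


def build_autogroup_workers_request(worker_group_id: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries the API key in a header.

    Assumptions: POST with a JSON payload, and the standard
    "Authorization: Bearer <key>" header for the bearer token.
    """
    # Only the Worker Group id goes in the body; the key travels in the header.
    payload = json.dumps({"id": worker_group_id}).encode("utf-8")
    return urllib.request.Request(
        AUTOGROUP_WORKERS_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Keeping the key out of the body means it does not appear in logged or echoed payloads, which is one common reason to prefer the header form.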
Outputs
For each GPU instance in the Worker Group, the following will be returned:
cur_load (float): Current load (as defined by the PyWorker) the GPU instance is receiving per second.
cur_load_rolling_avg (float): Rolling average of cur_load.
cur_perf (float): The most recent or current operational performance level of the instance (as defined by the PyWorker). For example, for a text generation model the units are tokens generated per second.
disk_usage (float): Storage used by the instance (in GB).
dlperf (float): Measured DLPerf of the instance. DLPerf is explained here.
id (int): Instance ID.
loaded_at (float): Unix epoch time the instance finished loading.
measured_perf (float): Benchmarked performance (tokens/s). Set to DLPerf if the instance is not benchmarked.
perf (float): measured_perf * reliability.
reliability (float): Uptime of the instance, ranges 0-1.
reqs_working (int): Number of active requests currently being processed by the instance.
status (string): Current status of the worker.
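Once parsed, the per-instance fields can be rolled up into a view of the whole Worker Group. The sketch below assumes the response has already been decoded into a list of dictionaries, one per GPU instance; the sample values and helper names are invented for illustration.

```python
from datetime import datetime, timezone

# Hypothetical parsed response: field names come from the Outputs list,
# values are invented for illustration.
SAMPLE_WORKERS = [
    {"id": 201, "reqs_working": 3, "cur_load_rolling_avg": 12.5, "loaded_at": 1700000000.0},
    {"id": 202, "reqs_working": 1, "cur_load_rolling_avg": 4.0, "loaded_at": 1700003600.0},
]


def total_active_requests(workers: list) -> int:
    """Sum reqs_working across all instances in the group."""
    return sum(w["reqs_working"] for w in workers)


def busiest_worker_id(workers: list) -> int:
    """Return the id of the instance with the highest rolling-average load."""
    return max(workers, key=lambda w: w["cur_load_rolling_avg"])["id"]


def loaded_at_utc(worker: dict) -> datetime:
    """Convert the loaded_at Unix epoch value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(worker["loaded_at"], tz=timezone.utc)
```

Using cur_load_rolling_avg rather than cur_load for the comparison smooths out momentary spikes on a single instance.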
Example
Run the following Bash command in a terminal to receive Worker Group workers.