The /get_endpoint_workers/ and /get_autogroup_workers/ endpoints return a list of GPU instances under an Endpoint and a Worker Group, respectively.

https://run.vast.ai/get_endpoint_workers/

Inputs

  • id (int): The id value of the Endpoint.
  • api_key (string): The Vast API key associated with the account that controls the Endpoint.
The api_key can alternatively be provided in the request header as a bearer token.
JSON
{
    "id": 123,
    "api_key": "$API_KEY"
}
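The bearer-token alternative mentioned above can be sketched as follows. This is a minimal illustration, not a definitive client: it assumes the key is sent in the conventional `Authorization: Bearer <key>` header and that `id` still travels in the JSON body. `build_workers_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_workers_request(endpoint_id: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /get_endpoint_workers/."""
    # Only the id goes in the body; the key moves to the header (assumed layout).
    body = json.dumps({"id": endpoint_id}).encode("utf-8")
    return urllib.request.Request(
        "https://run.vast.ai/get_endpoint_workers/",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: standard bearer-token header format.
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_workers_request(123, "API_KEY_HERE")
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access or a real key.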

Outputs

For each GPU instance in the Endpoint, the following will be returned:
  • cur_load(float): Current load (as defined by the PyWorker) the GPU instance is receiving per second.
  • cur_load_rolling_avg(float): Rolling average of cur_load.
  • cur_perf(float): The most recent or current operational performance level of the instance (as defined by the PyWorker). For example, a text generation model has the units of tokens generated per second.
  • disk_usage(float): Storage used by instance (in GB).
  • dlperf(float): Measured DLPerf of the instance. DLPerf is explained in the Vast.ai DLPerf documentation.
  • id(int): Instance ID.
  • loaded_at(float): Unix epoch time the instance finished loading.
  • measured_perf(float): Benchmarked performance (tokens/s). Set to DLPerf if the instance has not been benchmarked.
  • perf(float): measured_perf * reliability.
  • reliability(float): Uptime of the instance, as a fraction from 0 to 1.
  • reqs_working(int): Number of active requests currently being processed by the instance.
  • status(string): Current status of the worker.
JSON
{
    "cur_load": 150,
    "cur_load_rolling_avg": 50,
    "cur_perf": 80,
    "disk_usage": 30,
    "dlperf": 105.87206734930771,
    "id": 123456,
    "loaded_at": 1724275993.997,
    "measured_perf": 105.87206734930771,
    "perf": 100.5784639818423245,
    "reliability": 0.95,
    "reqs_working": 2,
    "status": "running"
}
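To show how these fields relate, the sketch below checks that `perf` equals `measured_perf * reliability` and totals a few fields over running workers. It assumes the response body decodes to a list of objects shaped like the sample above. `perf_matches` and `summarize_workers` are hypothetical helpers, and the second worker in the list is made-up illustrative data.

```python
import math


def perf_matches(worker: dict) -> bool:
    """Check the documented relation perf = measured_perf * reliability."""
    expected = worker["measured_perf"] * worker["reliability"]
    return math.isclose(worker["perf"], expected)


def summarize_workers(workers: list) -> dict:
    """Aggregate a few fields over workers whose status is "running"."""
    # "running" is the only status value shown in this section.
    running = [w for w in workers if w["status"] == "running"]
    return {
        "running": len(running),
        "total_perf": sum(w["perf"] for w in running),
        "reqs_working": sum(w["reqs_working"] for w in running),
    }


workers = [
    # The sample object from the documentation above.
    {"cur_load": 150, "cur_load_rolling_avg": 50, "cur_perf": 80,
     "disk_usage": 30, "dlperf": 105.87206734930771, "id": 123456,
     "loaded_at": 1724275993.997, "measured_perf": 105.87206734930771,
     "perf": 100.5784639818423245, "reliability": 0.95,
     "reqs_working": 2, "status": "running"},
    # A made-up second worker, for illustration only.
    {"cur_load": 20, "cur_load_rolling_avg": 25, "cur_perf": 60,
     "disk_usage": 30, "dlperf": 90.0, "id": 123457,
     "loaded_at": 1724276100.0, "measured_perf": 90.0,
     "perf": 81.0, "reliability": 0.9,
     "reqs_working": 1, "status": "running"},
]

summary = summarize_workers(workers)
```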

Example

Run the following Bash command in a terminal to retrieve the workers of an Endpoint.
Bash
curl https://run.vast.ai/get_endpoint_workers/ \
-X POST \
-d '{"id" : 123, "api_key" : "API_KEY_HERE"}' \
-H 'Content-Type: application/json'

https://run.vast.ai/get_autogroup_workers/

Inputs

  • id (int): The id value of the Worker Group.
  • api_key (string): The Vast API key associated with the account that controls the Endpoint.
The api_key can alternatively be provided in the request header as a bearer token.
JSON
{
    "id": 1001,
    "api_key": "$API_KEY"
}

Outputs

For each GPU instance in the Worker Group, the following will be returned:
  • cur_load(float): Current load (as defined by the PyWorker) the GPU instance is receiving per second.
  • cur_load_rolling_avg(float): Rolling average of cur_load.
  • cur_perf(float): The most recent or current operational performance level of the instance (as defined by the PyWorker). For example, a text generation model has the units of tokens generated per second.
  • disk_usage(float): Storage used by instance (in GB).
  • dlperf(float): Measured DLPerf of the instance. DLPerf is explained in the Vast.ai DLPerf documentation.
  • id(int): Instance ID.
  • loaded_at(float): Unix epoch time the instance finished loading.
  • measured_perf(float): Benchmarked performance (tokens/s). Set to DLPerf if the instance has not been benchmarked.
  • perf(float): measured_perf * reliability.
  • reliability(float): Uptime of the instance, as a fraction from 0 to 1.
  • reqs_working(int): Number of active requests currently being processed by the instance.
  • status(string): Current status of the worker.
JSON
{
    "cur_load": 150,
    "cur_load_rolling_avg": 50,
    "cur_perf": 80,
    "disk_usage": 30,
    "dlperf": 105.87206734930771,
    "id": 123456,
    "loaded_at": 1724275993.997,
    "measured_perf": 105.87206734930771,
    "perf": 100.5784639818423245,
    "reliability": 0.95,
    "reqs_working": 2,
    "status": "running"
}
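Because `loaded_at` is a Unix epoch timestamp, it can be turned into a readable time or an elapsed duration. The following is a small sketch under that reading of the field; `loaded_at_utc` and `seconds_since_loaded` are hypothetical helper names, and only the `loaded_at` and `id` values from the sample above are used.

```python
import time
from datetime import datetime, timezone


def loaded_at_utc(worker: dict) -> datetime:
    """Convert the loaded_at epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(worker["loaded_at"], tz=timezone.utc)


def seconds_since_loaded(worker: dict, now: float = None) -> float:
    """Seconds elapsed since the instance finished loading.

    `now` is injectable (epoch seconds) so the function is easy to test;
    it defaults to the current time.
    """
    current = time.time() if now is None else now
    return current - worker["loaded_at"]


worker = {"id": 123456, "loaded_at": 1724275993.997}
loaded = loaded_at_utc(worker)
```

Keeping the datetime timezone-aware avoids ambiguity when the value is compared against timestamps from other machines.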

Example

Run the following Bash command in a terminal to retrieve the workers of a Worker Group.
Bash
curl https://run.vast.ai/get_autogroup_workers/ \
-X POST \
-d '{"id" : 1001, "api_key" : "API_KEY_HERE"}' \
-H 'Content-Type: application/json'
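For scripts, the two curl commands on this page can be approximated with Python's standard library. This is a sketch rather than an official client: `get_workers` is a hypothetical helper, the response is assumed to be a JSON document, and the `opener` parameter exists only so the function can be exercised without network access.

```python
import json
import urllib.request

BASE_URL = "https://run.vast.ai"


def get_workers(route: str, target_id: int, api_key: str,
                opener=urllib.request.urlopen):
    """POST {"id", "api_key"} to one of the two worker routes and decode the reply.

    route is "get_endpoint_workers" or "get_autogroup_workers".
    """
    payload = json.dumps({"id": target_id, "api_key": api_key}).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/{route}/",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    # The opener must return a context manager with a read() method,
    # as urllib.request.urlopen does.
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a real key, `get_workers("get_autogroup_workers", 1001, api_key)` would mirror the Worker Group command above, and `get_workers("get_endpoint_workers", 123, api_key)` the Endpoint one.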