The /route/ endpoint asks the serverless engine for the address of a GPU instance within your Endpoint. The lifetime of each request is tracked by a request_idx. To retry a failed request without incurring additional load, pass the same request_idx in the retry call.

POST https://run.vast.ai/route/

Inputs

  • endpoint(string): Name of the Endpoint.
  • api_key(string): The Vast API key associated with the account that controls the Endpoint. Alternatively, pass the key in the request header as Authorization: Bearer YOUR_VAST_API_KEY.
  • cost(float): The estimated compute resources for the request. The units of this cost are defined by the PyWorker. The serverless engine uses the cost as an estimate of the request’s workload, and can scale GPU instances to ensure the Endpoint has the proper compute capacity.
  • request_idx(int): A unique request index that tracks the lifetime of a single request. Omit it on the first request; it is required when retrying a request.
JSON (include request_idx only when retrying)
{
    "endpoint": "YOUR_ENDPOINT_NAME",
    "api_key": "YOUR_VAST_API_KEY",
    "cost": 242.0,
    "request_idx": 2421
}
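
A minimal Python sketch of assembling this request body. The helper name build_route_payload is hypothetical and not part of any Vast SDK; it only mirrors the input list above by leaving request_idx out of a first request and including it for a retry.

```python
import json
from typing import Optional


def build_route_payload(endpoint: str, api_key: str, cost: float,
                        request_idx: Optional[int] = None) -> dict:
    """Assemble the JSON body for POST /route/ (hypothetical helper)."""
    payload = {"endpoint": endpoint, "api_key": api_key, "cost": float(cost)}
    # request_idx is only sent when retrying an earlier request.
    if request_idx is not None:
        payload["request_idx"] = int(request_idx)
    return payload


# First attempt: no request_idx.
first = build_route_payload("YOUR_ENDPOINT_NAME", "YOUR_VAST_API_KEY", 242.0)

# Retry of the same request: reuse the request_idx from the earlier response.
retry = build_route_payload("YOUR_ENDPOINT_NAME", "YOUR_VAST_API_KEY", 242.0,
                            request_idx=2421)
print(json.dumps(retry, indent=4))
```

Unlike the annotated sample, the serialized output contains no comments, so it is valid JSON that can be sent as-is.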

Outputs

On Successful Worker Return

  • url(string): The address of the worker instance to send the request to.
  • reqnum(int): The request number corresponding to this worker instance. Note that workers expect to receive requests in approximately the same order as these reqnums, but some flexibility is allowed due to potential out-of-order requests caused by concurrency or small delays on the proxy server.
  • signature(string): A cryptographic string that authenticates the url, cost, and reqnum fields in the response, proving they originated from the server. Clients can check it against the server’s public key to verify that these fields have not been tampered with.
  • endpoint(string): Same as the input parameter.
  • cost(float): Same as the input parameter.
  • request_idx(int): The request index assigned to this request. For a new request, read it from this field and pass it in later calls to /route/ if you need to retry the request after a failure.
  • __request_id(string): A unique identifier that the server generates for each API request it receives. It is created when processing starts and included in the response, so every transaction can be tracked and logged individually.
JSON
{
    "endpoint": "YOUR_ENDPOINT_NAME",
    "url": "http://192.168.1.10:8000",
    "cost": 242.0,
    "reqnum": 12345,
    "signature": "a1b2c3d4e5f60708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40",
    "__request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

On Failure to Find Ready Worker

  • endpoint: Same as the input parameter to /route/.
  • status: The breakdown of workers in your endpoint group by status.
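
One way to handle both response shapes is to check for the url field and retry when it is absent. The sketch below rests on several assumptions that the text above does not confirm: that a failure response has no url, that it may carry a request_idx to reuse, and that authentication is handled inside the transport (for example through the Authorization header). The names route_with_retry, send, and fake_send are hypothetical, and the status value in the fake response is a made-up placeholder, not the real format.

```python
import time
from typing import Callable, Optional


def route_with_retry(send: Callable[[dict], dict], endpoint: str, cost: float,
                     attempts: int = 3, delay_s: float = 1.0) -> Optional[dict]:
    """Call /route/ via `send` until a worker url is returned or attempts run out."""
    request_idx = None
    for attempt in range(attempts):
        payload = {"endpoint": endpoint, "cost": cost}
        if request_idx is not None:
            payload["request_idx"] = request_idx  # retry the same request
        response = send(payload)
        if "url" in response:
            return response  # success: worker address, reqnum, signature, ...
        # Assumption: keep a request_idx if the response happens to include one.
        request_idx = response.get("request_idx", request_idx)
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return None


# Stand-in transport for illustration only: fails once, then returns a worker.
calls = []


def fake_send(payload: dict) -> dict:
    calls.append(dict(payload))
    if len(calls) == 1:
        # Placeholder failure body; the real "status" format is not shown here.
        return {"endpoint": payload["endpoint"], "status": "placeholder",
                "request_idx": 2421}
    return {"endpoint": payload["endpoint"], "url": "http://192.168.1.10:8000",
            "cost": payload["cost"], "reqnum": 12345, "request_idx": 2421}


result = route_with_retry(fake_send, "YOUR_ENDPOINT_NAME", 242.0, delay_s=0.0)
```

In real use, send would post the payload to https://run.vast.ai/route/ and return the decoded JSON body; keeping it as a parameter lets the retry logic be exercised without network access.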

Example: Calling /route/ with cURL

Curl
curl --location 'https://run.vast.ai/route/' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_TOKEN_HERE' \
--data '{
  "endpoint": "your_endpoint_name",
  "cost": 100
}'
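
For comparison, the same call can be sketched with Python's standard library. This is an illustrative equivalent of the cURL command, not an official client; make_route_request is a hypothetical helper name, and the actual network call is left commented out so the snippet runs offline.

```python
import json
import urllib.request

ROUTE_URL = "https://run.vast.ai/route/"


def make_route_request(endpoint: str, cost: float, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example."""
    body = json.dumps({"endpoint": endpoint, "cost": cost}).encode("utf-8")
    return urllib.request.Request(
        ROUTE_URL,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


req = make_route_request("your_endpoint_name", 100, "YOUR_TOKEN_HERE")

# Uncomment to send the request and print the decoded JSON response:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp))
```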