Deploy Mode
When you import and call a function from your deployment file, the SDK handles everything needed to get your code running on remote GPUs:
- Packaging: All files, secrets, and scripts associated with your deployment are bundled into a tarball
- Hashing: A SHA-256 content hash determines whether anything has changed since the last deploy
- Registration: The deployment configuration is sent to the Vast API
- Uploading: If the code has changed, the tarball is uploaded to cloud storage
- Provisioning: A managed Serverless endpoint and workergroup are created according to your autoscaling settings
app.ensure_ready().
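The packaging and hashing steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the SDK's actual implementation: the helper names `build_tarball`, `content_hash`, and `needs_upload` are hypothetical, and the choice of an uncompressed, in-memory tarball with fixed timestamps is an assumption made so that identical content always produces the same SHA-256 digest.

```python
import hashlib
import io
import tarfile
from typing import Dict, Optional


def build_tarball(files: Dict[str, bytes]) -> bytes:
    """Bundle files into an in-memory tarball with deterministic metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in sorted(files):  # stable ordering keeps the hash reproducible
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp so identical content hashes identically
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def content_hash(tarball: bytes) -> str:
    """Return the SHA-256 hex digest of the tarball bytes."""
    return hashlib.sha256(tarball).hexdigest()


def needs_upload(new_hash: str, last_deployed_hash: Optional[str]) -> bool:
    """Upload only when the hash differs from the last deployed one (hypothetical rule)."""
    return new_hash != last_deployed_hash
```

In this sketch, a first deploy has no previous hash, so `needs_upload(h, None)` is true; a repeat deploy with unchanged files produces the same digest and skips the upload.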
Serve Mode
When your Serverless workers start up, they enter serve mode:
- Download: Workers pull your deployment code from cloud storage
- Install: Pip and apt packages are installed, and startup scripts are executed
- Context loading: All @context classes have their __aenter__() methods called in parallel
- Benchmarking: The worker runs the benchmark defined on your @remote function to produce a performance score
- Ready: The worker begins accepting and executing remote function calls
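The parallel context-loading step can be illustrated with plain asyncio. This is a sketch under stated assumptions: `ModelContext`, `TokenizerContext`, and `load_contexts` are hypothetical stand-ins, the SDK's @context decorator is not used here, and `asyncio.gather` is only one plausible way to call several `__aenter__()` methods concurrently.

```python
import asyncio


class ModelContext:
    """Hypothetical context that simulates a slow resource load."""

    def __init__(self) -> None:
        self.ready = False

    async def __aenter__(self) -> "ModelContext":
        await asyncio.sleep(0.05)  # stands in for loading weights onto the GPU
        self.ready = True
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.ready = False


class TokenizerContext(ModelContext):
    """Second hypothetical context, loaded alongside the first."""


async def load_contexts(contexts):
    """Enter every context concurrently and return once all of them are ready."""
    return await asyncio.gather(*(ctx.__aenter__() for ctx in contexts))


loaded = asyncio.run(load_contexts([ModelContext(), TokenizerContext()]))
```

Because the `__aenter__()` coroutines run concurrently, total startup time in this sketch is bounded by the slowest context rather than by the sum of all of them.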
Calling @remote Functions
When you call a @remote function from your client code:
- The SDK waits for the deployment to be set up and for workers to be ready
- Function arguments are serialized and routed to the quickest available worker
- The worker deserializes the arguments and executes the function with full GPU access and loaded contexts
- The return value is serialized and sent back to your local function call
await call.
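The round trip described in the steps above can be mimicked locally. The sketch below is synchronous and runs in a single process. It assumes `pickle` as the serialization format and treats a higher benchmark score as "quicker"; neither detail is confirmed by this page, and `Worker` and `call_remote` are hypothetical names rather than SDK APIs.

```python
import pickle
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Worker:
    name: str
    score: float  # hypothetical benchmark score; higher is treated as faster
    available: bool = True

    def handle(self, func: Callable[..., Any], payload: bytes) -> bytes:
        """Deserialize the arguments, run the function, serialize the result."""
        args, kwargs = pickle.loads(payload)
        return pickle.dumps(func(*args, **kwargs))


def call_remote(func: Callable[..., Any], workers: List[Worker], *args, **kwargs) -> Any:
    """Route one call to the best-scoring available worker and return its result."""
    candidates = [w for w in workers if w.available]
    if not candidates:
        raise RuntimeError("no workers are ready")
    worker = max(candidates, key=lambda w: w.score)
    payload = pickle.dumps((args, kwargs))  # serialize the arguments
    return pickle.loads(worker.handle(func, payload))  # deserialize the return value


def add(a, b=0):
    return a + b


workers = [Worker("w1", 10.0), Worker("w2", 25.0, available=False), Worker("w3", 18.0)]
result = call_remote(add, workers, 2, b=3)
```

Here the call goes to `w3`: `w2` has the highest score but is marked unavailable, so it is filtered out before routing.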
Update Tiers
Whenever you make changes to your deployment, the SDK determines the minimal update required to get your latest code onto your live endpoint.
Tier 0: No Changes
If your deployment is identical to the last time you ran it, no changes are needed and you connect to your endpoint immediately.
Tier 1: Autoscaling Changes
If the deployed code and settings are the same but you are tweaking autoscaling parameters, the SDK updates your endpoint and workergroup settings without re-uploading code or restarting workers.
Tier 2: Code Changes
If you change the contents of code, scripts, or package requirements (but don’t change the image, environment variables, search filters, or secrets), a soft-update is issued. This uploads the updated code and signals your endpoint to pull the latest version and reinstall requirements, without destroying existing workers.
Tier 3: Image Changes
If your Docker image has changed, or you need to run fresh with new environment variables, the SDK issues a hard-update. This re-uses the same workers but updates their image. It takes longer than a soft-update since it may require pulling a new Docker image and re-populating worker storage.
Tier 4: Forced Redeploy
This happens when the tag of your Deployment changes. It creates an entirely new Serverless endpoint and workergroup with separate routing. Use this when a new version of your deployment is not backwards compatible with workers serving an older version. It requires recruiting entirely new workers.
Deployment Lifecycle
By default, deployments and their endpoints exist indefinitely after a client first sets them up. You can configure automatic teardown after a specified number of seconds since the last client connection using the ttl parameter.
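The SDK's own syntax for setting ttl is not reproduced here. As a rough sketch of the teardown rule this paragraph describes, the hypothetical helper below treats a missing ttl as "keep indefinitely" and otherwise compares the idle time against ttl in seconds. Whether the real service tears down exactly at the boundary is an assumption.

```python
from typing import Optional


def should_tear_down(last_connection: float, now: float, ttl: Optional[float]) -> bool:
    """Return True once at least ttl seconds have passed since the last client connection."""
    if ttl is None:  # no ttl configured: the deployment persists indefinitely
        return False
    return now - last_connection >= ttl
```

For example, with a ttl of 60 and a last connection at t=100, the deployment in this sketch is kept at t=150 and torn down from t=160 onward.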