# Deployments Overview

> An overview of Vast Deployments, the quickest way to run GPU code and set up endpoints in the Vast Cloud.

<Note>Deployments are currently in **beta**. APIs and behavior may change as the feature evolves.</Note>

Vast Deployments let you run Python functions on remote GPUs with a single decorator. You define everything in one `.py` file: your code, your Docker image, your GPU requirements, and your autoscaling settings. The SDK then handles packaging, uploading, provisioning workers, and routing function calls automatically.

## Installation

To get started with Deployments, install the Vast SDK:

```bash theme={null}
pip install vastai
```

## Why Deployments?

Deployments abstract away the infrastructure of GPU serverless computing. Instead of configuring endpoints, workergroups, and PyWorkers separately, you write Python functions and call them as if they were local. Under the hood, the SDK:

* Packages and uploads your code to the cloud
* Creates a managed Serverless endpoint and workergroup
* Provisions GPU workers with your specified image and requirements
* Installs packages, loads secrets, and runs startup scripts
* Routes function calls to ready workers and returns results

## Core Concepts

### @remote Functions

The `@remote` decorator, applied as `@app.remote(...)` on a `Deployment` instance, marks an async Python function for remote execution. When you call the function from your local machine, the SDK serializes the arguments, routes the call to a GPU worker, executes the function there, and returns the result, all through a single `await`.

```python theme={null}
from vastai import Deployment

app = Deployment(name="my-app")

@app.remote(benchmark_dataset=[{"x": 2}])
async def square(x):
    return x * x
```
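To make the call flow concrete, the following is a minimal local sketch of what such a decorator does conceptually. It is not the SDK's implementation: `fake_remote`, `_worker_execute`, and the JSON wire format are hypothetical stand-ins chosen for illustration, and everything runs in-process rather than on a GPU worker.

```python
import asyncio
import functools
import json


def fake_remote(func):
    """Hypothetical stand-in for a remote decorator; runs everything locally."""

    async def _worker_execute(payload: str) -> str:
        # "Worker" side: decode the arguments, run the function, encode the result.
        kwargs = json.loads(payload)
        result = await func(**kwargs)
        return json.dumps(result)

    @functools.wraps(func)
    async def wrapper(**kwargs):
        payload = json.dumps(kwargs)               # 1. serialize the arguments
        response = await _worker_execute(payload)  # 2. hand the call to a "worker"
        return json.loads(response)                # 3. deserialize the result

    return wrapper


@fake_remote
async def square(x):
    return x * x


result = asyncio.run(square(x=5))
print(result)  # prints 25
```

In this sketch, arguments and results must survive a JSON round trip. The format the SDK actually uses is not specified on this page, so treat the wire format here as an assumption.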

### @context Classes

Context classes load heavy resources (models, engines, connections) once at worker startup and make them available to all remote function calls. This avoids reloading a model on every request.

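```python theme={null}
@app.context()
class MyModel:
    async def __aenter__(self):
        # Runs once at worker startup; load_model() is a placeholder for your own loader
        self.model = load_model()
        return self

    async def __aexit__(self, *exc):
        # Cleanup hook; release resources here if needed
        pass
```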
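The load-once behavior can be illustrated with plain `asyncio`, independent of the SDK. In this sketch, `FakeModel`, `handle_request`, and `worker_lifetime` are hypothetical names, and the "model" is a trivial doubling function. How the SDK actually wires a context into remote functions is not shown here.

```python
import asyncio

load_count = 0  # counts how many times the expensive "load" runs


class FakeModel:
    """Hypothetical context class; the 'model' is just a doubling function."""

    async def __aenter__(self):
        global load_count
        load_count += 1               # stands in for an expensive model load
        self.model = lambda x: x * 2
        return self

    async def __aexit__(self, *exc):
        self.model = None             # release the resource on exit


async def handle_request(ctx, x):
    # Every request reuses the already-loaded resource.
    return ctx.model(x)


async def worker_lifetime(requests):
    # Enter the context once, serve all requests, then exit.
    async with FakeModel() as ctx:
        return [await handle_request(ctx, x) for x in requests]


results = asyncio.run(worker_lifetime([1, 2, 3]))
print(results, load_count)  # prints [2, 4, 6] 1
```

Three requests are served, but the load step runs only once, which is the cost a context class is meant to save.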

### Image Configuration

The `Image` object, created with `app.image(...)`, configures the Docker image, pip/apt packages, environment variables, GPU requirements, and startup scripts for your workers.

```python theme={null}
from vastai.data.query import gpu_name, RTX_4090, RTX_5090

image = app.image("vastai/pytorch:@vastai-automatic-tag", 16)
image.pip_install("torch", "transformers")
image.require(gpu_name.in_([RTX_4090, RTX_5090]))  # accept either GPU model
```

### Benchmarks

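Each deployment defines a benchmark that runs when workers start up. The benchmark measures worker performance, which the autoscaler uses to determine capacity and make scaling decisions. In the examples on this page, the sample inputs for the benchmark are passed through the `benchmark_dataset` argument of `@app.remote`.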
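As a rough illustration of how a measured benchmark could feed a scaling decision, the sketch below times a function over sample inputs and sizes a worker pool from the result. The `benchmark` and `workers_needed` helpers, the calls-per-second unit, and the ceiling-division sizing rule are all assumptions made for this example. The metric and formula the Vast autoscaler actually uses are not described on this page.

```python
import math
import time


def benchmark(func, dataset, repeats=100):
    """Return measured throughput in calls per second over the sample inputs."""
    start = time.perf_counter()
    for _ in range(repeats):
        for sample in dataset:
            func(**sample)
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (repeats * len(dataset)) / elapsed


def workers_needed(target_load, per_worker_capacity):
    """Hypothetical sizing rule: enough workers to cover the load, at least one."""
    return max(1, math.ceil(target_load / per_worker_capacity))


def square(x):
    return x * x


throughput = benchmark(square, [{"x": 2}])
print(f"measured throughput: {throughput:.0f} calls/s")

# With an assumed capacity of 300 units per worker and a target load of 1000:
print(workers_needed(target_load=1000, per_worker_capacity=300))  # prints 4
```

The point of the sketch is the dependency: a faster worker reports higher capacity, so fewer workers are needed to cover the same load.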

## Minimal Example

Here is a complete deployment in a single file:

```python theme={null}
# deploy.py
from vastai import Deployment
from vastai.data.query import gpu_name, RTX_4090, RTX_5090

app = Deployment(name="square")

@app.remote(benchmark_dataset=[{"x": 2}])
async def square(x):
    return x * x

image = app.image("vastai/base-image:@vastai-automatic-tag", 16)
image.require(gpu_name.in_([RTX_4090, RTX_5090]))
app.configure_autoscaling(min_load=1000)
app.ensure_ready()
```

And a client that calls it:

```python theme={null}
# client.py
import asyncio
from deploy import app, square

async def main():
    result = await square(5)  # Executes on a remote GPU, returns 25
    print(result)

asyncio.run(main())
```

## Next Steps

<CardGroup cols={2}>
  <Card title="Architecture" icon="sitemap" href="/guides/serverless/deployments/architecture">
    Understand how deploy mode, serve mode, and update tiers work
  </Card>

  <Card title="Configuring Deployments" icon="gear" href="/guides/serverless/deployments/configuration">
    Image, packages, GPU requirements, autoscaling, and environment setup
  </Card>

  <Card title="@remote Functions" icon="bolt" href="/guides/serverless/deployments/remote-functions">
    Define, call, and benchmark remote GPU functions
  </Card>

  <Card title="@context Classes" icon="layer-group" href="/guides/serverless/deployments/context">
    Load models and resources once at worker startup
  </Card>
</CardGroup>
