The @context decorator registers an async context manager class whose lifecycle is tied to the GPU worker. Use it to load models, initialize engines, allocate GPU memory, and set up connections once at startup rather than on every request.
Defining a Context
A context class must implement the async context manager protocol: __aenter__ and __aexit__.
- __aenter__ runs once when the worker starts, before it enters the "ready" state. Use it to load models, allocate resources, and perform any one-time setup. It must return self (or whatever object you want get_context to return).
- __aexit__ runs when the worker shuts down. Use it to close connections, free resources, or flush buffers.
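As a minimal sketch of the protocol described above, the class below implements both methods and is exercised directly with `async with`. The class name `ModelContext` and the dictionary standing in for a model are hypothetical. Registration with the decorator is left out, since this section does not show the import path.

```python
import asyncio


class ModelContext:
    """Hypothetical context: holds a stand-in 'model' for the worker's lifetime."""

    def __init__(self):
        self.model = None

    async def __aenter__(self):
        # One-time setup. A real context would load weights onto the GPU here.
        await asyncio.sleep(0)  # placeholder for slow asynchronous work
        self.model = {"name": "demo-model", "loaded": True}
        return self  # this is the object callers receive later

    async def __aexit__(self, exc_type, exc, tb):
        # Release resources on shutdown.
        self.model = None
        return False  # do not suppress exceptions


async def demo():
    ctx = ModelContext()
    async with ctx as entered:
        loaded_inside = entered is ctx and entered.model["loaded"]
    # After the block exits, __aexit__ has cleared the model.
    return loaded_inside, ctx.model


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning `self` from `__aenter__` keeps the loaded state and any helper methods on one object, which is usually the simplest choice.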
Passing Arguments to Context
You can pass arguments to the context class constructor via the decorator.
Accessing Context in @remote Functions
Use app.get_context(ContextClass) inside a remote function to retrieve the initialized context instance.
get_context returns the object that __aenter__ returned. If the context class hasn’t been registered or hasn’t been entered yet, it raises a KeyError.
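The sketch below illustrates the behaviour described in the last two sections: constructor arguments supplied through the decorator, lookup by class, and a KeyError before startup. `MiniApp` is a hypothetical stand-in written for this example, not the library's implementation, and the keyword arguments `model_name` and `max_batch` are invented. The real decorator's exact signature may differ.

```python
import asyncio


class MiniApp:
    """Hypothetical stand-in for the app object (illustration only)."""

    def __init__(self):
        self._registered = {}  # class -> (args, kwargs) for its constructor
        self._entered = {}     # class -> object returned by __aenter__

    def context(self, *args, **kwargs):
        # Decorator factory: remember the class and its constructor arguments.
        def register(cls):
            self._registered[cls] = (args, kwargs)
            return cls
        return register

    async def startup(self):
        # Instantiate each class with its stored arguments and enter it.
        classes = list(self._registered)
        entered = await asyncio.gather(
            *(cls(*a, **kw).__aenter__() for cls, (a, kw) in self._registered.items())
        )
        self._entered.update(zip(classes, entered))

    def get_context(self, cls):
        # A dict lookup raises KeyError for unknown or not-yet-entered classes.
        return self._entered[cls]


app = MiniApp()


@app.context(model_name="demo-model", max_batch=8)
class Engine:
    def __init__(self, model_name, max_batch):
        self.model_name = model_name
        self.max_batch = max_batch

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False


def handler():
    # Stands in for the body of a remote function.
    engine = app.get_context(Engine)
    return f"{engine.model_name}:{engine.max_batch}"


if __name__ == "__main__":
    asyncio.run(app.startup())
    print(handler())
```

Calling `handler()` before `app.startup()` has run raises KeyError in this sketch, mirroring the behaviour described above.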
Multiple Contexts
You can register multiple context classes. They are all entered in parallel at startup. Because they are entered with asyncio.gather(), independent resources (like a tokenizer and a model) load concurrently, reducing total startup time.
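To make the startup-time claim concrete, here is a small sketch that enters two contexts with `asyncio.gather()` and measures the elapsed time. `SlowContext` and its `delay` parameter are hypothetical, with `asyncio.sleep` standing in for real loading work. The total should be close to the slower of the two delays rather than their sum.

```python
import asyncio
import time


class SlowContext:
    """Hypothetical context whose setup takes `delay` seconds."""

    def __init__(self, name, delay):
        self.name = name
        self.delay = delay

    async def __aenter__(self):
        await asyncio.sleep(self.delay)  # stands in for loading a model or tokenizer
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False


async def enter_all(contexts):
    # Enter every context concurrently; results keep the input order.
    return await asyncio.gather(*(c.__aenter__() for c in contexts))


async def timed_startup():
    contexts = [SlowContext("tokenizer", 0.3), SlowContext("model", 0.3)]
    start = time.perf_counter()
    entered = await enter_all(contexts)
    elapsed = time.perf_counter() - start
    return [c.name for c in entered], elapsed


if __name__ == "__main__":
    names, elapsed = asyncio.run(timed_startup())
    print(names, f"{elapsed:.2f}s")
```

The speedup only applies when the setup work actually yields to the event loop. A blocking call inside `__aenter__` would serialize the contexts again.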
Lifecycle
- Registration (deploy time): @app.context() decorators execute and register context classes
- Startup (serve time): All registered contexts' __aenter__() methods are awaited in parallel
- Serving: Remote functions access contexts via app.get_context()
- Shutdown: All contexts' __aexit__() methods are awaited in parallel
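The four phases above can be traced with a small simulation that records each event in a log. This is only a sketch of the ordering, under the assumption that startup and shutdown each use a single `asyncio.gather()` call. `Recorder` and `run_lifecycle` are hypothetical names, and the real worker's internals are not shown in this section.

```python
import asyncio


class Recorder:
    """Hypothetical context that appends its lifecycle events to a shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    async def __aenter__(self):
        self.log.append(f"enter:{self.name}")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.log.append(f"exit:{self.name}")
        return False


async def run_lifecycle(names):
    log = []
    # Registration: build the set of contexts to manage.
    contexts = [Recorder(n, log) for n in names]
    # Startup: enter all contexts concurrently.
    entered = await asyncio.gather(*(c.__aenter__() for c in contexts))
    try:
        # Serving: handlers would look up the entered objects here.
        log.append("serve:" + ",".join(c.name for c in entered))
    finally:
        # Shutdown: exit all contexts concurrently, even if serving failed.
        await asyncio.gather(*(c.__aexit__(None, None, None) for c in contexts))
    return log


if __name__ == "__main__":
    print(asyncio.run(run_lifecycle(["tokenizer", "model"])))
```

Every enter event precedes the serve event, and every exit event follows it. The relative order within the enter group or the exit group should not be relied on when contexts run in parallel.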