Extension Guide
Creating your own PyWorker can be complex and challenging, with many potential pitfalls. If you need assistance with adding new PyWorkers, please don't hesitate to contact us.
This guide walks you through adding new backends. It is taken from the hello_world worker's README in the Vast PyWorker repository.
There is a hello_world PyWorker implementation under workers/hello_world. This PyWorker is written for an LLM model server that runs on port 5001 and has two API endpoints:
- /generate: generates a full response to the prompt and sends a JSON response
- /generate_stream: streams a response one token at a time
Both of these endpoints take the same API JSON payload:
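As a minimal sketch of what such a shared payload might look like on the client side, the snippet below builds one request body in Python. The field names `prompt` and `max_response_tokens` are assumptions made for illustration; check the model server's API for the real schema.

```python
import json

# Hypothetical field names; the real schema is defined by the model server.
payload = {
    "prompt": "Hello, world",
    "max_response_tokens": 64,
}

# Serialize once; the same body can be sent to /generate or /generate_stream.
body = json.dumps(payload)
```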
We want the PyWorker to expose two matching endpoints, one for each of the model server endpoints above.
All PyWorkers should have two files:
All of the classes follow strict type hinting. We recommend that you type hint all of your functions. This allows your IDE, or VSCode with the pyright plugin, to find type errors in your implementation. You can also install pyright with npm install pyright and run pyright in the root of the project to find any type errors.
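To illustrate what type hints buy you, here is a small sketch of a fully annotated function. `count_tokens` is a hypothetical helper invented for this example and is not part of the PyWorker library.

```python
from typing import Optional


def count_tokens(prompt: str, max_response_tokens: Optional[int] = None) -> int:
    """Hypothetical helper: rough request size in whitespace-separated tokens."""
    prompt_tokens = len(prompt.split())
    # Treat a missing limit as zero extra tokens.
    return prompt_tokens + (max_response_tokens or 0)


# A type checker such as pyright would flag a call like count_tokens(42),
# because an int is passed where a str is expected.
```

With the annotations in place, the mistake in the final comment is reported before the code ever runs.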
The data classes representing the model API are defined here. They must inherit from lib.data_types.ApiPayload. ApiPayload is an abstract class, and you need to define several functions for it:
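The pattern can be sketched as follows. To stay self-contained, this example uses a local stand-in base class, `PayloadBase`, instead of importing lib.data_types.ApiPayload. The method names and the payload fields are assumptions for illustration, so check them against the abstract methods that ApiPayload actually declares.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict


class PayloadBase(ABC):
    """Local stand-in for lib.data_types.ApiPayload (method names are assumptions)."""

    @classmethod
    @abstractmethod
    def from_json_msg(cls, json_msg: Dict[str, Any]) -> "PayloadBase":
        """Build a payload from a decoded JSON request body."""

    @abstractmethod
    def generate_payload_json(self) -> Dict[str, Any]:
        """Return the JSON body to forward to the model server."""

    @abstractmethod
    def count_workload(self) -> float:
        """Return a number estimating how much work this request represents."""


@dataclass
class InputData(PayloadBase):
    prompt: str
    max_response_tokens: int

    @classmethod
    def from_json_msg(cls, json_msg: Dict[str, Any]) -> "InputData":
        # Fail early with a clear message if a required field is absent.
        missing = [k for k in ("prompt", "max_response_tokens") if k not in json_msg]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        return cls(
            prompt=json_msg["prompt"],
            max_response_tokens=json_msg["max_response_tokens"],
        )

    def generate_payload_json(self) -> Dict[str, Any]:
        return {"prompt": self.prompt, "max_response_tokens": self.max_response_tokens}

    def count_workload(self) -> float:
        # Assumption: the token limit is a reasonable proxy for request cost.
        return float(self.max_response_tokens)
```

Because the base class is abstract, forgetting to implement one of its methods raises a TypeError when the subclass is instantiated, rather than failing later at request time.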
For every model API endpoint you want to use, you must implement an EndpointHandler. This class handles incoming requests, processes them, sends them to the model API server, and finally returns an HTTP response. EndpointHandler has several abstract functions that must be implemented. Here, we implement two, one for /generate, and one for /generate_stream:
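The request flow described above (validate, forward, build a response) can be sketched with a simplified, synchronous stand-in. `HandlerBase`, its `endpoint` property, `make_response`, `handle`, and the `send` callback are all hypothetical names invented for this illustration. They are not the real EndpointHandler interface, which you should take from the library itself.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Tuple

# (status code, raw body) as returned by the model server
ModelReply = Tuple[int, bytes]


class HandlerBase(ABC):
    """Local stand-in for EndpointHandler; every name here is an assumption."""

    @property
    @abstractmethod
    def endpoint(self) -> str:
        """Path on the model server that this handler forwards to."""

    @abstractmethod
    def make_response(self, reply: ModelReply) -> Dict[str, Any]:
        """Turn the model server's reply into the response sent to the client."""

    def handle(
        self,
        request: Dict[str, Any],
        send: Callable[[str, Dict[str, Any]], ModelReply],
    ) -> Dict[str, Any]:
        # Reject malformed requests before they reach the model server.
        if not isinstance(request.get("prompt"), str):
            return {"status": 422, "body": {"error": "prompt must be a string"}}
        return self.make_response(send(self.endpoint, request))


class GenerateHandler(HandlerBase):
    @property
    def endpoint(self) -> str:
        return "/generate"

    def make_response(self, reply: ModelReply) -> Dict[str, Any]:
        status, raw = reply
        if status != 200:
            return {"status": status, "body": {"error": "model server error"}}
        return {"status": 200, "body": json.loads(raw)}
```

Passing `send` in as a callback keeps the sketch testable without a running model server; a real handler would perform the HTTP call itself.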
We also implement GenerateStreamHandler for streaming responses. It is identical to GenerateHandler except for the endpoint name and the way the web response is created, because the response is streamed:
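The key difference in the streaming case is that chunks are forwarded as they arrive instead of being collected into one response. A minimal, standard-library-only sketch of that idea follows. `fake_model_stream` and `relay_stream` are hypothetical names, and a plain list stands in for the client connection.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_model_stream(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for the model server's /generate_stream output."""
    for token in prompt.split():
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield token


async def relay_stream(source: AsyncIterator[str], sink: List[str]) -> int:
    """Forward each chunk as soon as it arrives; return how many were sent."""
    count = 0
    async for chunk in source:
        sink.append(chunk)  # a real handler would write to the client here
        count += 1
    return count
```

For example, `asyncio.run(relay_stream(fake_model_stream("one token at a time"), received))` appends each token to the list `received` in order.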
You can now instantiate a Backend and use it to handle requests.
Here you can create a script that allows you to test an endpoint group running instances with this PyWorker.
You can then run the following command from the root of this repo to load test the endpoint group: