PyWorker
Introduction
The Vast PyWorker is a Python web server designed to run alongside a machine learning model instance, providing autoscaler compatibility. It serves as the primary entry point for API requests, forwarding them to the model's API hosted on the same instance. It also monitors performance metrics and estimates the current workload based on factors such as the number of tokens processed, reporting these metrics to the autoscaler.

Overview

Vast's autoscaler templates use the Vast PyWorker. The Vast PyWorker repository (https://github.com/vast-ai/pyworker/) allows you to run custom code as an API server and integrate with Vast's autoscaling server, which manages server start and stop operations based on performance and error metrics. The PyWorker code runs on your Vast instance, and its installation and activation are automated during instance creation.

Integration with backend code

The Vast PyWorker wraps application-specific backend code and calls the appropriate backend function when the corresponding API endpoint is invoked. For example, if you are running a machine learning inference server, the backend code would implement the "infer" function for your model.

To use the PyWorker with a specific backend, use a launch script that:

- starts the PyWorker code
- installs the dependencies required by the backend code
- sets up any additional requirements for your backend to run

Communication with autoscaler

To integrate with Vast's autoscaling service, each backend must:

- send a message to the autoscaling server when the backend server is ready (e.g., after model installation)
- periodically send performance metrics to the autoscaling server to optimize server usage and performance
- report any errors to the autoscaling server

[PyWorker diagram]

Getting started

If you want to create your own backend and learn how to integrate with the autoscaling server, refer to the extension guide.

Supported backends

Vast has pre-created backends for popular models such as text-generation-inference (https://github.com/huggingface/text-generation-inference) and ComfyUI (https://github.com/comfyanonymous/ComfyUI). These backends let you use the models in API mode and automatically handle performance and error tracking, making them compatible with Vast's autoscaler with no additional code required.

To get started with Vast-supported backends, see the PyWorker backends guide (https://docs.vast.ai/serverless/backends). For more detailed information and advanced configuration, visit the Vast PyWorker repository (https://github.com/vast-ai/pyworker/).
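To make the backend-wrapping idea concrete, here is a minimal sketch of a worker that forwards an "infer" request to a model API on the same instance and tracks tokens processed as a workload estimate. All names here (`ModelBackend`, `MODEL_URL`, `estimate_tokens`, the port, and the payload shape) are illustrative assumptions, not the real PyWorker API.

```python
import json
import urllib.request

# Assumed address of the model's own API on the same instance (hypothetical port).
MODEL_URL = "http://127.0.0.1:5001/generate"


def estimate_tokens(prompt: str, completion: str) -> int:
    # Crude whitespace tokenization as a stand-in for a real tokenizer;
    # the autoscaler only needs a rough workload estimate.
    return len(prompt.split()) + len(completion.split())


class ModelBackend:
    """Application-specific backend code that a worker would wrap."""

    def __init__(self) -> None:
        self.tokens_processed = 0  # running workload estimate

    def infer(self, prompt: str) -> str:
        # Forward the incoming API request to the model server on this instance.
        payload = json.dumps({"prompt": prompt}).encode()
        req = urllib.request.Request(
            MODEL_URL,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            result = json.loads(resp.read())
        # Update the workload estimate reported to the autoscaler.
        self.tokens_processed += estimate_tokens(prompt, result["text"])
        return result["text"]
```

A real worker would expose `infer` behind an HTTP endpoint and hand the running token count to its metrics reporter.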
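The autoscaler communication steps (readiness message, periodic metrics, error reports) might look roughly like the following sketch. The endpoint paths, payload fields, and `AUTOSCALER_URL` are placeholders invented for illustration, not the real Vast autoscaler protocol.

```python
import json
import time
import urllib.request

# Placeholder address; a real worker would be configured with the
# autoscaling server's actual location.
AUTOSCALER_URL = "http://autoscaler.example.internal"


def post(path: str, payload: dict) -> None:
    # Small helper to send a JSON message to the autoscaling server.
    data = json.dumps(payload).encode()
    req = urllib.request.Request(
        AUTOSCALER_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)


def report_ready() -> None:
    # Sent once, e.g. after model installation/loading has finished.
    post("/worker/ready", {"status": "ready"})


def make_metrics(tokens_processed: int, errors: int) -> dict:
    # Periodic performance snapshot the autoscaler could use to decide
    # when to start or stop servers.
    return {
        "tokens_processed": tokens_processed,
        "errors": errors,
        "timestamp": time.time(),
    }


def report_error(message: str) -> None:
    # Errors are reported so the autoscaler can route around a bad worker.
    post("/worker/error", {"error": message})
```

In a running worker, `make_metrics` would be called on a timer loop and its result sent with `post("/worker/metrics", ...)`.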