Debugging

Worker Errors

The Vast PyWorker framework (https://github.com/vast-ai/pyworker/tree/main) automatically detects some errors, while others may cause the instance to time out. When an error is detected, the Serverless system will destroy or reboot the instance.

To manually debug an issue, check the instance logs, available via the logs button on the instance page in the GUI. All PyWorker issues will be logged here. If further investigation is needed, SSH into the instance and find the model backend log location by running:

    echo "$MODEL_LOG"

and the PyWorker log location by running:

    echo "${WORKSPACE_DIR:-/workspace}/pyworker.log"

Increasing Load

To handle high load on instances:

- Set test_workers high: create more instances initially for Worker Groups with anticipated high load.
- Adjust cold_workers: keep enough workers around to prevent them from being destroyed during low initial load.
- Increase cold_mult: quickly create instances by predicting higher future load based on current high load. Adjust it back down once enough instances are created.
- Check max_workers: ensure this parameter is set high enough to create the necessary number of workers.

Decreasing Load

To manage decreasing load:

- Reduce cold_workers: stop instances quickly when the load decreases to avoid unnecessary costs. The Serverless system will handle this automatically, but manual adjustment can help if needed.
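The load-tuning advice above can be sketched as a small pure-Python helper. This is only an illustration under stated assumptions: the `ScalingParams` class, both functions, and the numeric defaults (such as the 1.5 boost) are hypothetical and are not part of the Vast API or its defaults. The sketch only computes suggested values; applying them to a real Worker Group still has to be done through whatever interface you normally use.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScalingParams:
    """Hypothetical container for the four parameters named in the text."""
    test_workers: int
    cold_workers: int
    cold_mult: float
    max_workers: int


def tune_for_rising_load(p: ScalingParams, expected_workers: int,
                         min_cold_workers: int = 1,
                         cold_mult_boost: float = 1.5) -> ScalingParams:
    """Suggest values for anticipated high load (illustrative heuristics)."""
    # Check max_workers first, since it caps how many workers can exist.
    max_workers = max(p.max_workers, expected_workers)
    return replace(
        p,
        max_workers=max_workers,
        # Set test_workers high, but never above the max_workers cap.
        test_workers=min(max(p.test_workers, expected_workers), max_workers),
        # Keep enough cold workers so they survive low initial load.
        cold_workers=max(p.cold_workers, min_cold_workers),
        # Temporarily raise cold_mult; lower it again once scaled up.
        cold_mult=p.cold_mult * cold_mult_boost,
    )


def tune_for_falling_load(p: ScalingParams,
                          cold_workers_floor: int = 0) -> ScalingParams:
    """Suggest values for decreasing load: reduce cold_workers only."""
    return replace(p, cold_workers=min(p.cold_workers, cold_workers_floor))


base = ScalingParams(test_workers=3, cold_workers=2, cold_mult=2.0, max_workers=10)
busy = tune_for_rising_load(base, expected_workers=16, min_cold_workers=4)
quiet = tune_for_falling_load(busy, cold_workers_floor=1)
print(busy)
print(quiet)
```

Keeping the helper free of side effects makes it easy to review a proposed change before committing to it, which matters here because raising these parameters directly increases the number of running instances and therefore cost.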
