Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. Read the Readme before using this template.

Connecting to the Instance
  1. Go to the templates tab and search for “Whisper”, or click the provided link to the template here.
  2. After you select the template by pressing the triangle button, the next step is to choose a GPU.
  3. Select a GPU offering. The template you selected gives your instance access to both Jupyter and SSH. Additionally, the Open button connects you to the instance portal web interface.
  4. HTTP and token-based auth are both enabled by default. To avoid certificate errors in your browser, follow the instructions for installing the TLS certificate here to allow secure HTTPS connections to your instance via its IP.
  5. Use the Open button to open the instance. If you are not using the Open button, the default username is vastai and the password is the value of the environment variable OPEN_BUTTON_TOKEN. You can also find the token value by opening a terminal and running: echo $OPEN_BUTTON_TOKEN
  6. Open Swagger UI: click the triangle button, wait for the page to load, then click the link labeled Swagger UI. You should see the page below. (Note: the page usually loads quickly but can take 5-10 minutes.)

Usage

Two POST endpoints are exposed in this template:

/detect-language
Use this endpoint to automatically detect the spoken language in a given audio file.

/asr
Use this endpoint for both transcription and translation of audio files.

Both endpoints are documented using the OpenAPI standard and can be tested in a web browser.

  7. Select the detect-language endpoint.
  8. Click Try it out.
  9. Upload an audio clip.
  10. Press the Execute button.
  11. Look at the response body (see below). In this example, the detected language is English.

Note: If you get an internal 500 error, the most likely cause is that the file you uploaded is too large.

For more information on topics including Configuration, Additional Functionality, Instance Logs, Cloudflared, API requests, SSH tunnels and port reference mapping, and Caddy, see the Readme linked here.
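The same two endpoints can also be called from a script instead of the browser. The following is a minimal sketch using only the Python standard library, not the template's documented client. Several details are assumptions that should be checked against the Swagger UI page and the Readme: the multipart form field name `audio_file`, the `task` query parameter on /asr, and sending the OPEN_BUTTON_TOKEN value as a bearer token. The helper names `build_multipart`, `build_request`, and `send` are hypothetical.

```python
import urllib.parse
import urllib.request
import uuid


def build_multipart(field_name, filename, data, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_request(base_url, endpoint, audio_bytes, filename, token=None, params=None):
    """Build (but do not send) a POST request for /detect-language or /asr."""
    url = base_url.rstrip("/") + endpoint
    if params:
        url += "?" + urllib.parse.urlencode(params)
    # ASSUMPTION: the upload field is named "audio_file"; confirm in Swagger UI.
    body, content_type = build_multipart("audio_file", filename, audio_bytes)
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Content-Type", content_type)
    if token:
        # ASSUMPTION: the OPEN_BUTTON_TOKEN value is accepted as a bearer token.
        req.add_header("Authorization", f"Bearer {token}")
    return req


def send(req):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

To detect the language of a clip, one would read the file's bytes, call `build_request(base_url, "/detect-language", audio_bytes, "clip.wav", token=...)`, and pass the result to `send`. For /asr, an extra `params={"task": "translate"}` argument illustrates how translation might be selected, assuming the endpoint accepts such a parameter. Keeping request construction separate from sending makes it easy to inspect the URL and headers before any audio leaves your machine.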