Overview
Vast is a GPU marketplace. Hosts sell GPU resources on the marketplace. Hosts are responsible for:
- Setup: installing Ubuntu, creating disk partitions, installing NVIDIA drivers, opening network ports on the router, and installing the Vast hosting software.
- Testing and troubleshooting all issues that can arise, such as driver conflicts, errors, bad GPUs, and bad network ports. Vast does not offer support for getting your machine working. There is a host Discord with helpful members, and its host-general channel is searchable for specific errors.
- Managing the listings and GPU offers for rentals, including setting pricing and end dates for the offers.
- Planning for maintenance so that no client jobs are affected.
You must create a new account for hosting. If you already use Vast.ai as a client, do not use the same account. Using a single account for both renting and hosting is not supported, and you will quickly run into issues.
Once your account is created, open the host setup guide. There is a link in the first paragraph to the hosting agreement. Read through the agreement. Once you accept, your account will then be converted to a hosting account. You will notice there is now a link to Machines in the navigation, along with some other changes. Your account can now list machines that are running the daemon software.
The host setup guide is the official documentation for setting up a machine on Vast.ai. Read through each section closely.
Common issues to check:
- Make sure to test the networking. Clients require open ports to directly connect to the machine for most jobs.
- Make sure to read the section on IOMMU if you have an AMD EPYC system.
- Make sure to disable auto-updates so that your machine doesn't drop a client job to update a driver.
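The networking check in the list above can be partly automated. The following is a minimal sketch, not part of the official setup guide: it only tests whether a TCP port accepts connections. The function name, the example address, and the port numbers are hypothetical placeholders, and the check is only meaningful when run from a network outside your own.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage: replace the address and ports with your public IP
# and the port range you forwarded on your router.
# for port in (40000, 40001, 40002):
#     print(port, port_is_open("203.0.113.10", port))
```

A port only reports as open while some service is listening on it, so a closed result can mean either a router problem or simply that nothing is running on that port yet.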
Once you are ready to list your machine, come back to this guide to understand pricing and listing the rental contract.
Clients have high expectations coming from AWS or GCP. As a host, plan to offer 100% uptime for your machine during the contracted period. Expect that the GPU is going to be used at close to max capacity for the rental period. Ensure that your Internet, power source and heat dissipation systems are all functioning and that you have thought through how hosting will affect each one of those items.
Hosts can create machine listings (offers) through the CLI command list machine or the machine control panel GUI on the host machines page.
The main listing parameters include:
- the pricing for GPUs, internet, and storage
- the min_gpu param controlling 'slicing' (explained below)
- the end/expiration date which determines how long the listing lasts
The listing offer is good until the end date. When a client creates an instance on your machine, this creates a contract from your listing.
Once you list and get rental contracts, it is very important to honor the terms of the contract until the end date.
By listing your machine or compute services, you are offering up a rental contract to potential clients.
Once a client accepts this listing, you and the client have entered into a rental agreement - a contract.
As the provider, you are promising to provide the services as advertised in your listing:
- the provider must provide the hardware/services according to all the advertised specs
- the hardware cannot be used for any other purposes
- the client's data must be isolated and protected according to the data protection policy
- the advertised services must be provided up until the end date (contract expiration)
For full details, see the hosting agreement and Service Level Agreement.
The expiration date can be set in the hosting interface by clicking on the date field under expiration and selecting a date for when the listing contract will expire. The CLI command to 'list machine' includes a field for end date, which is the same date.
Make sure to set an end date before listing your machine, or else the listing will not expire.
The "client end date" is the date of the longest client contract on a given machine.
When you click the set pricing button, there is a min GPU field. The min GPU field sets the smallest group of GPUs that can be rented on your machine, in powers of 2, down to 1. For example, if you have an 8x 3090 machine and set min GPU to 2, clients can create instances with 2, 4, or 8 GPUs. If you set min GPU to 1, clients can create instances with 1, 2, 4, or 8 GPUs.
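The slicing rule can be sketched in a few lines. This is an illustration of the example above, assuming that the min GPU value is itself a power of 2 and that rentable sizes double from it up to the machine's GPU count; the function name is hypothetical and is not part of the Vast CLI.

```python
from typing import List


def allowed_instance_sizes(total_gpus: int, min_gpu: int) -> List[int]:
    """Return the GPU counts a client could rent under the power-of-two rule."""
    if total_gpus < 1 or min_gpu < 1:
        raise ValueError("total_gpus and min_gpu must be positive")
    sizes = []
    size = min_gpu
    # Double the slice size until it no longer fits on the machine.
    while size <= total_gpus:
        sizes.append(size)
        size *= 2
    return sizes


print(allowed_instance_sizes(8, 2))  # [2, 4, 8]
print(allowed_instance_sizes(8, 1))  # [1, 2, 4, 8]
```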
The on-demand price is the price per hour for the GPU rental. On-demand rentals have the highest priority: when the on-demand price is met, they stop interruptible instances.
The interruptible price lets the host set the minimum price at which a client can rent an interruptible instance. Interruptibles work on a bidding system: each client sets a bid price for their instance; the instance with the current highest bid runs, and the others are paused.
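The bidding rule can be illustrated with a small model. This is a simplified sketch, not Vast's actual scheduler: it assumes a single contested slot, treats bids below the host's minimum as not runnable, and breaks ties by insertion order. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def resolve_interruptibles(
    bids: Dict[str, float], min_bid: float
) -> Tuple[Optional[str], List[str]]:
    """Return (running_instance, paused_instances) for one contested slot."""
    # Only bids at or above the host's minimum interruptible price compete.
    eligible = {name: bid for name, bid in bids.items() if bid >= min_bid}
    if not eligible:
        return None, sorted(bids)
    # The highest bid runs; every other instance is paused.
    running = max(eligible, key=eligible.get)
    paused = sorted(name for name in bids if name != running)
    return running, paused


running, paused = resolve_interruptibles(
    {"inst_a": 0.10, "inst_b": 0.25, "inst_c": 0.05}, min_bid=0.08
)
print(running, paused)  # inst_b ['inst_a', 'inst_c']
```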
Reserved Instance Discounts are a feature for clients which allows them to rent machines over a long period of time at a reduced price. The Reserved Discount Pricing Factor represents the maximum possible discount a user can achieve on your machines.
The reserved discount pricing factor is a decimal value that represents the maximum discount a client can achieve on your contract. For example, 0.4 represents a maximum discount of 40%.
This discount is not static; it scales with how long the client rents the machine. For example, on the default setting a client gets a 20% discount for 1 month and a 30% discount for 3 months.
You can set this number yourself to 0 if you wish to opt out of this feature.
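To see how the factor bounds what a client pays, here is a minimal sketch. The text above does not give the schedule that maps rental length to a discount rate, so the rate is passed in as an input; the only rule modeled is that the applied discount never exceeds the host's factor. The function and parameter names are hypothetical.

```python
def discounted_price(
    on_demand_price: float, discount_rate: float, max_discount_factor: float
) -> float:
    """Apply a reserved discount, capped at the host's maximum factor."""
    if not 0.0 <= max_discount_factor <= 1.0:
        raise ValueError("max_discount_factor must be between 0 and 1")
    if discount_rate < 0.0:
        raise ValueError("discount_rate must not be negative")
    # The client can never receive more than the host's maximum discount.
    applied = min(discount_rate, max_discount_factor)
    return on_demand_price * (1.0 - applied)


# A 20% rate under a 0.4 factor is applied in full.
print(round(discounted_price(0.50, 0.20, 0.4), 4))
# A factor of 0 opts out: the price is unchanged.
print(round(discounted_price(0.50, 0.20, 0.0), 4))
```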
To extend the current contracts for all clients on a given machine, change the expiration date to a later time with the same or lower pricing.
If you have raised the pricing, you cannot extend the current contract.
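The extension rule reduces to two comparisons, sketched below. This is only an illustration of the rule as stated above, using a single price figure for simplicity; the function name and the example values are hypothetical.

```python
from datetime import date


def can_extend_contracts(
    current_price: float, new_price: float, current_end: date, new_end: date
) -> bool:
    """Return True if relisting with these values extends existing contracts."""
    # The end date must move later and the price must not go up.
    return new_end > current_end and new_price <= current_price


print(can_extend_contracts(0.40, 0.40, date(2025, 1, 1), date(2025, 3, 1)))  # True
print(can_extend_contracts(0.40, 0.45, date(2025, 1, 1), date(2025, 3, 1)))  # False
```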
It is vital to test your own machine to ensure the ports and software are running smoothly.
There are two supported ways to test your own machine. If you want to use the website GUI, you will need to set up a new account with a different email address, add a credit card, and then find your machine and create instances on it like a client. This has the benefit of showing you the entire client experience. Testing the recommended PyTorch template is vital to ensure that SSH and Jupyter are working properly.
The preferred method of testing your own machine is to run the CLI. For Windows users, we suggest setting up WSL, which requires you to install Ubuntu on your Windows machine and change your BIOS settings to allow virtualization. Then you can start an Ubuntu terminal and run the CLI.
To rent your own machine, first search the offers with your machine ID to find the ID, and then create an instance using that ID. The show machine command will show all your connected machines.
Then, for each machine ID, you will need to find the available instance IDs.
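As a sketch of the search step, the helper below builds the offer search command for one machine. The query string follows the `vastai search offers 'machine_id=MACHINE_ID verified=any'` form quoted elsewhere in this document; the helper function itself is hypothetical and only assembles the command text, it does not run the CLI.

```python
import shlex


def build_offer_search(machine_id: int, verified: str = "any") -> str:
    """Return the shell command that lists every offer for one machine."""
    query = f"machine_id={machine_id} verified={verified}"
    # shlex.join quotes the query so the shell passes it as one argument.
    return shlex.join(["vastai", "search", "offers", query])


print(build_offer_search(12345))
# vastai search offers 'machine_id=12345 verified=any'
```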
Replace 12345 with your actual machine ID. You can see the number of available listings as well as information about the machine. This is also the fastest way to see all the offers listed for a given machine. The website GUI stacks similar offers, so it is not easy to see all the listings for a given machine there. That is not a problem for the CLI.
Take the ID number from the first column and use it to create a free instance on your own machine. This example loads the latest PyTorch image along with both Jupyter and SSH direct launch modes.
You can then look at your instance tab to make sure that PyTorch loaded correctly along with Jupyter and SSH. Click on the <_ button to get the SSH command to connect to the instance. Test the direct SSH command. Click on the open button to test Jupyter. If the button is stuck on "connecting", there is most likely a problem with the port configuration on the router in front of the machine. Once finished, destroy the instance.
The proper way to perform maintenance on your machine is to wait until all active contracts have expired or the machine is vacant.
Unlisting will prevent new contracts from starting on the machine. However, if you have a current client rental, you can set the end date to the client end date so that other clients can still create instances on that machine that expire on the same date. Once the end date is reached, you can unlist the machine and perform maintenance.
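Since the client end date is defined as the date of the longest client contract on a machine, the date to align a listing with can be computed as a simple maximum. This is a minimal sketch; the function name and the sample dates are hypothetical, and in practice the value is shown in the hosting interface.

```python
from datetime import date
from typing import Iterable, Optional


def client_end_date(contract_end_dates: Iterable[date]) -> Optional[date]:
    """Return the latest contract end date, or None if the machine is vacant."""
    return max(contract_end_dates, default=None)


contracts = [date(2025, 2, 1), date(2025, 4, 15), date(2025, 3, 10)]
# Setting the listing end date to this value lets every rental end together.
print(client_end_date(contracts))  # 2025-04-15
```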
For unplanned or unscheduled maintenance, use the CLI and the schedule maint command. That will notify the client that you have to take the machine down and that they should save their work. You can specify a date and duration.
To uninstall, use the Vast uninstall script located at https://s3.amazonaws.com/vast.ai/uninstall.
Hosting on Vast will require some Linux knowledge, as you will be maintaining a server. Our setup guide is here. After the first paragraph of the guide there is a link to the hosting agreement. Once you agree, your account will be converted to a hosting account. You can review our FAQ that answers many of your hosting questions.
You can create an invoice by going to the "Billing" page and checking the box for "Include Charges" under "Generate Billing History".
If your machine seems unlisted, try the command vastai search offers 'machine_id=MACHINE_ID verified=any' to see if the CLI finds it. If there is a result, your machine is properly listed.
Verification is conducted in a randomized and automated fashion. We only run manual verification tests for datacenters and high-end machines.
Verification is mostly for higher-end machines; mining rigs may never be verified. Verification is also based on supply versus demand and is machine- and GPU-specific. Right now the only machines that can expect fast verification are $10k+:
- H100 or A100 80GB: if not tested within a day or so, let us know.
- 8x4090, 4xA6000: should be tested in less than a week, especially if you have a number of them.
The only manual verification tests are for datacenters and high-end machines. For everything else we run more random auto verification roughly once a week. For datacenter partner inquiries, email us at [email protected] directly.
To apply for datacenter status we have a number of requirements. There is a minimum number of servers and the datacenter where the equipment is located will need to have a third party certification such as ISO 27001. Please read the complete requirement list and application instructions here.
You can use the uninstall script located at https://s3.amazonaws.com/vast.ai/uninstall.
For help with machine setup, specific questions about hardware, and errors or other issues, go to our Discord.
You won't be able to see it on the GUI right away, but you can search using the CLI.
Can I send a message to a customer using my machine letting them know that I fixed an issue that they were having?
No, there is not an established process for hosts to message customers on Vast.
I fear I will decrease my reliability from restarting my machine and potentially lose my verification.
Your machine's reliability does not directly affect your verification standing; verification is independent of reliability. However, whenever you take your machine offline to work on it, proceed with caution, as it is easy to introduce new issues or errors that will cause your machine to be de-verified.
To get an understanding of prices, the best place is 500farms, a third-party website that monitors Vast listings. The link is here.
If the machine loses connection, or if a client instance fails to start, the machine's reliability will drop.
Do not take your machine offline. If you must take your machine offline, minimize the time it is offline. Note: reliability takes into account the average earnings of the machine, and machines with lower earnings are penalized less for offline time.
Prior images are cached.
My storage for clients is somehow full. I just have a few jobs stored in my server and most of them are old and didn't delete once the job finished. A lot of them are really old, can I remove them to free up some space?
We suggest that you try cleaning up the docker build cache, as it sometimes frees up far more space than it claims. You can also clean up old unused images.
There are over 10,000 listings on Vast, and search only displays a small subset. You will usually not be able to find any one specific machine through most normal searches. This is expected and intentional behavior of our system. You can use vastai search offers 'machine_id=MACHINE_ID verified=any' to see your machine's listing. If you want to understand which machines rank above yours, you can use very narrow filters to see similar machines. For example, vastai search offers 'gpu_name=RTX_4090 cpu_ram>257 cpu_ram<258' is a decently constrained search that will most likely include the machine you are looking for (if it fits these filters) among others that are similar. Keep in mind that Auto Sort, the default ordering for search offers, combines a ranking of various factors with an element of randomness.