Introduction
This guide walks you through setting up and running automated backups of your Vast.ai container instances to cloud storage. Cloud backups help you preserve your work when using Vast’s Docker-based instances. With a sound backup strategy, your data remains safe and accessible even if your instance goes offline.
Prerequisites
- A Vast.ai account
- Access to a Vast.ai Docker-based instance
- Cloud storage connection set up in Vast.ai
- (Optional) vast-cli installed
- (Optional) Familiarity with cron on a Unix-like operating system
Setup
1. Setting Up Cloud Storage Connections
Before creating backup jobs, make sure a cloud storage connection is set up in your Vast.ai account. You can view your existing connections with the vast-cli command vastai show connections.
2. Understanding Backup Options
Vast.ai supports two approaches to scheduling data backups:
- Using Vast’s job scheduling system via CLI - Create hourly, daily, or weekly automated backup jobs
- Using cron on your personal computer - Schedule backups with custom timing from your local machine
Backup Methods
1. Using CLI for Scheduled Backups
The vast-cli tool lets you create scheduled backup jobs with several timing options. The basic structure of a scheduled backup command includes these parameters:
- --src /workspace specifies the source directory on your instance
- --dst /backups/19015821_backups/ is the destination folder in your cloud storage
- --instance 19015821 is your instance’s ID
- --connection 19447 is your cloud storage connection ID
- --day 6 represents Saturday (0=Sunday, 1=Monday, etc.)
- --hour 21 represents 9 PM UTC (0=12am UTC, 1=1am UTC, etc.)
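As an illustration of how the parameters above fit together, here is a minimal dry-run sketch that assembles a weekly backup command and prints it for review. The --transfer and --schedule flags are not part of the parameter list above; they are assumptions about the vastai cloud copy subcommand, so confirm them with vastai cloud copy --help before relying on them.

```shell
# Values taken from the parameter list above; replace them with your own.
INSTANCE_ID=19015821
CONNECTION_ID=19447
SRC_DIR=/workspace
DST_DIR="/backups/${INSTANCE_ID}_backups/"

# Assemble a weekly backup command (Saturday at 21:00 UTC).
# Assumption: --transfer and --schedule are accepted by `vastai cloud copy`.
BACKUP_CMD="vastai cloud copy --src ${SRC_DIR} --dst ${DST_DIR} --instance ${INSTANCE_ID} --connection ${CONNECTION_ID} --transfer \"Instance To Cloud\" --schedule WEEKLY --day 6 --hour 21"

# Dry run: print the command for review instead of executing it.
printf '%s\n' "$BACKUP_CMD"
```

Once the printed line looks right, paste it into a shell where vast-cli is installed and authenticated to create the job.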
2. Using Cron on Your Personal Linux or Mac Computer
If you prefer more granular control over your backup schedule, you can use cron on your local Linux or Mac computer. This approach allows for customized schedules beyond the hourly/daily/weekly options. First, open your crontab file for editing (crontab -e), then add a line that starts with a schedule followed by the backup command. For example, the schedule 0 */4 * * * runs the backup every 4 hours:
- 0 represents the minute (0th minute of the hour)
- */4 means “every 4 hours”
- The three asterisks * * * represent day of month, month, and day of week, indicating “every day”
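The schedule fields described above can be combined with a backup command into a single crontab line. The sketch below only prints that line so you can paste it into the editor opened by crontab -e. It reuses the example IDs from this guide, and the --transfer flag is an assumption about vastai cloud copy. Because cron handles the timing here, no scheduling flags are passed to the CLI.

```shell
# Cron schedule fields: minute hour day-of-month month day-of-week.
CRON_SCHEDULE='0 */4 * * *'   # minute 0 of every 4th hour, every day

# Cron runs with a minimal PATH, so an absolute path to vastai is often needed.
# Fall back to the bare name if the CLI is not installed on this machine.
VASTAI_BIN=$(command -v vastai || echo vastai)

# Example IDs from this guide; --transfer is an assumed flag.
BACKUP_CMD="${VASTAI_BIN} cloud copy --src /workspace --dst /backups/19015821_backups/ --instance 19015821 --connection 19447 --transfer \"Instance To Cloud\""

# The complete crontab entry: schedule followed by the command.
CRON_LINE="${CRON_SCHEDULE} ${BACKUP_CMD}"
printf '%s\n' "$CRON_LINE"
```

Changing */4 to */6 would run the backup every 6 hours instead, which is the kind of custom timing the built-in hourly/daily/weekly options do not cover.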
Viewing Scheduled Backup Jobs
To see all your currently scheduled backup jobs, run vastai show scheduled-jobs. Each job is listed with the following fields:
Field | Description |
Scheduled Job ID | Unique identifier for your job (needed for deletion) |
Instance ID | The instance this job is associated with |
API Endpoint | The endpoint being called (rclone is used for backups to cloud storage) |
Start (Date/Time) | Start of the period during which this scheduled job runs (in UTC) |
End (Date/Time) | End of the period during which this scheduled job runs (in UTC). Defaults to the end of the contract. |
Day of the Week | Which day the job runs (can be specific day like “Wednesday”, “Saturday”, or “Everyday”) |
Hour of the Day | At what hour the job runs (formatted as 1_PM, 11_PM, 8_PM in UTC, etc.) |
Minute of the Hour | At what minute of the specified hour the job runs (00, 33, 10, etc.) |
Frequency | How often the job runs (HOURLY, DAILY, WEEKLY) |
For example, a listing with three jobs might read as follows:
- Job 1: A DAILY backup that runs every day at 4:00 PM UTC
  - Runs daily at the same time
  - Will continue running from Apr 24, 2025 until May 6, 2028
- Job 2: A WEEKLY backup that runs on Wednesdays at 10:00 AM UTC
  - Runs only on Wednesdays at 10:00 AM UTC
  - Short duration job (Apr 29 - May 9, 2025)
- Job 3: An HOURLY backup that runs every hour of every day
  - Runs every hour (1_AM, 2_AM, 3_AM, etc.)
  - Will continue running for a year
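To produce a listing like the one described above from a script, a small wrapper can check that the CLI is available before calling it. This is a sketch that assumes the listing subcommand is vastai show scheduled-jobs; the function name list_scheduled_jobs is hypothetical.

```shell
# Hypothetical helper: list scheduled jobs if the vastai CLI is on PATH.
list_scheduled_jobs() {
  if command -v vastai >/dev/null 2>&1; then
    # Assumed subcommand; verify with `vastai --help`.
    vastai show scheduled-jobs
  else
    echo "vastai CLI not found on PATH" >&2
    return 127
  fi
}
```

Call list_scheduled_jobs and note the Scheduled Job ID column; that ID is what the delete scheduled-job command expects.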
Deleting Scheduled Backup Jobs
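If you need to remove a scheduled backup job that you no longer want to run, use the vastai delete scheduled-job command followed by the job ID.
Find Job IDs to Delete
To find the ID of the job you want to delete, first list your scheduled jobs as described under Viewing Scheduled Backup Jobs and note the value in the Scheduled Job ID field.
Best Practices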
Choose the Right Backup Frequency
Consider these factors when determining how often to back up your data:
- How frequently your data changes
- The criticality of your data
- The cost of data loss
- The performance impact of backup operations
- The bandwidth costs of backing up your data to cloud storage
Back Up Only What You Need
Be selective about what you back up to save time and storage costs:
- Focus on backing up only important data (models, results, custom code)
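One way to back up selectively is to copy a few chosen subdirectories rather than all of /workspace. The sketch below prints one backup command per directory for review. The subdirectory names (models, results, code) are hypothetical, the IDs are the example values used in this guide, and --transfer is an assumed flag of vastai cloud copy.

```shell
# Example IDs from this guide; replace with your own.
INSTANCE_ID=19015821
CONNECTION_ID=19447

# Hypothetical subdirectories of /workspace worth keeping; adjust to your layout.
BACKUP_DIRS="models results code"

# Print one cloud copy command per selected directory (dry run).
selective_backup_cmds() {
  for dir in $BACKUP_DIRS; do
    printf '%s\n' "vastai cloud copy --src /workspace/${dir} --dst /backups/${INSTANCE_ID}_backups/${dir}/ --instance ${INSTANCE_ID} --connection ${CONNECTION_ID} --transfer \"Instance To Cloud\""
  done
}

selective_backup_cmds
```

Keeping each directory in its own destination folder also makes it easier to restore or inspect one kind of data at a time.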
Verify Your Backups
Periodically check that your backups are working correctly:
- Download a sample backup from cloud storage and verify its contents
- Check logs for any cloud copy failures
- Test the restoration process before you actually need it
- If your contract is extended, update the end_date of the scheduled job