
Introduction

This guide walks you through setting up and running automated backups of your Vast.ai container instances to cloud storage. Cloud backups help you preserve your work when using Vast’s Docker-based instances. With a proper backup strategy, your data remains safe and accessible even if your instance goes offline.

Prerequisites

Setup

1. Setting Up Cloud Storage Connections

Before creating backup jobs, you need to ensure you have a cloud storage connection set up in your Vast.ai account. You can view your existing connections using the vast-cli:
python3 vast.py show connections

ID     NAME                          Cloud Type
19447  karthik_vast_ai_google_drive  drive
If you don’t have a connection yet, you’ll need to set one up in Vast.ai’s Settings page before proceeding with backup operations.

2. Understanding Backup Options

Vast.ai provides multiple approaches to schedule data backups:
  • Using Vast’s job scheduling system via CLI - Create hourly, daily, or weekly automated backup jobs
  • Using cron on your personal computer - Schedule backups with custom timing from your local machine
Both approaches have their advantages depending on your workflow and requirements.

Backup Methods

1. Using CLI for Scheduled Backups

The vast-cli tool allows you to create scheduled backup jobs with several timing options. The basic structure of a scheduled backup command includes these parameters:
--schedule SCHEDULE      Values: HOURLY, DAILY, WEEKLY
--start_date START_DATE  Start date in format 'YYYY-MM-DD HH:MM:SS PM' (UTC) 
--end_date END_DATE      End date in format 'YYYY-MM-DD HH:MM:SS PM' (UTC) 
--day DAY                Day of week (0-6, where 0=Sunday) or "*"
--hour HOUR              Hour of day (0-23) or "*"
You can run this command to see more details about these parameters:
python3 vast.py cloud copy --help
Let’s explore the different scheduling options.
To create a weekly backup job that runs every Saturday at 9 PM UTC:
python3 vast.py cloud copy --src /workspace --dst /backups/19015821_backups/ --instance 19015821 --connection 19447 --transfer "Instance To Cloud" --schedule WEEKLY --day 6 --hour 21
In this command:
  • --src /workspace specifies the source directory on your instance
  • --dst /backups/19015821_backups/ is the destination folder in your cloud storage
  • --instance 19015821 is your instance’s ID
  • --connection 19447 is your cloud storage connection ID
  • --day 6 represents Saturday (0=Sunday, 1=Monday, etc.)
  • --hour 21 represents 9 PM UTC (0=12am UTC, 1=1am UTC, etc.)
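The same flags can also be assembled programmatically, for example when generating commands for several instances. The following is a minimal sketch and not part of vast-cli: the helper names build_backup_command and _check_field are hypothetical, and the validation only reflects the value ranges listed in the parameter summary above. It builds the argument list without running it.

```python
import shlex

# Schedule values listed in the parameter summary of this guide.
VALID_SCHEDULES = {"HOURLY", "DAILY", "WEEKLY"}


def _check_field(name, value, upper):
    """Accept "*" or an integer in [0, upper]; return it as a string."""
    if value == "*":
        return "*"
    number = int(value)
    if not 0 <= number <= upper:
        raise ValueError(f"{name} must be 0-{upper} or '*', got {value!r}")
    return str(number)


def build_backup_command(src, dst, instance_id, connection_id,
                         schedule, day="*", hour="*"):
    """Hypothetical helper: build the argument list for a scheduled cloud copy."""
    if schedule not in VALID_SCHEDULES:
        raise ValueError(f"schedule must be one of {sorted(VALID_SCHEDULES)}")
    return [
        "python3", "vast.py", "cloud", "copy",
        "--src", src,
        "--dst", dst,
        "--instance", str(instance_id),
        "--connection", str(connection_id),
        "--transfer", "Instance To Cloud",
        "--schedule", schedule,
        "--day", _check_field("day", day, 6),     # 0=Sunday ... 6=Saturday
        "--hour", _check_field("hour", hour, 23),  # hour of day in UTC
    ]


# Weekly backup every Saturday at 9 PM UTC, using the IDs from this guide.
weekly = build_backup_command(
    "/workspace", "/backups/19015821_backups/", 19015821, 19447,
    "WEEKLY", day=6, hour=21,
)
# shlex.join quotes arguments containing spaces so the line is safe to paste.
print(shlex.join(weekly))
```

Passing the list to subprocess.run(weekly, check=True) would execute it. Keeping the command as a list rather than a single string avoids quoting problems with the "Instance To Cloud" argument.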
For daily backups at a specific hour (e.g., 9 PM UTC every day):
python3 vast.py cloud copy --src /workspace --dst /backups/19015821_backups/ --instance 19015821 --connection 19447 --transfer "Instance To Cloud" --schedule DAILY --day "*" --hour 21
The --day "*" parameter indicates that the job should run every day.
For hourly backups that run every hour of every day:
python3 vast.py cloud copy --src /workspace --dst /backups/19015821_backups/ --instance 19015821 --connection 19447 --transfer "Instance To Cloud" --schedule HOURLY --day "*" --hour "*"
Setting both --day "*" and --hour "*" along with --schedule HOURLY makes the job run every hour.
To update your backup schedule, run the same command with the new schedule. The CLI prompts you for confirmation and, once you accept, updates the schedule:
Existing scheduled job found. Do you want to update it (y|n)? y
add_scheduled_job update: success - Scheduling DAILY job to cloud copy from 1745599087.0 to 1746599887.0
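The two numbers in the confirmation message look like Unix epoch timestamps for the start and end of the scheduling window. Assuming that interpretation (the message itself does not say so), a short sketch like the following converts them to readable UTC times. The function name epoch_to_utc is illustrative only.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format a Unix timestamp (seconds, given as str or float) as a UTC string."""
    moment = datetime.fromtimestamp(float(seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Values copied from the confirmation message above.
start = epoch_to_utc("1745599087.0")
end = epoch_to_utc("1746599887.0")
print(f"Scheduled from {start} UTC to {end} UTC")
```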

2. Using Cron on Your Personal Linux Computer

If you prefer more granular control over your backup schedule, you can use cron on your local Linux or Mac computer. This approach allows for customized schedules beyond the hourly/daily/weekly options.
First, open your crontab file for editing:
crontab -e
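Then, add a line that specifies your backup schedule. Cron typically runs jobs from your home directory with a minimal environment, so if vast.py lives elsewhere, use its full path in the command. For example, to run a backup every 4 hours: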
0 */4 * * * python3 vast.py cloud copy --src /workspace --dst /backups/19015821_backups/ --instance 19015821 --connection 19447 --transfer 'Instance To Cloud'
In this cron schedule:
  • 0 represents the minute (0th minute of the hour)
  • */4 means “every 4 hours”
  • The three asterisks * * * represent day of month, month, and day of week, indicating “every day”
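To see which hours a field such as */4 actually selects, the expansion can be sketched in a few lines. This is a simplified illustration that handles only the three forms used here ("*", "*/N", and a single number), not the full cron syntax, and expand_cron_field is a hypothetical name.

```python
def expand_cron_field(field, low, high):
    """Expand "*", "*/N", or a single integer into the values it matches."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        # A step applies across the whole range, starting from its low end.
        return list(range(low, high + 1, step))
    return [int(field)]


minutes = expand_cron_field("0", 0, 59)   # minute field of the entry above
hours = expand_cron_field("*/4", 0, 23)   # hour field of the entry above
print("Runs at minute", minutes, "of hours", hours)
```

With the entry above, the backup starts on the hour at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00 in the local time zone of the machine running cron.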

Viewing Scheduled Backup Jobs

To see all your currently scheduled backup jobs:
python3 vast.py show scheduled-jobs
Example output:
Scheduled Job ID  Instance ID  API Endpoint              Start (Date/Time in UTC)  End (Date/Time in UTC)  Day of the Week  Hour of the Day in UTC  Minute of the Hour  Frequency
1                 19778412     /api/v0/commands/rclone/  2025-04-24/23:38          2028-05-06/23:38        Everyday         4_PM                    00                  DAILY
2                 19782577     /api/v0/commands/rclone/  2025-04-29/23:47          2025-05-09/23:47        Wednesday        10_AM                   00                  WEEKLY
3                 19757389     /api/v0/commands/rclone/  2025-05-01/00:04          2026-05-01/00:04        Everyday         Every_hour              00                  HOURLY   

Understanding the Output

Field               Description
Scheduled Job ID    Unique identifier for your job (needed for deletion)
Instance ID         The instance this job is associated with
API Endpoint        The endpoint being called (rclone is used for backups to cloud storage)
Start (Date/Time)   Start date/time of period when this scheduled job will be executed (in UTC)
End (Date/Time)     End date/time of period when this scheduled job will be executed (in UTC). Default is the end of the contract.
Day of the Week     Which day the job runs (can be specific day like “Wednesday”, “Saturday”, or “Everyday”)
Hour of the Day     At what hour the job runs (formatted as 1_PM, 11_PM, 8_PM in UTC, etc.)
Minute of the Hour  At what minute of the specified hour the job runs (00, 33, 10, etc.)
Frequency           How often the job runs (HOURLY, DAILY, WEEKLY)
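The Hour of the Day column uses labels such as 4_PM, while the --hour flag takes a number from 0 to 23. The sketch below maps one to the other. It assumes the labels follow the usual 12-hour convention (so 12_AM would be hour 0), which the sample output does not show explicitly, and display_hour_to_flag is a hypothetical helper name.

```python
def display_hour_to_flag(label):
    """Convert a label like "4_PM" or "Every_hour" to a value usable with --hour."""
    if label == "Every_hour":
        return "*"
    number, meridiem = label.split("_")
    hour = int(number) % 12          # 12_AM -> 0, 12_PM -> 0 before the PM offset
    if meridiem.upper() == "PM":
        hour += 12
    return hour


for label in ("4_PM", "10_AM", "Every_hour"):
    print(label, "->", display_hour_to_flag(label))
```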

Examples Explained:
  1. Job 1: A DAILY backup that runs every day at 4:00 PM UTC
    • Runs daily at the same time
    • Will continue running from Apr 24, 2025 until May 6, 2028
  2. Job 2: A WEEKLY backup that runs on Wednesdays at 10:00 AM UTC
    • Runs only on Wednesdays at 10:00 AM UTC
    • Short duration job (Apr 29 - May 9, 2025)
  3. Job 3: An HOURLY backup that runs every hour of every day
    • Runs every hour (1_AM, 2_AM, 3_AM, etc.)
    • Will continue running for a year

Deleting Scheduled Backup Jobs

If you need to remove a scheduled backup job that you no longer want to run, you can use the delete scheduled-job command followed by the job ID:
python3 vast.py delete scheduled-job JOB_ID
For example:
python3 vast.py delete scheduled-job 4462309
This will completely remove the scheduled job from the system. When successful, you’ll receive a confirmation message:
{'success': True, 'msg': 'Scheduled job 4462309 deleted successfully'}
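If you call the delete command from a script, you may want to check the result automatically. The confirmation shown above has the shape of a Python dictionary literal, so one possible approach, assuming the CLI prints exactly that form, is to parse it with ast.literal_eval. The function name deletion_succeeded is illustrative.

```python
import ast


def deletion_succeeded(cli_output):
    """Return True if the output parses as a dict whose 'success' value is True."""
    try:
        result = ast.literal_eval(cli_output.strip())
    except (ValueError, SyntaxError):
        # Output was not a Python literal (for example, an error message).
        return False
    return isinstance(result, dict) and result.get("success") is True


sample = "{'success': True, 'msg': 'Scheduled job 4462309 deleted successfully'}"
print(deletion_succeeded(sample))
```

ast.literal_eval only evaluates literals, so it is safe to use on untrusted text, unlike eval.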

Find Job IDs to Delete

To find the ID of the job you want to delete, first run:
python3 vast.py show scheduled-jobs
You’ll see output similar to:
Scheduled Job ID  Instance ID  API Endpoint              Start (Date/Time)  End (Date/Time)   Day of the Week  Hour of the Day  Minute of the Hour  Frequency
4462317           19281511     /api/v0/commands/rclone/  2025-04-08/09:01   2028-06-08/18:48  Everyday         1_PM             33                  DAILY    
4462321           19489711     /api/v0/commands/rclone/  2025-04-15/20:00   2025-04-19/20:00  Saturday         11_PM            00                  WEEKLY   
4462322           19490133     /api/v0/commands/rclone/  2025-04-15/20:00   2025-04-19/20:00  Wednesday        8_PM             10                  WEEKLY 
The Scheduled Job ID column (the first column of the output) contains the IDs you’ll need for deletion.
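When many jobs are listed, picking out the IDs for one instance by eye gets tedious. The following sketch extracts them from the table text. It assumes the layout shown above, with the job ID in the first whitespace-separated column and the instance ID in the second, and job_ids_for_instance is a hypothetical helper rather than a vast-cli feature.

```python
def job_ids_for_instance(table_text, instance_id):
    """Return the scheduled job IDs whose Instance ID column matches instance_id."""
    ids = []
    for line in table_text.strip().splitlines()[1:]:  # skip the header row
        columns = line.split()
        if len(columns) >= 2 and columns[1] == str(instance_id):
            ids.append(int(columns[0]))
    return ids


# Sample output copied from this guide.
sample_table = """\
Scheduled Job ID  Instance ID  API Endpoint              Start (Date/Time)  End (Date/Time)   Day of the Week  Hour of the Day  Minute of the Hour  Frequency
4462317           19281511     /api/v0/commands/rclone/  2025-04-08/09:01   2028-06-08/18:48  Everyday         1_PM             33                  DAILY
4462321           19489711     /api/v0/commands/rclone/  2025-04-15/20:00   2025-04-19/20:00  Saturday         11_PM            00                  WEEKLY
4462322           19490133     /api/v0/commands/rclone/  2025-04-15/20:00   2025-04-19/20:00  Wednesday        8_PM             10                  WEEKLY
"""

print(job_ids_for_instance(sample_table, 19489711))
```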

Best Practices

Choose the Right Backup Frequency

Consider these factors when determining how often to back up your data:
  • How frequently your data changes
  • The criticality of your data
  • The cost of data loss
  • The performance impact of backup operations
  • The bandwidth costs of backing your data up in cloud storage

Back Up Only What You Need

Be selective about what you back up to save time and storage costs:
  • Focus on backing up only important data (models, results, custom code)

Verify Your Backups

Periodically check that your backups are working correctly:
  • Download a sample backup from cloud storage and verify its contents
  • Check logs for any cloud copy failures
  • Test the restoration process before you actually need it
  • If your contract is extended, update the end date (--end_date) of the scheduled job so backups keep running
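One simple way to verify the contents of a downloaded sample is to compare its checksum with that of the original file. The sketch below uses SHA-256 from the Python standard library. The function names are illustrative, and it assumes you have both the original file and the downloaded copy available locally.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original_path, restored_path):
    """Return True if both files have identical contents (by SHA-256)."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

Reading in chunks keeps memory use low for large files such as model checkpoints. A call like backup_matches("model.pt", "restored/model.pt") (hypothetical paths) returns False if the copy was truncated or corrupted in transit.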

Conclusion

Setting up regular backups for your Vast.ai instances can be a valuable part of a robust workflow. By choosing the appropriate backup method and schedule, you can ensure that your valuable work remains safe and accessible regardless of instance lifecycle events. Remember that the best backup system is one that you set up before you need it. Take time now to implement a backup strategy that meets your needs, and you can thank yourself later.

Additional Resources

I