

Run autonomous coding agents all night long for under $18 and wake up to a completed project with passing tests.

Overview

Ralph is an agentic loop that implements a project from a PRD. It picks a user story, writes the code, runs tests, and moves to the next story, repeating until everything passes. By running on Vast.ai with an open-source model, you get autonomous development without API costs. In this guide, we’ll start with a simple calculator example to see Ralph in action. Once that works, you can scale up to complex projects that run overnight.
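The loop described above can be sketched in a few lines of Python. This is an illustrative model, not Ralph's actual implementation: the real ralph.sh shells out to an agent for the code changes and lets the agent run the tests, but the control flow is essentially this:

```python
def ralph_loop(stories, run_agent, run_tests, max_iterations=5):
    """Minimal sketch of Ralph's loop: pick the highest-priority
    unfinished story, let the agent implement it, then re-run tests."""
    for _ in range(max_iterations):
        pending = [s for s in stories if not s["passes"]]
        if not pending:
            return True  # every story passes; we're done
        story = min(pending, key=lambda s: s["priority"])
        run_agent(story)               # agent writes/edits code for this story
        story["passes"] = run_tests()  # mark the story done if tests pass
    return all(s["passes"] for s in stories)
```

Each story in prd.json carries a `priority` and a `passes` flag, which is why Ralph can resume an interrupted run: already-passing stories are simply skipped.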

Model: Qwen3-Coder-Next-FP8

Attribute  Value
Model      Qwen/Qwen3-Coder-Next-FP8
Size       80B params (3B active, MoE), ~80GB in FP8
GPUs       4x RTX 4090 (96GB total)
Cost       ~$1.50/hr
Image      lmsysorg/sglang:latest (v0.5.8+)
CUDA       12.9+
Why Qwen3-Coder-Next?
  • Trained specifically for agentic coding tools (aider, Claude Code, Cline, etc.)
  • 256K context length

Prerequisites

  • Vast.ai account with API key (Sign up here)
  • Python 3.10 or later
  • git, jq, curl, openssl

Setup

Bash
# Create a virtual environment
python3 -m venv ralph-env
source ralph-env/bin/activate

# Install Vast CLI and Aider
pip install --upgrade vastai aider-chat pytest
vastai set api-key <your-vast-api-key>

# Clone Ralph
git clone https://github.com/snarktank/ralph.git
cd ralph

Step 1: Deploy Qwen3-Coder-Next on Vast

Find a 4x RTX 4090 instance with CUDA 12.9+:
Bash
vastai search offers 'gpu_name=RTX_4090 num_gpus=4 dph<2.5 reliability>0.98 inet_down>1000 cuda_vers>=12.9 direct_port_count>=1' -o 'dph'
Generate a bearer token for the inference endpoint and deploy (replace <OFFER_ID> with an ID from the first column):
Bash
# Generate a bearer token for your inference endpoint
MODEL_API_KEY=$(openssl rand -hex 16)
echo "$MODEL_API_KEY" > .vast_model_api_key
echo "Endpoint bearer token: $MODEL_API_KEY (saved to .vast_model_api_key)"

# Deploy with SGLang
vastai create instance <OFFER_ID> \
    --image lmsysorg/sglang:latest \
    --env '-p 8000:8000' \
    --disk 200 \
    --onstart-cmd "python3 -m sglang.launch_server \
        --model-path Qwen/Qwen3-Coder-Next-FP8 \
        --host 0.0.0.0 \
        --port 8000 \
        --tp-size 4 \
        --context-length 32768 \
        --mem-fraction-static 0.85 \
        --api-key $MODEL_API_KEY"
This guide uses three different keys:
  • Vast account API key, authenticates the Vast CLI (vastai set api-key)
  • Endpoint bearer token (MODEL_API_KEY), secures your SGLang inference endpoint
  • Client SDK key (OPENAI_API_KEY), set to the same value as the endpoint bearer token so Aider’s OpenAI-compatible client can authenticate
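The relationship between the last two keys is easy to miss: they are the same random value, presented by the client and checked by the server. A small Python equivalent of the token generation and client configuration (the `<IP>:<PORT>` placeholder is filled in after Step 2):

```python
import os
import secrets

# Equivalent of `openssl rand -hex 16`: 16 random bytes as 32 hex characters
model_api_key = secrets.token_hex(16)

# The SGLang server is launched with --api-key $MODEL_API_KEY, and the
# OpenAI-compatible client (Aider) must present the same value:
os.environ["OPENAI_API_KEY"] = model_api_key
os.environ["OPENAI_API_BASE"] = "http://<IP>:<PORT>/v1"
```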

Step 2: Get Your Endpoint

Wait 10-15 minutes for the model weights (~80GB) to download and load. You can monitor progress with vastai logs <INSTANCE_ID>; look for “The server is fired up and ready to roll!” Then get your endpoint:
Bash
vastai show instance <INSTANCE_ID> --raw | jq -r '"\(.public_ipaddr):\(.ports["8000/tcp"][0].HostPort)"'
# Output: <IP>:<PORT>
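If you prefer not to depend on jq, the same extraction works in Python. The field names below are the ones the jq filter above reads from the --raw JSON; this sketch assumes that schema:

```python
import json

def endpoint_from_instance(raw_json: str) -> str:
    """Extract IP:PORT for the mapped 8000/tcp port from
    `vastai show instance <ID> --raw` output."""
    inst = json.loads(raw_json)
    host_port = inst["ports"]["8000/tcp"][0]["HostPort"]
    return f'{inst["public_ipaddr"]}:{host_port}'
```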
Verify it’s ready (SGLang returns HTTP 200 with an empty body; that’s normal):
Bash
curl -w '\nHTTP Status: %{http_code}\n' -H "Authorization: Bearer $MODEL_API_KEY" http://<IP>:<PORT>/health
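Since the download can take a while, it is convenient to poll /health until the server answers rather than re-running curl by hand. A stdlib-only sketch that mirrors the curl command above (same endpoint, same bearer header):

```python
import time
import urllib.error
import urllib.request

def wait_for_health(base_url: str, token: str,
                    timeout_s: int = 1200, interval_s: int = 30) -> bool:
    """Poll SGLang's /health endpoint until it returns HTTP 200 or we give up."""
    req = urllib.request.Request(
        f"{base_url}/health",
        headers={"Authorization": f"Bearer {token}"},
    )
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                if resp.status == 200:  # empty 200 body is normal for SGLang
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; keep waiting
        time.sleep(interval_s)
    return False
```

Call it as `wait_for_health("http://<IP>:<PORT>", model_api_key)` once you have the endpoint from Step 2.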

Step 3: Configure Aider for Vast

Set environment variables to point Aider at your Vast endpoint. OPENAI_API_KEY must be set to the same endpoint bearer token you generated in Step 1:
Bash
export OPENAI_API_BASE="http://<IP>:<PORT>/v1"
export OPENAI_API_KEY="$MODEL_API_KEY"

Step 4: Verify Aider Connectivity

Test that Aider can reach your Vast endpoint:
Bash
aider --model openai/Qwen/Qwen3-Coder-Next-FP8 --no-git --yes-always --no-show-model-warnings --message "Say hello"
You should see Aider respond. If you get connection errors, verify the endpoint URL and that the model finished loading (check vastai logs <INSTANCE_ID>).

Step 5: Add Aider Support to Ralph

Ralph doesn’t include aider as a tool out of the box, so you need to make two edits to ralph.sh.
Edit 1: Add aider to the tool validation. Find the line that validates the --tool argument:
Bash
if [[ "$TOOL" != "amp" && "$TOOL" != "claude" ]]; then
  echo "Error: Invalid tool '$TOOL'. Must be 'amp' or 'claude'."
Add aider as a valid option:
Bash
if [[ "$TOOL" != "amp" && "$TOOL" != "claude" && "$TOOL" != "aider" ]]; then
  echo "Error: Invalid tool '$TOOL'. Must be 'amp', 'claude', or 'aider'."
Edit 2: Add the aider tool block. Find the elif chain that runs each tool (look for the claude block). After the last elif block and before the closing fi, add:
Bash
  elif [[ "$TOOL" == "aider" ]]; then
    # Aider: use --message flag for non-interactive mode
    PROMPT_CONTENT=$(cat "$SCRIPT_DIR/prompt.md")
    OUTPUT=$(aider --model openai/Qwen/Qwen3-Coder-Next-FP8 \
      --yes-always --no-git --no-show-model-warnings --no-browser \
      --file "$SCRIPT_DIR/prd.json" \
      --message "$PROMPT_CONTENT" 2>&1) || true
    echo "$OUTPUT"
This tells aider to use the Vast-hosted Qwen3-Coder-Next model (via the OPENAI_API_BASE env var you set in Step 3), load the PRD file for context, and run non-interactively with the Ralph prompt.

Step 6: Run Ralph

Create a prd.json that defines what you want Ralph to build. Note that testCommand is informational: the agent reads it from the PRD to know how to run tests, but ralph.sh itself doesn’t execute it.
JSON
{
  "project": "Calculator",
  "branchName": "ralph/calculator",
  "description": "Create a Python calculator module with basic arithmetic functions",
  "testCommand": "python -m pytest test_calculator.py -v",
  "userStories": [
    {
      "id": "US-001",
      "title": "Create add function",
      "description": "Create calculator.py with an add function.",
      "acceptanceCriteria": [
        "add(2, 3) returns 5",
        "add(-1, 1) returns 0"
      ],
      "priority": 1,
      "passes": false
    },
    {
      "id": "US-002",
      "title": "Create multiply function",
      "description": "Add a multiply function to calculator.py.",
      "acceptanceCriteria": [
        "multiply(2, 3) returns 6",
        "multiply(-1, 5) returns -5",
        "multiply(0, 100) returns 0"
      ],
      "priority": 2,
      "passes": false
    },
    {
      "id": "US-003",
      "title": "Create divide function",
      "description": "Add a divide function with zero handling.",
      "acceptanceCriteria": [
        "divide(10, 2) returns 5",
        "divide(-6, 3) returns -2",
        "divide(1, 0) raises ZeroDivisionError"
      ],
      "priority": 3,
      "passes": false
    }
  ]
}
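Before kicking off a long run, it can save an overnight's worth of compute to sanity-check the PRD shape. Here is a minimal validator for the fields used in the example above (the required-field lists are taken from that example, not from a published Ralph schema):

```python
import json

REQUIRED_TOP = {"project", "branchName", "description", "testCommand", "userStories"}
REQUIRED_STORY = {"id", "title", "description", "acceptanceCriteria", "priority", "passes"}

def validate_prd(text: str) -> list[str]:
    """Return a list of problems with a prd.json document (empty list = OK)."""
    problems = []
    prd = json.loads(text)
    problems += [f"missing top-level key: {k}" for k in sorted(REQUIRED_TOP - prd.keys())]
    for i, story in enumerate(prd.get("userStories", [])):
        problems += [f"story {i}: missing {k}" for k in sorted(REQUIRED_STORY - story.keys())]
        if not isinstance(story.get("passes"), bool):
            problems.append(f"story {i}: 'passes' must be a boolean")
    return problems
```

Run it with `validate_prd(open("prd.json").read())` and fix anything it reports before starting the loop.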
Run Ralph:
Bash
OPENAI_API_BASE="http://<IP>:<PORT>/v1" \
OPENAI_API_KEY="$MODEL_API_KEY" \
./ralph.sh --tool aider 5
Ralph creates calculator.py and test_calculator.py from scratch, implementing each user story and running tests until they pass. Example output (calculator.py):
Python
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b
Example output (test_calculator.py):
Python
import pytest
from calculator import add, multiply, divide

def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0

def test_multiply():
    assert multiply(2, 3) == 6
    assert multiply(-1, 5) == -5
    assert multiply(0, 100) == 0

def test_divide():
    assert divide(10, 2) == 5
    assert divide(-6, 3) == -2
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)

Cleanup

Destroy the instance when done:
Bash
vastai destroy instance <INSTANCE_ID>

Next Steps: Overnight Ralph Loop

Project ideas for overnight runs:
  • Full CLI application with subcommands, config files, and help system
  • REST API with authentication, validation, and multiple resource types
  • Web scraper with multiple site adapters, rate limiting, and data export
  • Complete test suite for an existing codebase (one test file per module)
  • Database migration system with schema versioning and rollback
To run Ralph unattended overnight:
Bash
# Run in background with nohup, increase iterations
nohup bash -c 'OPENAI_API_BASE="http://<IP>:<PORT>/v1" OPENAI_API_KEY="$MODEL_API_KEY" ./ralph.sh --tool aider 500' > ralph.log 2>&1 &

# Check progress
tail -f ralph.log

# Check generated files
ls -la *.py

# Run tests manually
python -m pytest test_*.py -v
Cost estimate: At ~$1.50/hr, an 8-12 hour overnight run costs $12-18.
Tips:
  • Use tmux or screen instead of nohup if you want to reattach later
  • Monitor with vastai show instance <ID> to ensure the instance stays running
  • Check progress.txt for Ralph’s learnings across iterations
  • Commit your prd.json before starting so you can reset if needed
  • Remember to vastai destroy instance <INSTANCE_ID> when the run finishes; instances bill by the hour even when idle
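The cost arithmetic above is simple to adapt to other rates or run lengths (the hourly rate and hours here are just the figures from this guide):

```python
def run_cost(dollars_per_hour: float, hours: float) -> float:
    """Instance cost for a run; idle time bills at the same rate."""
    return round(dollars_per_hour * hours, 2)

# The guide's estimate: 4x RTX 4090 at ~$1.50/hr for 8-12 hours
low, high = run_cost(1.50, 8), run_cost(1.50, 12)
```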

Resources