Overview
Ralph is an agentic loop that implements a project from a PRD. It picks a user story, writes the code, runs tests, and moves to the next story, repeating until everything passes. By running on Vast.ai with an open-source model, you get autonomous development without API costs. In this guide, we’ll start with a simple calculator example to see Ralph in action. Once that works, you can scale up to complex projects that run overnight.
Model: Qwen3-Coder-Next-FP8
| Attribute | Value |
|---|---|
| Model | Qwen/Qwen3-Coder-Next-FP8 |
| Size | 80B params (3B active, MoE), ~80GB in FP8 |
| GPUs | 4x RTX 4090 (96GB total) |
| Cost | ~$1.50/hr |
| Image | lmsysorg/sglang:latest (v0.5.8+) |
| CUDA | 12.9+ |
- Trained specifically for agentic coding tools (aider, Claude Code, Cline, etc.)
- 256K context length
Prerequisites
- Vast.ai account with API key
- Python 3.10 or later
- git, jq, curl, openssl
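Before renting a GPU, it can help to confirm the prerequisites locally. The following is a minimal sketch, not part of the original guide: the helper names missing_tools and python_version_ok are made up for illustration.

```bash
# Print the name of every listed command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed only if the given version string (e.g. "3.12.1") is >= 3.10.
python_version_ok() {
  local major="${1%%.*}"
  local rest="${1#*.}"
  local minor="${rest%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

# Usage:
#   missing_tools git jq curl openssl
#   python_version_ok "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
```

An empty result from missing_tools means every tool was found.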
Setup
Bash
Step 1: Deploy Qwen3-Coder-Next on Vast
Find a 4x RTX 4090 instance with CUDA 12.9+:
Bash
Create the instance (replace <OFFER_ID> with an ID from the first column):
Bash
This guide uses three different keys:
- Vast account API key: authenticates the Vast CLI (vastai set api-key)
- Endpoint bearer token (MODEL_API_KEY): secures your SGLang inference endpoint
- Client SDK key (OPENAI_API_KEY): set to the same value as the endpoint bearer token so Aider’s OpenAI-compatible client can authenticate
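The relationship between the last two keys can be sketched as follows. This assumes the bearer token is just a random hex string; openssl is in the prerequisites, so openssl rand is a plausible way to generate it, but that is an assumption, and generate_token is an illustrative helper name.

```bash
# Generate a random 64-character hex token for the SGLang endpoint.
# Prefers openssl; falls back to /dev/urandom if openssl is missing.
generate_token() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

MODEL_API_KEY="$(generate_token)"
# Aider's OpenAI-compatible client must present the same token.
OPENAI_API_KEY="$MODEL_API_KEY"
export MODEL_API_KEY OPENAI_API_KEY
```

The Vast account API key is unrelated to these two and should never be passed to the inference endpoint.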
Step 2: Get Your Endpoint
Wait 10-15 minutes for the model weights (~80GB) to download and load. You can monitor progress with vastai logs <INSTANCE_ID>; look for “The server is fired up and ready to roll!” Then get your endpoint:
Bash
Bash
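If you would rather not watch the logs by hand, the wait can be scripted. This is a rough sketch that polls vastai logs for the ready message quoted above. The function name, the retry defaults (90 tries, 10 seconds apart, about 15 minutes), and the LOG_CMD override are all illustrative assumptions.

```bash
# Poll the instance logs until SGLang reports it is ready.
# LOG_CMD is a hypothetical override so the log source can be swapped out.
wait_for_ready() {
  local instance_id="$1" tries="${2:-90}" delay="${3:-10}" i=0
  while [ "$i" -lt "$tries" ]; do
    if ${LOG_CMD:-vastai logs} "$instance_id" 2>/dev/null \
        | grep -q "fired up and ready to roll"; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out" >&2
  return 1
}

# Usage: wait_for_ready <INSTANCE_ID>
```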
Step 3: Configure Aider for Vast
Set environment variables to point Aider at your Vast endpoint. OPENAI_API_KEY must be set to the same endpoint bearer token you generated in Step 1:
Bash
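As a sketch of what this configuration amounts to, the helpers below build the base URL and export both variables. They assume the endpoint is served over plain HTTP under the /v1 prefix, which is the usual layout for OpenAI-compatible servers. The function names and the example IP and port are placeholders.

```bash
# Build the OpenAI-compatible base URL for the inference server.
# Assumes plain HTTP and the /v1 prefix.
endpoint_base_url() {
  printf 'http://%s:%s/v1\n' "$1" "$2"
}

# Export the two variables Aider's OpenAI-compatible client reads.
# Arguments: public IP, mapped port, endpoint bearer token.
configure_aider_env() {
  export OPENAI_API_BASE="$(endpoint_base_url "$1" "$2")"
  export OPENAI_API_KEY="$3"
}

# Usage (placeholder values):
#   configure_aider_env 203.0.113.7 40001 "$MODEL_API_KEY"
```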
Step 4: Verify Aider Connectivity
Test that Aider can reach your Vast endpoint:
Bash
If the test fails, check the instance logs (vastai logs <INSTANCE_ID>).
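When Aider cannot connect, it can be useful to rule out Aider itself by querying the endpoint directly. The sketch below assumes the server exposes the standard OpenAI-style /models route under OPENAI_API_BASE; check_endpoint is an illustrative name.

```bash
# List the models served at the endpoint. A non-zero exit status means the
# server was unreachable or rejected the bearer token.
check_endpoint() {
  curl -sS --fail --max-time 15 \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" \
    "${OPENAI_API_BASE}/models"
}

# Usage: check_endpoint | jq .
```

A successful response here with a failing Aider test points at the Aider configuration rather than the instance.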
Step 5: Add Aider Support to Ralph
Ralph doesn’t include aider as a tool out of the box. You need to make two edits to ralph.sh:
Edit 1: Add aider to the tool validation. Find the line that validates the --tool argument:
Bash
Update it to accept aider as a valid option:
Bash
Edit 2: Add an aider branch to the tool runner. Find the if/elif chain that runs each tool (look for the claude block). After the last elif block and before the closing fi, add:
Bash
This tells aider to use your Vast endpoint (via the OPENAI_API_BASE env var you set in Step 3), load the PRD file for context, and run non-interactively with the Ralph prompt.
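The aider invocation described here can be sketched as a standalone function. This is not the exact block from ralph.sh. MODEL_NAME, PRD_FILE, PROMPT_FILE and their default values are illustrative assumptions, and the script's own variable names may differ.

```bash
# Run one non-interactive aider pass against the PRD.
# MODEL_NAME, PRD_FILE and PROMPT_FILE are illustrative names, not
# variables that ralph.sh is known to define.
run_aider_iteration() {
  local model="${MODEL_NAME:-openai/Qwen/Qwen3-Coder-Next-FP8}"
  local prd="${PRD_FILE:-prd.json}"
  local prompt="${PROMPT_FILE:-prompt.md}"
  # --read adds the PRD as read-only context, --message-file sends the
  # prompt and exits, --yes-always answers confirmations automatically.
  aider --model "$model" --read "$prd" --message-file "$prompt" --yes-always
}
```

The openai/ prefix on the model name is how aider is told to route the request through its OpenAI-compatible client, which in turn picks up OPENAI_API_BASE and OPENAI_API_KEY from the environment.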
Step 6: Run Ralph
Create a prd.json that defines what you want Ralph to build. Note that testCommand is informational: the agent reads it from the PRD to know how to run tests, but ralph.sh itself doesn’t execute it.
JSON
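For orientation, here is a rough sketch of what a small calculator PRD might look like, written from the shell. Only testCommand and the idea of user stories come from this guide. Every other field name, the story content, and the pytest command are assumptions, so check them against the schema your copy of Ralph expects.

```bash
# Write a minimal, illustrative prd.json. Every field except testCommand
# is a guess at the schema, not Ralph's documented format.
cat > prd.json <<'EOF'
{
  "project": "calculator",
  "testCommand": "python -m pytest test_calculator.py",
  "userStories": [
    {
      "id": "US-001",
      "title": "Add two numbers",
      "acceptanceCriteria": ["add(2, 3) returns 5"],
      "passes": false
    }
  ]
}
EOF

# Count stories still marked as not passing (plain grep, no jq required).
grep -c '"passes": false' prd.json
```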
Bash
Ralph will create calculator.py and test_calculator.py from scratch, implementing each user story and running tests until they pass.
Example output (calculator.py):
Python
Example output (test_calculator.py):
Python
Cleanup
Destroy the instance when done:
Bash
Next Steps: Overnight Ralph Loop
Project ideas for overnight runs:
- Full CLI application with subcommands, config files, and help system
- REST API with authentication, validation, and multiple resource types
- Web scraper with multiple site adapters, rate limiting, and data export
- Complete test suite for an existing codebase (one test file per module)
- Database migration system with schema versioning and rollback
Bash
- Use tmux or screen instead of nohup if you want to reattach later
- Monitor with vastai show instance <ID> to ensure the instance stays running
- Check progress.txt for Ralph’s learnings across iterations
- Commit your prd.json before starting so you can reset if needed
- Remember to vastai destroy instance <INSTANCE_ID> when the run finishes; instances bill by the hour even when idle
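One way to avoid paying for an idle instance after an unattended run is to chain the cleanup onto the run itself. The wrapper below is a sketch under assumptions: run_then_destroy and the ralph.log filename are made up for illustration, and you should confirm the exact ralph.sh arguments for your setup.

```bash
# Run a command, log its output, then destroy the instance no matter how
# the command exited, and pass the command's exit status through.
run_then_destroy() {
  local instance_id="$1" status=0
  shift
  "$@" > ralph.log 2>&1 || status=$?
  vastai destroy instance "$instance_id"
  return "$status"
}

# Hypothetical usage inside tmux or screen:
#   run_then_destroy "$INSTANCE_ID" ./ralph.sh --tool aider
```

Because the instance is destroyed even when the run fails, keep ralph.log and your working tree on the local machine rather than on the instance.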