Running GLiNER2 on Vast.ai
Why GLiNER2?
Named Entity Recognition (NER) extracts structured data from text: people, companies, dates, and so on. Traditional NER models only recognize the entity types they were trained on, while LLMs can extract anything but are slow and expensive. GLiNER2 takes a middle path: it embeds both the input text and the entity labels into the same vector space and scores text spans against each label. This lets you define custom entity types at inference time, with no retraining needed. It also handles text classification, structured extraction, and relation extraction, all in a 205M-parameter model that runs on CPU or GPU.

What This Guide Covers
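The shared-embedding idea can be illustrated with a toy sketch. The hand-made vectors below are purely illustrative (GLiNER2's real encoder is a trained transformer), but the mechanics are the same: embed spans and labels into one space, then pick the best-scoring label per span.

```python
import math

# Toy stand-in for a shared text/label encoder. These 2-d vectors are
# hand-made for illustration only -- they are NOT GLiNER2's embeddings.
EMBEDDINGS = {
    "Acme Corp": [0.9, 0.1],
    "Jane Doe":  [0.1, 0.9],
    "company":   [1.0, 0.0],
    "person":    [0.0, 1.0],
}

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def best_label(span, labels):
    # Score the span embedding against every label embedding.
    scores = {lab: cosine(EMBEDDINGS[span], EMBEDDINGS[lab]) for lab in labels}
    return max(scores, key=scores.get)

# Labels are defined at inference time -- no retraining.
labels = ["company", "person"]
print(best_label("Acme Corp", labels))  # -> company
print(best_label("Jane Doe", labels))   # -> person
```

Because labels are just another input to the encoder, swapping in a new label set is a data change, not a training run.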
- Quick Start - Deploy our pre-built Docker image in minutes
- Full Tutorial - Learn how to create your own Docker images for Vast.ai
Prerequisites
Before getting started, you'll need:

- A Vast.ai account with credits (Sign up here)
- The Vast.ai CLI installed (`pip install vastai`)
- Docker installed locally (for building custom images)
Note: Get your API key from the Vast.ai account page and set it with `vastai set api-key <your-vast-api-key>`.
Quick Start: Using the Pre-built Image
The fastest way to deploy GLiNER2 is with our pre-built Docker image.

Step 1: Find a GPU Instance
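You can search for offers with the CLI. The filter values below are illustrative; GLiNER2 is small (205M parameters), so almost any modern GPU works. Check `vastai search offers --help` for the full list of filter fields:

```shell
# Single-GPU offers with at least 8 GB VRAM, sorted by price per hour.
vastai search offers 'num_gpus=1 gpu_ram>=8 reliability>0.98' -o 'dph'
```

Note the ID of the offer you want; it is used in the next step.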
Step 2: Deploy the Image
Note: Vast.ai overrides Docker's `CMD` and `ENTRYPOINT`, so you must use `--onstart-cmd` to start the server.
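A create command might look like the following. This is a sketch: the image name is a placeholder, and the `--onstart-cmd` assumes the server lives in `server.py` and exposes an `app` object; adjust both to the image you deploy:

```shell
# <OFFER_ID> is the offer picked in Step 1; the image name is a placeholder.
vastai create instance <OFFER_ID> \
  --image <DOCKERHUB_USER>/gliner2-server:latest \
  --disk 20 \
  --env '-p 8000:8000 -e API_KEY=gliner-api-key' \
  --onstart-cmd 'uvicorn server:app --host 0.0.0.0 --port 8000'
```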
Step 3: Get Your Endpoint
Once the instance is running, look up its public IP address and mapped port; your API endpoint is http://<IP>:<PORT>.
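You can find both with the CLI:

```shell
# Lists your instances, including the public IP and port mappings.
vastai show instances
```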
Step 4: Test the API
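For example, with curl. The IP and port are placeholders, and the request body schema here is an assumption to adapt to your server; the `Authorization` header matches the API reference later in this guide:

```shell
# Health check
curl -s http://<IP>:<PORT>/health

# Entity extraction (request schema is illustrative)
curl -s http://<IP>:<PORT>/extract \
  -H 'Authorization: Bearer gliner-api-key' \
  -H 'Content-Type: application/json' \
  -d '{"text": "Jane Doe joined Acme Corp in 2021.", "labels": ["person", "company", "date"]}'
```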
Tutorial: Creating Docker Images for Vast.ai
Want to build your own Docker images for Vast.ai? This section walks you through the process using GLiNER2 as an example.

Understanding Vast.ai's Docker Behavior
Vast.ai handles Docker containers differently than standard Docker:

- CMD and ENTRYPOINT are overridden. Vast.ai replaces your container's entrypoint with its own initialization scripts, which set up SSH, Jupyter, and other services.
- Use `--onstart-cmd` instead. To run your application, pass the startup command via `--onstart-cmd` when creating the instance.
- Pass environment variables using the `--env` flag.

You can still set CMD for local testing, but users deploying to Vast.ai will need to specify `--onstart-cmd`.
Project Structure
Create a new directory containing the files described in the steps below.

Step 1: Create requirements.txt
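A minimal requirements.txt might look like the following. The package names are assumptions; verify them against each project's install instructions, and pin versions that match your base image:

```text
gliner2
fastapi
uvicorn[standard]
```

PyTorch is deliberately omitted because it ships with the base image used in Step 3.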
Step 2: Create the FastAPI Server
Step 3: Create the Dockerfile
- Use a PyTorch base image with CUDA support
- Install dependencies in a separate layer for caching
- Include a health check for monitoring
- Add a comment reminding users about `--onstart-cmd`
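Putting those points together, a Dockerfile might look like this. The base image tag and file names are assumptions; pick a tag that matches the CUDA version of your target GPUs:

```dockerfile
# NOTE: Vast.ai overrides CMD/ENTRYPOINT. Start the server with:
#   --onstart-cmd 'uvicorn server:app --host 0.0.0.0 --port 8000'
FROM pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime

WORKDIR /app

# Dependencies in their own layer so code changes don't bust the cache.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY server.py .

# Health check for monitoring (uses Python so curl isn't required).
HEALTHCHECK --interval=30s --timeout=5s CMD python -c \
  "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" \
  || exit 1

EXPOSE 8000

# CMD is for local testing only; Vast.ai ignores it.
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]
```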
Step 4: Build and Test Locally
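Because CMD still works locally, you can smoke-test with Docker alone before publishing (the image name is a placeholder):

```shell
docker build -t <DOCKERHUB_USER>/gliner2-server:latest .
docker run --rm -p 8000:8000 -e API_KEY=gliner-api-key \
  <DOCKERHUB_USER>/gliner2-server:latest

# In another terminal:
curl http://localhost:8000/health
```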
Step 5: Publish and Deploy to Vast.ai
Publish your image to a container registry (Docker Hub, GitHub Container Registry, etc.) with `docker push`, then deploy it the same way as in the Quick Start, substituting your own image name in the `vastai create instance` command.

API Reference
GET /health
Returns server status and GPU information.

POST /extract

Extract entities from text.

Headers:

- `Authorization: Bearer gliner-api-key` (required)
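An example request body (field names are illustrative; match them to your server implementation):

```json
{
  "text": "Jane Doe joined Acme Corp in 2021.",
  "labels": ["person", "company", "date"]
}
```

The response is a JSON object containing the extracted entities, e.g. a list of span/label pairs; the exact shape depends on your server.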
Python Client Example
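A minimal client using only the Python standard library. The endpoint is a placeholder and the request/response schema is an assumption; adapt both to your server:

```python
import json
import urllib.request


def build_request(base_url, api_key, text, labels):
    """Build an authenticated HTTP request for the /extract endpoint."""
    payload = json.dumps({"text": text, "labels": labels}).encode()
    return urllib.request.Request(
        f"{base_url}/extract",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract(base_url, api_key, text, labels):
    """POST the request and return the decoded JSON response."""
    req = build_request(base_url, api_key, text, labels)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Replace with your instance's endpoint from `vastai show instances`.
    result = extract(
        "http://<IP>:<PORT>",
        "gliner-api-key",
        "Jane Doe joined Acme Corp in 2021.",
        ["person", "company", "date"],
    )
    print(result)
```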
Cleanup
Don't forget to destroy your instance when you're done (for example, `vastai destroy instance <INSTANCE_ID>`, using the ID shown by `vastai show instances`).

Additional Resources
Conclusion
You've learned how to deploy GLiNER2 on Vast.ai using our pre-built image, and how to create your own Docker images that work with Vast.ai's container system. The key takeaway: always use `--onstart-cmd` to start your application, since Vast.ai overrides Docker's CMD and ENTRYPOINT.
Ready to get started? Sign up for Vast.ai and deploy your first GLiNER2 instance today.