GPU Rental Costs: Vast.ai

Live on-demand GPU pricing for renting your own image/video inference box. Use the estimator to predict a monthly bill before you spin anything up.

How Vast.ai billing works

Time-based, per-second. You pay a fixed $/hr for the rented GPU the whole time it is running. The number of generations doesn't change cost — only wall-clock runtime does.
Stopped instances still cost a small disk fee ($/GB-month) until you destroy them. Bandwidth is metered but minor.
Destroy to stop all charges. Stopping pauses compute billing; destroying releases the disk too.
Contrast: hosted APIs (Higgsfield / fal) bill per generated asset; Vast bills per second of GPU time regardless of output count.
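
The difference between time-based and per-asset billing is easiest to see with numbers. The sketch below is illustrative only: the function names and the $0.90/hr rate are assumptions, not values from Vast.ai or from this app.

```typescript
// Hypothetical helpers; none of these names come from Vast.ai or fal.

// Time-based billing: cost depends only on how long the instance runs.
function computeCostUsd(dollarsPerHour: number, runtimeSeconds: number): number {
  return (dollarsPerHour / 3600) * runtimeSeconds;
}

// Effective cost per asset on a rented GPU: the runtime cost spread over
// however many assets were produced while the instance was running.
function costPerAssetUsd(
  dollarsPerHour: number,
  runtimeSeconds: number,
  assets: number,
): number {
  if (assets <= 0) throw new Error("assets must be positive");
  return computeCostUsd(dollarsPerHour, runtimeSeconds) / assets;
}

// Assumed rate: $0.90/hr for a 2-hour session.
const sessionSeconds = 2 * 3600;
console.log(computeCostUsd(0.9, sessionSeconds).toFixed(2));       // session total
console.log(costPerAssetUsd(0.9, sessionSeconds, 10).toFixed(3));  // 10 assets
console.log(costPerAssetUsd(0.9, sessionSeconds, 100).toFixed(3)); // 100 assets
```

The session total is the same in both cases; only the effective per-asset figure changes, which is the number to compare against a hosted API's per-asset price.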

Account balance

…

One-click backend

fal.ai: FAL_KEY not set

Primary generation backend. Add FAL_KEY (id:secret) from fal.ai to enable real generation.

GPU backend control

No Vast instances yet. Click "Launch L40S backend" to rent one and auto-provision it.

Manual provision command

# On the Vast L40S instance (Docker-enabled template):
git clone <your-repo> ai-generator && cd ai-generator
export HF_TOKEN=hf_xxx            # gated weights
bash scripts/vast-provision.sh --with-sdxl
# → paste the printed env block into the app's .env.local, redeploy.

# Idle cost control (from the app host), cron every 5 min:
VAST_DRY_RUN=false pnpm vast:reaper
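
The reaper script itself is not shown on this page. As a rough sketch of the kind of decision an idle reaper might make, the code below assumes each instance records when it last served a request and that a dry run only reports what it would do. All names, and the 10-minute idle limit, are hypothetical.

```typescript
// Hypothetical sketch; this is NOT the implementation behind `pnpm vast:reaper`.
interface InstanceActivity {
  id: number;
  lastRequestAtMs: number; // assumed: epoch ms of the last generation served
}

// Return the ids of instances that have been idle longer than the limit.
function findIdleInstances(
  instances: InstanceActivity[],
  nowMs: number,
  idleLimitMs: number,
): number[] {
  return instances
    .filter((inst) => nowMs - inst.lastRequestAtMs > idleLimitMs)
    .map((inst) => inst.id);
}

// In dry-run mode, only describe the action instead of performing it.
function planReap(idleIds: number[], dryRun: boolean): string[] {
  const verb = dryRun ? "would destroy" : "destroy";
  return idleIds.map((id) => `${verb} instance ${id}`);
}

const now = 20 * 60 * 1000;
const idle = findIdleInstances(
  [
    { id: 1, lastRequestAtMs: 0 },               // idle for 20 minutes
    { id: 2, lastRequestAtMs: 19 * 60 * 1000 },  // active 1 minute ago
  ],
  now,
  10 * 60 * 1000, // assumed idle limit: 10 minutes
);
console.log(planReap(idle, true));
```

Running the same plan with the dry-run flag on first is a cheap way to check the idle limit before letting a cron job act on it.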

Live pricing

Loading live offers from Vast.ai…

Session cost estimator

GPU

Hours per session

Sessions per month

Select a GPU to estimate costs.

Estimates only — actual cost depends on the specific host you pick, runtime, storage, and bandwidth. Remember to destroy the instance when done; idle running time is still billed.
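
The arithmetic behind an estimate like this can be sketched as follows. The field names and example rates are assumptions for illustration, not the app's actual code or live Vast.ai prices, and bandwidth is left out because it varies by host.

```typescript
// Hypothetical input shape; field names are illustrative, not the app's.
interface EstimateInput {
  dollarsPerHour: number;          // on-demand rate of the chosen offer
  hoursPerSession: number;
  sessionsPerMonth: number;
  diskGb?: number;                 // disk kept between sessions (stopped, not destroyed)
  diskDollarsPerGbMonth?: number;  // storage rate of the chosen host
}

interface MonthlyEstimate {
  compute: number;
  storage: number;
  total: number;
}

function estimateMonthlyUsd(input: EstimateInput): MonthlyEstimate {
  // Compute is billed for runtime only: rate x hours x sessions.
  const compute =
    input.dollarsPerHour * input.hoursPerSession * input.sessionsPerMonth;
  // Storage applies only if the instance is stopped rather than destroyed.
  const storage = (input.diskGb ?? 0) * (input.diskDollarsPerGbMonth ?? 0);
  return { compute, storage, total: compute + storage };
}

// Assumed numbers: $0.90/hr, 2-hour sessions, 20 sessions a month,
// 100 GB of disk kept at $0.10 per GB-month.
const estimate = estimateMonthlyUsd({
  dollarsPerHour: 0.9,
  hoursPerSession: 2,
  sessionsPerMonth: 20,
  diskGb: 100,
  diskDollarsPerGbMonth: 0.1,
});
console.log(estimate.total.toFixed(2));
```

Leaving out both disk fields models destroying the instance after every session, in which case the total is the compute figure alone.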

How to rent & wire it up

1. On console.vast.ai, rent an on-demand instance with a PyTorch or ComfyUI template and enough VRAM for your model.
2. Expose the service port, then point this app's provider URLs at the instance IP — e.g. set COMFYUI_URL, LTX_URL, or A1111_URL in your environment.
3. When you're finished, destroy the instance in the Vast console to stop both compute and storage charges.
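
For step 2, a small check on the provider URLs can catch a bare IP or a missing scheme before the app tries to call the instance. This is a minimal sketch: only the three variable names come from the steps above, while the function, its validation rule, and the example address are assumptions.

```typescript
// Variable names are from step 2; everything else here is illustrative.
const PROVIDER_VARS = ["COMFYUI_URL", "LTX_URL", "A1111_URL"] as const;
type ProviderVar = (typeof PROVIDER_VARS)[number];

function resolveProviderUrls(
  env: Record<string, string | undefined>,
): Partial<Record<ProviderVar, string>> {
  const urls: Partial<Record<ProviderVar, string>> = {};
  for (const name of PROVIDER_VARS) {
    const value = env[name]?.trim();
    if (!value) continue; // provider not configured; skip it
    // Require an explicit http:// or https:// scheme followed by a host.
    if (!/^https?:\/\/[^\s/]+/.test(value)) {
      throw new Error(`${name} must be an http(s) URL, got "${value}"`);
    }
    urls[name] = value.replace(/\/+$/, ""); // drop trailing slashes
  }
  return urls;
}

// Example with a documentation-range IP and an assumed port.
console.log(resolveProviderUrls({ COMFYUI_URL: "http://203.0.113.10:8188/" }));
```

In a Node app this would typically be called with `process.env`; taking the environment as a parameter keeps the function easy to test.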