NVIDIA NIM — Free LLM API
100+ open models from NVIDIA — no credit card required, ~40 RPM free tier. Get API key →
NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.
- 100+ open models available
- No daily token cap
- ~40 RPM free tier
- No credit card required
API Compatibility: OpenAI SDK-compatible (Chat Completions)
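Because the endpoints follow the OpenAI Chat Completions shape, a request can be built with nothing but the standard library. A minimal sketch — it assumes the `https://integrate.api.nvidia.com/v1` base URL from NVIDIA's docs, and `nvapi-...` is a placeholder for your real key:

```python
import json
import os
import urllib.request

BASE_URL = "https://integrate.api.nvidia.com/v1"

def build_request(model, messages, api_key):
    """Build an OpenAI-style POST /v1/chat/completions request for NVIDIA NIM."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # key from build.nvidia.com
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_request(
        "meta/llama-3.1-405b-instruct",  # any model ID from the table above
        [{"role": "user", "content": "Hello"}],
        os.environ.get("NVIDIA_API_KEY", "nvapi-..."),  # placeholder key
    )
    # with urllib.request.urlopen(req) as resp:   # uncomment with a real key
    #     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload works through the official OpenAI SDKs by pointing their `base_url` at the URL above.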
All Free NVIDIA NIM Models — Context Windows & Rate Limits
| Model | Context | Max Output | Rate Limit | Status |
|---|---|---|---|---|
| Various open models | 131K | 8K | See provider page | Details |
| deepseek-ai/deepseek-r1 | 128K | 163K | ~40 RPM | Details |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | 128K | 4K | ~40 RPM | Details |
| nvidia/nemotron-3-super-120b-a12b | 262K | 262K | ~40 RPM | Details |
| nvidia/nemotron-3-nano-30b-a3b | 128K | 32K | ~40 RPM | Details |
| meta/llama-3.1-405b-instruct | 128K | 4K | ~40 RPM | Details |
| qwen/qwen2.5-72b-instruct | 128K | 8K | ~40 RPM | Details |
| google/gemma-4-31b | 128K | 8K | ~40 RPM | Details |
| mistralai/mistral-large-2-instruct | 128K | 4K | ~40 RPM | Details |
| nvidia/nemotron-nano-2-vl | 128K | 8K | ~40 RPM | Details |
| minimax/minimax-m2.7 | 128K | 8K | ~40 RPM | Details |
| + 90 more models | 131K | 131K | ~40 RPM | Details |
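Rather than reading the table, the full catalog can be enumerated programmatically. A sketch assuming the standard OpenAI-compatible `GET /v1/models` route on `integrate.api.nvidia.com` (hypothetical helper name; `nvapi-...` is a placeholder key):

```python
import urllib.request

BASE_URL = "https://integrate.api.nvidia.com/v1"

def list_models_request(api_key):
    """Build a GET /v1/models request; the OpenAI-compatible models
    endpoint returns the IDs of every model the key can call."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

# With a real key, the response body is JSON with a "data" list:
# with urllib.request.urlopen(list_models_request("nvapi-...")) as resp:
#     import json
#     ids = [m["id"] for m in json.load(resp)["data"]]
```

Each returned ID (e.g. `deepseek-ai/deepseek-r1`) is what goes in the `model` field of a chat completion request.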
Frequently Asked Questions about NVIDIA NIM Free API
Is NVIDIA NIM free to use?
NVIDIA NIM offers a permanently free tier with 100+ available models. No credit card is required to get started — just join the NVIDIA Developer Program and generate an API key.
What models does NVIDIA NIM offer for free?
NVIDIA NIM provides 100+ free models covering chat and reasoning use cases. Supported modalities include text and image. Browse the full list above with context windows and rate limits.
How do I use NVIDIA NIM with Claude Code or Cursor?
Click "Details" on any model above to get one-click configuration snippets for Claude Code, Cursor, Codex, and more.
All NVIDIA NIM models listed here use an OpenAI-compatible endpoint, so any tool that accepts a custom baseURL will work.
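For tools that read their configuration from the environment, a sketch of that setup — `OPENAI_BASE_URL` and `OPENAI_API_KEY` are the variable names the official OpenAI SDKs read (other tools may use their own names), and the key value is a placeholder:

```shell
# Point any OpenAI-SDK-based tool at the NVIDIA NIM endpoint.
export OPENAI_BASE_URL="https://integrate.api.nvidia.com/v1"
export OPENAI_API_KEY="nvapi-your-key-here"   # placeholder; generate at build.nvidia.com
```

With these set, an unmodified OpenAI client in the same shell session will send its requests to NIM instead of api.openai.com.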