NVIDIA NIM logo How to Get a Free NVIDIA NIM API Key (2026)

61 free models available — no credit card required. Get your NVIDIA NIM API key → Test free models →

NVIDIA NIM FreeLLM Score

✅ 79/100 Solid Choice — Strong in great tool compatibility How we score →
🎁 Generosity 65 🌍 Access 65 📚 Breadth 95 ⚡ Reliability 60 🔌 Compat 100 🧠 Quality 90

All Free NVIDIA NIM Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Released Status
minimaxai/minimax-m3 1.0M 512K textimagevideoreasoning Up to 40 RPM Jun 1, 2026 Online
deepseek-ai/deepseek-v4-pro 1.0M 384K textreasoning Up to 40 RPM Apr 24, 2026 Online
deepseek-ai/deepseek-v4-flash 1.0M 66K textreasoning Up to 40 RPM Apr 24, 2026 Online
moonshotai/kimi-k2.6 262K 262K textimagevideoreasoning Up to 40 RPM Apr 20, 2026 Online
z-ai/glm-5.1 203K 8K text Up to 40 RPM Apr 7, 2026 Online
minimaxai/minimax-m2.7 205K 197K textreasoning Up to 40 RPM Mar 18, 2026 Online
stepfun-ai/step-3.7-flash 256K 256K textimagereasoning Up to 40 RPM May 29, 2026 Online
qwen/qwen3.5-397b-a17b 256K 8K textimagevideoaudioreasoning Up to 40 RPM Feb 16, 2026 Online
qwen/qwen3.5-122b-a10b 262K 262K textimagevideoaudioreasoning Up to 40 RPM Feb 24, 2026 Online
stepfun-ai/step-3.5-flash 262K 66K textreasoning Up to 40 RPM Feb 2, 2026 Online
Nemotron 3 Nano Omni 30B A3B Reasoning 256K 66K visionaudioreasoning Apr 28, 2026 Online
mistralai/mistral-large-3-675b-instruct-2512 8K 4K Online
nvidia/nemotron-3.5-content-safety 128K 8K textimagereasoning Up to 40 RPM Jun 4, 2026 Online
nvidia/nemoretriever-parse 131K 8K rerank Up to 40 RPM Online
abacusai/dracarys-llama-3.1-70b-instruct 8K 4K Online
nvidia/llama-3.3-nemotron-super-49b-v1.5 131K 16K textreasoning Up to 40 RPM Jul 25, 2025 Online
mistralai/mistral-medium-3.5-128b 8K 4K Online
nvidia/llama-nemotron-embed-1b-v2 131K 8K embeddingtextimage Up to 40 RPM Feb 10, 2026 Online
nvidia/llama-nemotron-embed-vl-1b-v2 131K 8K embeddingtextimage Up to 40 RPM Feb 10, 2026 Online
Llama 3.3 Nemotron Super 49B v1 131K 131K reasoning Online
meta/llama-3.1-70b-instruct 131K 16K text Up to 40 RPM Jul 23, 2024 Online
Nemotron Mini 4B Instruct 128K 8K text Online
bytedance/seed-oss-36b-instruct 8K 4K Online
google/diffusiongemma-26b-a4b-it 8K 4K Online
google/gemma-2-2b-it 8K 4K Online
google/gemma-3n-e2b-it 8K 4K Online
meta/llama-3.2-90b-vision-instruct 8K 4K Online
meta/llama-4-maverick-17b-128e-instruct 8K 4K Online
mistralai/mistral-nemotron 8K 4K Online
mistralai/mistral-small-4-119b-2603 8K 4K Online
mistralai/mixtral-8x7b-instruct-v0.1 8K 4K Online
nvidia/gliner-pii 8K 4K Online
nvidia/ising-calibration-1-35b-a3b 8K 4K Online
nvidia/riva-translate-4b-instruct-v1.1 8K 4K Online
sarvamai/sarvam-m 8K 4K Online
stockmark/stockmark-2-100b-instruct 8K 4K Online
Nemotron Nano 12B v2 VL 128K 128K visionreasoning Oct 28, 2025 Online
meta/llama-3.2-11b-vision-instruct 131K 16K textimage Up to 40 RPM Sep 25, 2024 Online
microsoft/phi-4-mini-instruct 8K 4K Online
upstage/solar-10.7b-instruct 8K 4K Online
meta/llama-3.2-3b-instruct 131K 8K text Up to 40 RPM Sep 25, 2024 Online
meta/llama-3.2-1b-instruct 131K 60K text Up to 40 RPM Sep 25, 2024 Online
nvidia/embed-qa-4 131K 8K embedding Up to 40 RPM Online
nvidia/llama-3.2-nemoretriever-1b-vlm-embed-v1 131K 8K embeddingrerank Up to 40 RPM Online
nvidia/llama-3.2-nv-embedqa-1b-v1 131K 8K embedding Up to 40 RPM Online
nvidia/nv-embed-v1 131K 8K embedding Up to 40 RPM Online
nvidia/nv-embedcode-7b-v1 131K 8K embedding Up to 40 RPM Online
nvidia/nv-embedqa-e5-v5 131K 8K embedding Up to 40 RPM Online
nvidia/nv-embedqa-mistral-7b-v2 131K 8K embedding Up to 40 RPM Online
snowflake/arctic-embed-l 131K 8K embedding Up to 40 RPM Online
mistralai/ministral-14b-instruct-2512 8K 4K Dec 2, 2025 Online
Llama-3.3-70B-Instruct 128K 4K text Dec 6, 2024 Online
meta/llama-guard-4-12b 164K 16K textimage Up to 40 RPM Apr 30, 2025 Online
nvidia/llama-3.1-nemotron-nano-vl-8b-v1 8K 4K Online
nvidia/nvidia-nemotron-nano-9b-v2 8K 4K Online
Llama 3.1 Nemotron Safety Guard 8B v3 128K 4K text Online
Nemotron 3 Content Safety 128K 4K text Online
Nemotron Content Safety Reasoning 4B 128K 4K reasoning Online
meta/llama-3.1-8b-instruct 8K 4K Jul 23, 2024 Online
nvidia/llama-3.1-nemoguard-8b-content-safety 8K 4K Online
nvidia/llama-3.1-nemoguard-8b-topic-control 8K 4K Online

What is NVIDIA NIM?

100+ open models from NVIDIA — no credit card, 40 RPM.

NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.

  • 100+ open models available
  • No daily token cap
  • ~40 RPM free tier
  • No credit card required

API Compatibility: OpenAI SDK-compatible (Chat Completions)

How to Get a NVIDIA NIM API Key

  1. 1
    Sign up at build.nvidia.com Free NVIDIA Developer account. No credit card.
  2. 2
    Go to Settings → API Keys
  3. 3
    Generate an API key
  4. 4
    Browse available models 100+ open models. Nemotron Super 49B recommended.
  5. 5
    Configure OpenAI client Base URL: https://integrate.api.nvidia.com/v1

NVIDIA NIM Free Tier Limits & Pricing

Credit Card Not required
Free Tier Permanently free
Context Range 8K – 1.0M
Total Models 61 free
Rate Limits Up to 40 RPM
API Compatibility OpenAI SDK-compatible (Chat Completions)

NVIDIA NIM API Setup Tutorial & Tools

NVIDIA NIM is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →

Use Cases

What NVIDIA NIM's free models are best for, based on aggregated model capabilities:

Chat 61 models Embedding 10 models Reasoning 4 models Vision 1 model Coding 1 model

Limitations & Caveats

  • ~40 RPM shared across all models, not per-model
  • Some models require additional registration per model family
  • Unavailable models listed in catalog but uncallable with standard key

Frequently Asked Questions

Why can't I call certain models on NVIDIA NIM even though they're listed?

NVIDIA NIM's catalog includes all models, but some require additional per-model-family registration. If you get a 403 error, go to the model's page and click "Try API" to register for that specific model family.

Is the 40 RPM limit shared across all models?

Yes — NVIDIA NIM applies a global ~40 RPM limit to your API key, shared across all model calls. If you're using multiple models in parallel, the combined rate cannot exceed ~40 RPM.

Does NVIDIA NIM require phone verification?

Yes, NVIDIA Developer account signup requires phone number verification. This is a one-time step during account creation.

See our FAQ for common questions about free LLM APIs