How to Get a Free NVIDIA NIM API Key (2026)

61 free models available — no credit card required. Get your NVIDIA NIM API key → Test free models →

NVIDIA NIM FreeLLM Score

✅ 79/100 Solid Choice — Strong in great tool compatibility How we score →

🎁 Generosity 65 🌍 Access 65 📚 Breadth 95 ⚡ Reliability 60 🔌 Compat 100 🧠 Quality 90

All Free NVIDIA NIM Models — Context Windows & Rate Limits

Model	Context	Max Output	Modality	Rate Limit	Released	Status
minimaxai/minimax-m3	1.0M	512K	textimagevideoreasoning	Up to 40 RPM	Jun 1, 2026	Online
deepseek-ai/deepseek-v4-pro	1.0M	384K	textreasoning	Up to 40 RPM	Apr 24, 2026	Online
deepseek-ai/deepseek-v4-flash	1.0M	66K	textreasoning	Up to 40 RPM	Apr 24, 2026	Online
moonshotai/kimi-k2.6	262K	262K	textimagevideoreasoning	Up to 40 RPM	Apr 20, 2026	Online
z-ai/glm-5.1	203K	8K	text	Up to 40 RPM	Apr 7, 2026	Online
minimaxai/minimax-m2.7	205K	197K	textreasoning	Up to 40 RPM	Mar 18, 2026	Online
stepfun-ai/step-3.7-flash	256K	256K	textimagereasoning	Up to 40 RPM	May 29, 2026	Online
qwen/qwen3.5-397b-a17b	256K	8K	textimagevideoaudioreasoning	Up to 40 RPM	Feb 16, 2026	Online
qwen/qwen3.5-122b-a10b	262K	262K	textimagevideoaudioreasoning	Up to 40 RPM	Feb 24, 2026	Online
stepfun-ai/step-3.5-flash	262K	66K	textreasoning	Up to 40 RPM	Feb 2, 2026	Online
Nemotron 3 Nano Omni 30B A3B Reasoning	256K	66K	visionaudioreasoning		Apr 28, 2026	Online
mistralai/mistral-large-3-675b-instruct-2512	8K	4K			—	Online
nvidia/nemotron-3.5-content-safety	128K	8K	textimagereasoning	Up to 40 RPM	Jun 4, 2026	Online
nvidia/nemoretriever-parse	131K	8K	rerank	Up to 40 RPM	—	Online
abacusai/dracarys-llama-3.1-70b-instruct	8K	4K			—	Online
nvidia/llama-3.3-nemotron-super-49b-v1.5	131K	16K	textreasoning	Up to 40 RPM	Jul 25, 2025	Online
mistralai/mistral-medium-3.5-128b	8K	4K			—	Online
nvidia/llama-nemotron-embed-1b-v2	131K	8K	embeddingtextimage	Up to 40 RPM	Feb 10, 2026	Online
nvidia/llama-nemotron-embed-vl-1b-v2	131K	8K	embeddingtextimage	Up to 40 RPM	Feb 10, 2026	Online
Llama 3.3 Nemotron Super 49B v1	131K	131K	reasoning		—	Online
meta/llama-3.1-70b-instruct	131K	16K	text	Up to 40 RPM	Jul 23, 2024	Online
Nemotron Mini 4B Instruct	128K	8K	text		—	Online
bytedance/seed-oss-36b-instruct	8K	4K			—	Online
google/diffusiongemma-26b-a4b-it	8K	4K			—	Online
google/gemma-2-2b-it	8K	4K			—	Online
google/gemma-3n-e2b-it	8K	4K			—	Online
meta/llama-3.2-90b-vision-instruct	8K	4K			—	Online
meta/llama-4-maverick-17b-128e-instruct	8K	4K			—	Online
mistralai/mistral-nemotron	8K	4K			—	Online
mistralai/mistral-small-4-119b-2603	8K	4K			—	Online
mistralai/mixtral-8x7b-instruct-v0.1	8K	4K			—	Online
nvidia/gliner-pii	8K	4K			—	Online
nvidia/ising-calibration-1-35b-a3b	8K	4K			—	Online
nvidia/riva-translate-4b-instruct-v1.1	8K	4K			—	Online
sarvamai/sarvam-m	8K	4K			—	Online
stockmark/stockmark-2-100b-instruct	8K	4K			—	Online
Nemotron Nano 12B v2 VL	128K	128K	visionreasoning		Oct 28, 2025	Online
meta/llama-3.2-11b-vision-instruct	131K	16K	textimage	Up to 40 RPM	Sep 25, 2024	Online
microsoft/phi-4-mini-instruct	8K	4K			—	Online
upstage/solar-10.7b-instruct	8K	4K			—	Online
meta/llama-3.2-3b-instruct	131K	8K	text	Up to 40 RPM	Sep 25, 2024	Online
meta/llama-3.2-1b-instruct	131K	60K	text	Up to 40 RPM	Sep 25, 2024	Online
nvidia/embed-qa-4	131K	8K	embedding	Up to 40 RPM	—	Online
nvidia/llama-3.2-nemoretriever-1b-vlm-embed-v1	131K	8K	embeddingrerank	Up to 40 RPM	—	Online
nvidia/llama-3.2-nv-embedqa-1b-v1	131K	8K	embedding	Up to 40 RPM	—	Online
nvidia/nv-embed-v1	131K	8K	embedding	Up to 40 RPM	—	Online
nvidia/nv-embedcode-7b-v1	131K	8K	embedding	Up to 40 RPM	—	Online
nvidia/nv-embedqa-e5-v5	131K	8K	embedding	Up to 40 RPM	—	Online
nvidia/nv-embedqa-mistral-7b-v2	131K	8K	embedding	Up to 40 RPM	—	Online
snowflake/arctic-embed-l	131K	8K	embedding	Up to 40 RPM	—	Online
mistralai/ministral-14b-instruct-2512	8K	4K			Dec 2, 2025	Online
Llama-3.3-70B-Instruct	128K	4K	text		Dec 6, 2024	Online
meta/llama-guard-4-12b	164K	16K	textimage	Up to 40 RPM	Apr 30, 2025	Online
nvidia/llama-3.1-nemotron-nano-vl-8b-v1	8K	4K			—	Online
nvidia/nvidia-nemotron-nano-9b-v2	8K	4K			—	Online
Llama 3.1 Nemotron Safety Guard 8B v3	128K	4K	text		—	Online
Nemotron 3 Content Safety	128K	4K	text		—	Online
Nemotron Content Safety Reasoning 4B	128K	4K	reasoning		—	Online
meta/llama-3.1-8b-instruct	8K	4K			Jul 23, 2024	Online
nvidia/llama-3.1-nemoguard-8b-content-safety	8K	4K			—	Online
nvidia/llama-3.1-nemoguard-8b-topic-control	8K	4K			—	Online

What is NVIDIA NIM?

100+ open models from NVIDIA — no credit card, 40 RPM.

NVIDIA NIM (NVIDIA Inference Microservices) provides API access to 100+ open-weight models hosted on NVIDIA infrastructure. The free tier is available to all NVIDIA Developer Program members (free sign-up) with a limit of ~40 requests/minute. Models include Llama, Mistral, DeepSeek-R1, Nemotron, and domain-specific variants. All endpoints are OpenAI-compatible.

100+ open models available
No daily token cap
~40 RPM free tier
No credit card required

API Compatibility: OpenAI SDK-compatible (Chat Completions)

How to Get a NVIDIA NIM API Key

1
Sign up at build.nvidia.com Free NVIDIA Developer account. No credit card.
Go to get a NVIDIA NIM free API key →
2
Go to Settings → API Keys
3
Generate an API key
4
Browse available models 100+ open models. Nemotron Super 49B recommended.
5
Configure OpenAI client Base URL: https://integrate.api.nvidia.com/v1

NVIDIA NIM Free Tier Limits & Pricing

Credit Card Not required

Free Tier Permanently free

Context Range 8K – 1.0M

Total Models 61 free

Rate Limits Up to 40 RPM

API Compatibility OpenAI SDK-compatible (Chat Completions)

NVIDIA NIM API Setup Tutorial & Tools

NVIDIA NIM is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →

Use Cases

What NVIDIA NIM's free models are best for, based on aggregated model capabilities:

Chat 61 models Embedding 10 models Reasoning 4 models Vision 1 model Coding 1 model

Limitations & Caveats

~40 RPM shared across all models, not per-model
Some models require additional registration per model family
Unavailable models listed in catalog but uncallable with standard key

Frequently Asked Questions

Why can't I call certain models on NVIDIA NIM even though they're listed?

NVIDIA NIM's catalog includes all models, but some require additional per-model-family registration. If you get a 403 error, go to the model's page and click "Try API" to register for that specific model family.

Is the 40 RPM limit shared across all models?

Yes — NVIDIA NIM applies a global ~40 RPM limit to your API key, shared across all model calls. If you're using multiple models in parallel, the combined rate cannot exceed ~40 RPM.

Does NVIDIA NIM require phone verification?

Yes, NVIDIA Developer account signup requires phone number verification. This is a one-time step during account creation.

How to Get a Free NVIDIA NIM API Key (2026)

NVIDIA NIM FreeLLM Score

All Free NVIDIA NIM Models — Context Windows & Rate Limits

What is NVIDIA NIM?

How to Get a NVIDIA NIM API Key

NVIDIA NIM Free Tier Limits & Pricing

NVIDIA NIM API Setup Tutorial & Tools

Use Cases

Limitations & Caveats

Frequently Asked Questions

Why can't I call certain models on NVIDIA NIM even though they're listed?

Is the 40 RPM limit shared across all models?

Does NVIDIA NIM require phone verification?

Other Free LLM API Providers

OpenRouter

Chutes.ai

Glhf.chat

Grok (xAI)

Groq

GitHub Models