Best Free LLM APIs for Chat

398 free models available for chat.

For general conversation, look for low latency, strong instruction following, and a helpful personality. Gemini 2.5 Flash offers one of the largest free context windows (1M tokens) and supports multimodal input. Llama 3.3 70B via Groq delivers the highest tokens-per-second throughput. Qwen3.5 models on NVIDIA NIM balance quality and speed.

What to Look for in a Chat Model

Chat models are the most common type of LLM, but their quality for conversational use varies significantly:

Latency / tokens per second: Real-time conversation needs fast responses. Groq's LPU hardware delivers the fastest inference (Llama 3.3 70B exceeds 100 tok/s). NVIDIA NIM and OpenRouter are slower but offer more model variety.
Context window: Long conversations and document Q&A need a large context window. Gemini 2.5 Flash (1M-token context) can hold an entire book in context. Most chat models offer 32K–128K tokens, which easily handles typical back-and-forth conversations.
Instruction following: A good chat model stays on topic, follows system prompts, and avoids hallucinating. Llama 3.3 70B and Qwen3 are known for strong instruction adherence.
Multilingual support: If you chat in languages other than English, check the model's training data. Qwen3 has strong Chinese/English bilingual performance. Gemini and Llama support 30+ languages.
Multimodal input: To share images or audio in chat, pick a multimodal model. Gemini 2.5 Flash accepts text, image, audio, and video. Most chat models are text-only.
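
To make the context-window criterion concrete, here is a minimal Python sketch that estimates whether a conversation still fits in a model's window. It is not a real tokenizer: the 4-characters-per-token ratio is only a rough rule of thumb for English text, and the function names and the `reserve_for_reply` default are assumptions made for illustration.

```python
import math
from typing import Sequence


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return math.ceil(len(text) / chars_per_token)


def fits_context(messages: Sequence[str], context_window: int,
                 reserve_for_reply: int = 1024) -> bool:
    """Return True if the messages plus a reply budget fit in the window."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reserve_for_reply <= context_window


# A short question plus a document of roughly 100K estimated tokens.
history = ["Summarize this report.", "x" * 400_000]
print(fits_context(history, context_window=32_000))     # False
print(fits_context(history, context_window=1_000_000))  # True
```

For production use, a provider's own token counter would give more reliable numbers than this heuristic.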

How to Choose a Free Chat Model

Match the model to your chat use case:

Casual conversation / chatbot? → Prioritize latency and personality. Llama 3.3 70B via Groq is the fastest option; Gemini 2.5 Flash via Google AI Studio is the most capable.
Long-form Q&A / document chat? → Maximize the context window. Use Gemini 2.5 Flash (1M tokens) or Qwen3.5 122B (262K tokens via NVIDIA NIM).
Multilingual chat? → Qwen3.5 excels in Chinese-English. Gemini supports 30+ languages. Llama covers major European and Asian languages.
Roleplay / creative conversation? → Look for models with strong creative writing. Llama 3.3 70B and Mistral models tend to have more varied output styles.
Customer support bot? → Instruction following and safety are critical. Gemini and Qwen3 are well-aligned. Avoid unmoderated open models unless you add guardrails.
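
The use-case mapping above can be sketched as a small lookup table. The model names come from the list itself; the dictionary keys and the `recommend` helper are hypothetical labels chosen for this example, not part of any provider's API.

```python
from typing import Dict, List

# Use case -> models suggested in the list above (keys are illustrative labels).
RECOMMENDATIONS: Dict[str, List[str]] = {
    "casual": ["Llama 3.3 70B via Groq", "Gemini 2.5 Flash via Google AI Studio"],
    "document_qa": ["Gemini 2.5 Flash", "Qwen3.5 122B via NVIDIA NIM"],
    "multilingual": ["Qwen3.5", "Gemini", "Llama"],
    "roleplay": ["Llama 3.3 70B", "Mistral models"],
    "customer_support": ["Gemini", "Qwen3"],
}


def recommend(use_case: str) -> List[str]:
    """Return the suggested models for a use case, or raise ValueError."""
    key = use_case.strip().lower().replace(" ", "_")
    if key not in RECOMMENDATIONS:
        raise ValueError(f"unknown use case: {use_case!r}")
    return RECOMMENDATIONS[key]


print(recommend("Document QA"))
```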

Top Picks for Chat

Google: Gemini 2.5 Flash (via Google)

1M context, multimodal, free tier with 10 RPM / 250 RPD. Best all-round chat model.

Meta: Llama 3.3 70B Instruct (via Groq)

Fastest inference via Groq LPU, strong instruction following, no credit card.

Qwen: Qwen3.5 122B A10B (via NVIDIA NIM)

262K context, strong bilingual (Chinese-English), 40 RPM with no daily cap.

NVIDIA: Nemotron 3 Super (free) (via OpenRouter)

262K context, strong reasoning, solid chat performance.
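
Rate limits such as "10 RPM / 250 RPD" or "40 RPM with no daily cap" translate into a minimum spacing between requests. The sketch below is a simple arithmetic helper, not a provider client: it assumes requests are spread evenly, treats a day as 86,400 seconds, and uses a hypothetical function name.

```python
from typing import Optional


def min_interval_seconds(rpm: Optional[int] = None,
                         rpd: Optional[int] = None) -> float:
    """Smallest even spacing (in seconds) that respects both limits."""
    intervals = [0.0]
    if rpm:
        intervals.append(60.0 / rpm)       # per-minute cap
    if rpd:
        intervals.append(86_400.0 / rpd)   # per-day cap, spread over 24 hours
    return max(intervals)


print(min_interval_seconds(rpm=10))           # 6.0
print(min_interval_seconds(rpm=10, rpd=250))  # 345.6
print(min_interval_seconds(rpm=40))           # 1.5
```

The daily figure only matters for a bot that runs around the clock; for short bursts, the per-minute spacing is the practical constraint.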

All Free Chat Models

Provider	Model	Context (tokens)	Max Output (tokens)	Modality	Rate Limit	Released	Details
OpenRouter	Cohere: North Mini Code (free)	256K	64K	text, code	200 req/day (free tier)	Jun 9, 2026	Details
OpenRouter	Nex AGI: Nex-N2-Pro	262K	262K	text, image	200 req/day (free tier)	Jun 2, 2026	Details
OpenRouter	NVIDIA: Nemotron 3.5 Content Safety (free)	128K	8K	text, image, reasoning	200 req/day (free tier)	Jun 4, 2026	Details
OpenRouter	NVIDIA: Nemotron 3 Ultra (free)	1.0M	66K	text, reasoning	200 req/day (free tier)	Jun 4, 2026	Details
OpenRouter	MiniMax: MiniMax M3	1.0M	512K	text, image, video, reasoning	200 req/day (free tier)	Jun 1, 2026	Details
OpenRouter	inclusionAI: Ring-2.6-1T	262K	66K	text	200 req/day (free tier)	May 8, 2026	Details
OpenRouter	NVIDIA: Nemotron 3 Nano Omni (free)	256K	66K	text, image, audio, video, reasoning	200 req/day (free tier)	Apr 28, 2026	Details
OpenRouter	Poolside: Laguna XS.2 (free)	262K	33K	text	200 req/day (free tier)	Apr 28, 2026	Details
OpenRouter	Poolside: Laguna M.1 (free)	262K	33K	text	200 req/day (free tier)	Apr 28, 2026	Details
OpenRouter	DeepSeek: DeepSeek V4 Flash	1.0M	66K	text, reasoning	200 req/day (free tier)	Apr 24, 2026	Details
OpenRouter	MoonshotAI: Kimi K2.6	262K	262K	text, image, video, reasoning	200 req/day (free tier)	Apr 20, 2026	Details
OpenRouter	Z.ai: GLM 5.1	203K	8K	text	200 req/day (free tier)	Apr 7, 2026	Details
OpenRouter	Google: Gemma 4 26B A4B (free)	262K	33K	text, image, reasoning	200 req/day (free tier)	Apr 2, 2026	Details
OpenRouter	Google: Gemma 4 31B (free)	262K	8K	text, image, reasoning	200 req/day (free tier)	Apr 2, 2026	Details
OpenRouter	Arcee AI: Trinity Large Thinking	262K	80K	text, reasoning	200 req/day (free tier)	Apr 1, 2026	Details
OpenRouter	Google: Lyria 3 Pro Preview	1.0M	66K	text, image	200 req/day (free tier)	Mar 30, 2026	Details
OpenRouter	Google: Lyria 3 Clip Preview	1.0M	66K	text, image	200 req/day (free tier)	Mar 30, 2026	Details
OpenRouter	NVIDIA: Nemotron 3 Super (free)	1.0M	262K	text, reasoning	200 req/day (free tier)	Mar 11, 2026	Details
OpenRouter	MiniMax: MiniMax M2.5	205K	197K	text, reasoning	200 req/day (free tier)	Feb 12, 2026	Details
OpenRouter	Free Models Router	200K	8K	text, image	200 req/day (free tier)	Feb 1, 2026	Details
OpenRouter	LiquidAI: LFM2.5-1.2B-Thinking (free)	33K	8K	text, reasoning	200 req/day (free tier)	Jan 20, 2026	Details
OpenRouter	LiquidAI: LFM2.5-1.2B-Instruct (free)	33K	8K	text	200 req/day (free tier)	Jan 5, 2026	Details
OpenRouter	NVIDIA: Nemotron 3 Nano 30B A3B (free)	256K	8K	text, reasoning	200 req/day (free tier)	Dec 14, 2025	Details
OpenRouter	OpenAI: gpt-oss-safeguard-20b	131K	66K	text	200 req/day (free tier)	Oct 29, 2025	Details
OpenRouter	NVIDIA: Nemotron Nano 12B 2 VL (free)	128K	128K	text, image, video, reasoning	200 req/day (free tier)	Oct 28, 2025	Details
OpenRouter	Qwen: Qwen3 Next 80B A3B Instruct (free)	262K	8K	text	200 req/day (free tier)	Sep 11, 2025	Details
OpenRouter	NVIDIA: Nemotron Nano 9B V2 (free)	128K	8K	text, reasoning	200 req/day (free tier)	Sep 5, 2025	Details
OpenRouter	OpenAI: gpt-oss-120b (free)	131K	131K	text, reasoning	200 req/day (free tier)	Aug 5, 2025	Details
OpenRouter	OpenAI: gpt-oss-20b (free)	131K	33K	text	200 req/day (free tier)	Aug 5, 2025	Details
OpenRouter	Z.ai: GLM 4.5 Air	131K	98K	text, reasoning	200 req/day (free tier)	Jul 28, 2025	Details
OpenRouter	Qwen: Qwen3 Coder 480B A35B (free)	1.0M	262K	text, code	200 req/day (free tier)	Jul 23, 2025	Details
OpenRouter	Venice: Uncensored (free)	33K	8K	text	200 req/day (free tier)	Jul 9, 2025	Details
OpenRouter	Meta: Llama 3.3 70B Instruct (free)	131K	8K	text	200 req/day (free tier)	Dec 6, 2024	Details
OpenRouter	Meta: Llama 3.2 3B Instruct (free)	131K	8K	text	200 req/day (free tier)	Sep 25, 2024	Details
OpenRouter	Nous: Hermes 3 405B Instruct (free)	131K	8K	text	200 req/day (free tier)	Aug 16, 2024	Details
Chutes.ai	Llama 3.1 70B	131K	8K	text	Community-powered, no hard cap	Jul 23, 2024	Details
Glhf.chat	Llama 3.1 70B	131K	8K	text	Unlimited for free models	Jul 23, 2024	Details
Glhf.chat	Mixtral 8x7B	33K	0	text	Unlimited for free models	—	Details
Grok (xAI)	Grok-2	131K	0	text	$25/month free credits, resets monthly	Dec 12, 2024	Details
Grok (xAI)	Grok-2 Mini	131K	0	text	$25/month free credits, resets monthly	—	Details
Groq	Moonshot Kimi K2	131K	0	text	See provider page	Sep 5, 2025	Details
Groq	Moonshot Kimi K2 0905	131K	0	text	See provider page	Sep 5, 2025	Details
Groq	GPT-OSS 120B	131K	0	text	See provider page	Aug 5, 2025	Details
Groq	GPT-OSS 20B	131K	0	text	See provider page	Aug 5, 2025	Details
GitHub Models	Mistral Large (24.11)	131K	16K	text, image	See provider page	Feb 26, 2024	Details
GitHub Models	AI21 Jamba 1.5 Large	256K	0	text	See provider page	—	Details
Cerebras	Llama 3.1 70B	131K	8K	text	See provider page	Jul 23, 2024	Details
Mistral AI	Mistral 7B	33K	0	text	See provider page	—	Details
Mistral AI	Mixtral 8x7B	33K	0	text	See provider page	—	Details
Cloudflare Workers AI	Mistral 7B	33K	0	text	See provider page	—	Details
Cloudflare Workers AI	Qwen 1.5 7B	33K	0	text	See provider page	—	Details
Agnes AI	agnes-1.5-flash	256K	64K	text, vision	30 RPM	—	Details
Agnes AI	agnes-2.0-flash	256K	64K	text, vision	30 RPM	—	Details
Aion Labs	Aion 2.5	128K	32K	text	15 RPM, 20K TPD	—	Details
Aion Labs	Aion 2.0	128K	32K	text	15 RPM, 20K TPD	Feb 23, 2026	Details
Aion Labs	Aion-RP 1.0 (8B)	32K	8K	text	15 RPM, 20K TPD	—	Details
Cohere	Command A+ (218B)	128K	4K	text	20 RPM	—	Details
Cohere	Command A (111B)	256K	4K	text	20 RPM	—	Details
Cohere	Command R+	128K	4K	text	20 RPM	—	Details
Cohere	Command R7B	128K	4K	text	20 RPM	—	Details
Google Gemini	Gemini 3.5 Flash	1.0M	64K	text, image, video, audio, pdf, reasoning	15 RPM, 1,500 RPD	May 19, 2026	Details
Google Gemini	Gemini 3.1 Flash-Lite	1.0M	65K	text, image, video, audio, pdf, reasoning	30 RPM, 1,500 RPD	Mar 3, 2026	Details
Google Gemini	Gemini 2.5 Flash	1.0M	65K	text, image, audio, video, pdf, reasoning	15 RPM, 1,500 RPD	May 20, 2025	Details
Google Gemini	Gemini 2.5 Pro	2.0M	65K	text, image, audio, video, pdf, reasoning	5 RPM, 50 RPD	Jun 5, 2025	Details
Mistral AI	Mistral Medium 3.5 (128B)	256K	256K	text	~1 RPS, 500K TPM	—	Details
Mistral AI	Mistral Small 4	256K	256K	text	~1 RPS, 500K TPM	Mar 16, 2026	Details
Mistral AI	Mistral Large 3	256K	256K	text	~1 RPS, 500K TPM	Dec 2, 2025	Details
Mistral AI	Mistral Nemo (12B)	128K	128K	text	~1 RPS, 500K TPM	Jul 1, 2024	Details
Mistral AI	Codestral	256K	256K	text, code	~1 RPS, 500K TPM	—	Details
Mistral AI	Pixtral Large	128K	128K	text, image	~1 RPS, 500K TPM	Nov 18, 2024	Details
Z AI (Zhipu AI)	GLM-4.7-Flash	200K	128K	text, reasoning	1 concurrent request	Jan 19, 2026	Details
Z AI (Zhipu AI)	GLM-4.6V-Flash	128K	4K	text, image, video, reasoning	1 concurrent request	Dec 8, 2025	Details
Cerebras	gpt-oss-120b	128K	8K	text, reasoning	30 RPM, 14,400 RPD, 1M TPD	Aug 5, 2025	Details
Cerebras	zai-glm-4.7	128K	8K	text, reasoning	10 RPM, 100 RPD, 1M TPD	Dec 22, 2025	Details
Cloudflare Workers AI	@cf/meta/llama-3.3-70b-instruct-fp8-fast	131K	131K	text	10K neurons/day (shared)	Dec 6, 2024	Details
Cloudflare Workers AI	@cf/meta/llama-4-scout-17b-16e-instruct	10.0M	131K	text	10K neurons/day (shared)	—	Details
Cloudflare Workers AI	@cf/openai/gpt-oss-120b	128K	131K	text	10K neurons/day (shared)	—	Details
Cloudflare Workers AI	@cf/moonshotai/kimi-k2.7-code	262K	131K	text, code, image, video, reasoning	10K neurons/day (shared)	Jun 12, 2026	Details
Cloudflare Workers AI	@cf/google/gemma-4-26b-a4b-it	256K	131K	text, image, reasoning	10K neurons/day (shared)	Apr 2, 2026	Details
Cloudflare Workers AI	@cf/zhipuai/glm-4.7-flash	131K	131K	text, reasoning	10K neurons/day (shared)	Jan 19, 2026	Details
Cloudflare Workers AI	@cf/mistralai/mistral-small-3.1-24b-instruct	128K	131K	text	10K neurons/day (shared)	Mar 17, 2025	Details
Cloudflare Workers AI	@cf/deepseek-ai/deepseek-r1-distill-qwen-32b	32K	131K	text, reasoning	10K neurons/day (shared)	Jan 20, 2025	Details
GitHub Models	gpt-5	200K	32K	text, image, reasoning	10 RPM, 50 RPD	Aug 7, 2025	Details
GitHub Models	gpt-4.1	1.0M	32K	text, image, pdf	10 RPM, 50 RPD	Apr 14, 2025	Details
GitHub Models	gpt-4.1-mini	1.0M	32K	text, image, pdf	15 RPM, 150 RPD	Apr 14, 2025	Details
GitHub Models	gpt-4o	128K	16K	text, image, pdf	10 RPM, 50 RPD	May 13, 2024	Details
GitHub Models	o4-mini	200K	100K	text, image, reasoning	10 RPM, 50 RPD	Apr 16, 2025	Details
GitHub Models	Llama-4-Scout-17B-16E	512K	4K	text, image	15 RPM, 150 RPD	Apr 5, 2025	Details
GitHub Models	Llama-4-Maverick-17B-128E	256K	4K	text, image	10 RPM, 50 RPD	Apr 5, 2025	Details
GitHub Models	Meta-Llama-3.3-70B	131K	4K	text	15 RPM, 150 RPD	Dec 6, 2024	Details
GitHub Models	DeepSeek-R1	64K	8K	text, reasoning	15 RPM, 150 RPD	May 28, 2025	Details
GitHub Models	Mistral-Small-3.1	128K	4K	text	15 RPM, 150 RPD	Mar 17, 2025	Details
Groq	llama-3.3-70b-versatile	131K	32K	text	30 RPM, 1,000 RPD	Dec 6, 2024	Details
Groq	llama-3.1-8b-instant	131K	131K	text	30 RPM, 1,000 RPD	Jul 23, 2024	Details
Groq	llama-4-scout-17b-16e-instruct	131K	8K	text, image	30 RPM, 1,000 RPD	Apr 5, 2025	Details
Groq	qwen3-32b	131K	131K	text, reasoning	30 RPM, 1,000 RPD	Apr 28, 2025	Details
Hugging Face	Meta-Llama-3.1-8B-Instruct	128K	4K	text	Credit-metered	Jul 23, 2024	Details
Hugging Face	Mistral-7B-Instruct-v0.3	32K	4K	text	Credit-metered	—	Details
Hugging Face	Mixtral-8x7B-Instruct-v0.1	32K	4K	text	Credit-metered	—	Details
Hugging Face	Phi-3.5-mini-instruct	128K	4K	text	Credit-metered	—	Details
Hugging Face	Qwen2.5-7B-Instruct	131K	4K	text	Credit-metered	Oct 16, 2024	Details
Kilo Code	x-ai/grok-code-fast-1:free	256K	131K	text, code	~200 req/hr	Aug 28, 2025	Details
Kilo Code	minimax/minimax-m2.5:free	196K	8K	text, reasoning	~200 req/hr	Feb 12, 2026	Details
Kilo Code	bytedance-seed/dola-seed-2.0-pro:free	131K	131K	text	~200 req/hr	—	Details
Kilo Code	nvidia/nemotron-3-super-120b-a12b:free	262K	32K	text, reasoning	~200 req/hr	Mar 11, 2026	Details
Kilo Code	arcee-ai/trinity-large-thinking:free	131K	131K	text, reasoning	~200 req/hr	Apr 1, 2026	Details
LLM7.io	deepseek-r1-0528	131K	131K	text, reasoning	30 RPM (120 with token)	May 28, 2025	Details
LLM7.io	deepseek-v3-0324	131K	131K	text	30 RPM (120 with token)	Mar 25, 2025	Details
LLM7.io	gemini-2.5-flash-lite	131K	131K	text, image, audio, video, pdf, reasoning	30 RPM (120 with token)	Jun 17, 2025	Details
LLM7.io	gpt-4o-mini	131K	131K	text, image, pdf	30 RPM (120 with token)	Jul 18, 2024	Details
LLM7.io	mistral-small-3.1-24b	32K	131K	text	30 RPM (120 with token)	Mar 17, 2025	Details
LLM7.io	qwen2.5-coder-32b	131K	131K	text, code	30 RPM (120 with token)	Nov 11, 2024	Details
ModelScope	Qwen/Qwen3.5-35B-A3B	131K	131K	text, image, video, audio, reasoning	2,000 RPD total; <=500 RPD/model (dynamic)	Feb 24, 2026	Details
ModelScope	Qwen/Qwen3.5-27B	131K	131K	text, image, video, audio, reasoning	2,000 RPD total; <=500 RPD/model (dynamic)	Feb 24, 2026	Details
Ollama Cloud	gpt-oss:120b-cloud	128K	131K	text, reasoning	Session/weekly limits (unpublished)	Aug 5, 2025	Details
Ollama Cloud	deepseek-v3.1:671b-cloud	128K	131K	text	Session/weekly limits (unpublished)	—	Details
Ollama Cloud	qwen3-coder:480b-cloud	128K	131K	text, code	Session/weekly limits (unpublished)	—	Details
Ollama Cloud	kimi-k2:1t-cloud	262K	131K	text	Session/weekly limits (unpublished)	—	Details
Ollama Cloud	glm-4.6:cloud	128K	131K	text, reasoning	Session/weekly limits (unpublished)	Sep 30, 2025	Details
Ollama Cloud	deepseek-r1:cloud	128K	131K	text, reasoning	Session/weekly limits (unpublished)	Jan 20, 2025	Details
OVHcloud AI Endpoints	Qwen3.5-397B-A17B	131K	32K	text, image, video, audio, reasoning	2 RPM (anonymous)	Feb 16, 2026	Details
OVHcloud AI Endpoints	gpt-oss-20b	128K	8K	text	2 RPM (anonymous)	Aug 5, 2025	Details
OVHcloud AI Endpoints	Meta-Llama-3_3-70B-Instruct	131K	4K	text	2 RPM (anonymous)	Dec 6, 2024	Details
OVHcloud AI Endpoints	Llama-3.1-8B-Instruct	131K	4K	text	2 RPM (anonymous)	Jul 23, 2024	Details
OVHcloud AI Endpoints	Qwen3.6-27B	131K	32K	text, image, video, audio, reasoning	2 RPM (anonymous)	Apr 22, 2026	Details
OVHcloud AI Endpoints	Qwen3.5-9B	131K	8K	text, reasoning	2 RPM (anonymous)	Mar 2, 2026	Details
OVHcloud AI Endpoints	Qwen3-Coder-30B-A3B-Instruct	262K	32K	text, code	2 RPM (anonymous)	Jul 31, 2025	Details
OVHcloud AI Endpoints	Qwen2.5-VL-72B-Instruct	128K	8K	text, image	2 RPM (anonymous)	Sep 1, 2024	Details
OVHcloud AI Endpoints	Mistral-Small-3.2-24B-Instruct	128K	4K	text	2 RPM (anonymous)	Jun 20, 2025	Details
OVHcloud AI Endpoints	Mistral-Nemo-Instruct-2407	128K	4K	text	2 RPM (anonymous)	Jul 1, 2024	Details
SambaNova	DeepSeek-V3.1	128K	8K	text	20 RPM, 20 RPD, 200K TPD	Aug 21, 2025	Details
SambaNova	DeepSeek-V3.2 (Preview)	128K	8K	text	20 RPM, 20 RPD, 200K TPD	—	Details
SambaNova	MiniMax-M2.7	128K	8K	text, reasoning	20 RPM, 20 RPD, 200K TPD	Mar 18, 2026	Details
SambaNova	gemma-4-31B-it (Preview)	128K	8K	text, image, reasoning	20 RPM, 20 RPD, 200K TPD	Apr 2, 2026	Details
SiliconFlow	deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	131K	131K	text, reasoning	30 RPM, 60K TPM	—	Details
SiliconFlow	Abbreviation	131K	8K	text	See provider page	—	Details
NVIDIA NIM	01-ai/yi-large	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	adept/fuyu-8b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	ai21labs/jamba-1.5-large-instruct	131K	8K	text	Up to 40 RPM	Aug 22, 2024	Details
NVIDIA NIM	aisingapore/sea-lion-7b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	baai/bge-m3	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	bigcode/starcoder2-15b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	databricks/dbrx-instruct	131K	8K	text	Up to 40 RPM	Mar 27, 2024	Details
NVIDIA NIM	deepseek-ai/deepseek-coder-6.7b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	deepseek-ai/deepseek-v4-flash	1.0M	66K	text, reasoning	Up to 40 RPM	Apr 24, 2026	Details
NVIDIA NIM	deepseek-ai/deepseek-v4-pro	1.0M	384K	text, reasoning	Up to 40 RPM	Apr 24, 2026	Details
NVIDIA NIM	google/codegemma-1.1-7b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	google/codegemma-7b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	google/deplot	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	google/gemma-2b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	google/recurrentgemma-2b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	ibm/granite-3.0-3b-a800m-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	ibm/granite-3.0-8b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	ibm/granite-34b-code-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	ibm/granite-8b-code-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	meta/codellama-70b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	meta/llama-3.1-70b-instruct	131K	16K	text	Up to 40 RPM	Jul 23, 2024	Details
NVIDIA NIM	meta/llama-3.2-11b-vision-instruct	131K	16K	text, image	Up to 40 RPM	Sep 25, 2024	Details
NVIDIA NIM	meta/llama-3.2-1b-instruct	131K	60K	text	Up to 40 RPM	Sep 25, 2024	Details
NVIDIA NIM	meta/llama-3.2-3b-instruct	131K	8K	text	Up to 40 RPM	Sep 25, 2024	Details
NVIDIA NIM	meta/llama-guard-4-12b	164K	16K	text, image	Up to 40 RPM	Apr 30, 2025	Details
NVIDIA NIM	meta/llama2-70b	131K	8K	text	Up to 40 RPM	Jul 18, 2023	Details
NVIDIA NIM	microsoft/kosmos-2	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	microsoft/phi-3-vision-128k-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	microsoft/phi-3.5-moe-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	microsoft/phi-4-multimodal-instruct	131K	8K	text	Up to 40 RPM	Feb 26, 2025	Details
NVIDIA NIM	minimaxai/minimax-m2.7	205K	197K	text, reasoning	Up to 40 RPM	Mar 18, 2026	Details
NVIDIA NIM	minimaxai/minimax-m3	1.0M	512K	text, image, video, reasoning	Up to 40 RPM	Jun 1, 2026	Details
NVIDIA NIM	mistralai/codestral-22b-instruct-v0.1	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	mistralai/mistral-7b-instruct-v0.3	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	mistralai/mistral-large-2-instruct	131K	8K	text	Up to 40 RPM	Nov 18, 2024	Details
NVIDIA NIM	mistralai/mixtral-8x22b-v0.1	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	moonshotai/kimi-k2.6	262K	262K	text, image, video, reasoning	Up to 40 RPM	Apr 20, 2026	Details
NVIDIA NIM	nv-mistralai/mistral-nemo-12b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/cosmos-reason2-8b	131K	8K	text, reasoning	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/embed-qa-4	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/llama-3.1-nemotron-51b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/llama-3.1-nemotron-70b-instruct	131K	8K	text	Up to 40 RPM	Oct 15, 2024	Details
NVIDIA NIM	nvidia/llama-3.1-nemotron-ultra-253b-v1	131K	8K	text, reasoning	Up to 40 RPM	Apr 7, 2025	Details
NVIDIA NIM	nvidia/llama-3.2-nemoretriever-1b-vlm-embed-v1	131K	8K	embedding, rerank	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/llama-3.2-nv-embedqa-1b-v1	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/llama-3.3-nemotron-super-49b-v1.5	131K	16K	text, reasoning	Up to 40 RPM	Jul 25, 2025	Details
NVIDIA NIM	nvidia/llama-nemotron-embed-1b-v2	131K	8K	embedding, text, image	Up to 40 RPM	Feb 10, 2026	Details
NVIDIA NIM	nvidia/llama-nemotron-embed-vl-1b-v2	131K	8K	embedding, text, image	Up to 40 RPM	Feb 10, 2026	Details
NVIDIA NIM	nvidia/llama3-chatqa-1.5-70b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/mistral-nemo-minitron-8b-8k-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nemoretriever-parse	131K	8K	rerank	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nemotron-3.5-content-safety	128K	8K	text, image, reasoning	Up to 40 RPM	Jun 4, 2026	Details
NVIDIA NIM	nvidia/nemotron-4-340b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nemotron-4-340b-reward	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nemotron-nano-3-30b-a3b	131K	8K	text, reasoning	Up to 40 RPM	Dec 15, 2025	Details
NVIDIA NIM	nvidia/nemotron-parse	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/neva-22b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nv-embed-v1	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nv-embedcode-7b-v1	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nv-embedqa-e5-v5	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nv-embedqa-mistral-7b-v2	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/nvclip	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/riva-translate-4b-instruct	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	nvidia/vila	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	qwen/qwen3.5-122b-a10b	262K	262K	text, image, video, audio, reasoning	Up to 40 RPM	Feb 24, 2026	Details
NVIDIA NIM	qwen/qwen3.5-397b-a17b	256K	8K	text, image, video, audio, reasoning	Up to 40 RPM	Feb 16, 2026	Details
NVIDIA NIM	snowflake/arctic-embed-l	131K	8K	embedding	Up to 40 RPM	—	Details
NVIDIA NIM	stepfun-ai/step-3.5-flash	262K	66K	text, reasoning	Up to 40 RPM	Feb 2, 2026	Details
NVIDIA NIM	stepfun-ai/step-3.7-flash	256K	256K	text, image, reasoning	Up to 40 RPM	May 29, 2026	Details
NVIDIA NIM	writer/palmyra-creative-122b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	writer/palmyra-fin-70b-32k	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	writer/palmyra-med-70b	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	writer/palmyra-med-70b-32k	131K	8K	text	Up to 40 RPM	—	Details
NVIDIA NIM	z-ai/glm-5.1	203K	8K	text	Up to 40 RPM	Apr 7, 2026	Details
NVIDIA NIM	zyphra/zamba2-7b-instruct	131K	8K	text	Up to 40 RPM	—	Details
AI21 Labs	Jamba Large 1.7	256K	4K	text	200 RPM, 10 RPS	Aug 8, 2025	Details
AI21 Labs	Jamba Mini 2	256K	4K	text	200 RPM, 10 RPS	—	Details
Aion Labs	aion-1.0	131K	32K	text	Daily token allowance	Feb 4, 2025	Details
Aion Labs	aion-1.0-mini	131K	32K	text	Daily token allowance	Feb 4, 2025	Details
Alibaba Cloud Model Studio	Qwen3-Max	128K	32K	text	Tiered by region	Sep 23, 2025	Details
Alibaba Cloud Model Studio	Qwen3-Plus	1.0M	32K	text	Tiered by region	—	Details
Alibaba Cloud Model Studio	Qwen3-VL-Plus	128K	8K	text, image	Tiered by region	—	Details
Alibaba Cloud Model Studio	Qwen3-Coder-Plus	256K	8K	text, code	Tiered by region	Sep 23, 2025	Details
Alibaba Cloud Model Studio	QwQ-Plus	131K	32K	text	Tiered by region	—	Details
Cohere	Embed 4	131K	131K	text, embedding	2,000 inputs/min	—	Details
Cohere	Rerank 3.5	131K	131K	text, rerank	10 RPM	—	Details
DeepSeek	deepseek-chat (V3.2)	128K	8K	text	Dynamic	Dec 1, 2025	Details
DeepSeek	deepseek-reasoner (R1)	128K	8K	text, reasoning	Dynamic	—	Details
Google Gemini	Gemini 3 Flash (Preview)	1.0M	65K	text	Preview limits	—	Details
Mistral AI	Mistral Medium 3	128K	128K	text	~1 RPS, 500K TPM	May 7, 2025	Details
xAI	grok-4.3	1.0M	32K	text	Credit-based	Apr 30, 2026	Details
xAI	grok-4.1-fast	2.0M	32K	text	Credit-based	Nov 19, 2025	Details
xAI	grok-3-mini	131K	8K	text	Credit-based	—	Details
Z AI (Zhipu AI)	GLM-4.5-Flash	128K	8K	text	1 concurrent request	—	Details
Cerebras	llama-3.3-70b	128K	8K	text	30 RPM, 14,400 RPD, 1M TPD	Dec 6, 2024	Details
Cerebras	qwen-3-235b-a22b-instruct-2507	131K	8K	text	30 RPM, 14,400 RPD, 1M TPD	Apr 28, 2025	Details
Cerebras	qwen-3-32b	131K	8K	text	30 RPM, 14,400 RPD, 1M TPD	Apr 28, 2025	Details
Cloudflare Workers AI	@cf/meta/llama-3.1-8b-instruct-fp8-fast	131K	131K	text	10K neurons/day (shared)	Jul 23, 2024	Details
Cloudflare Workers AI	@cf/meta/llama-3.2-11b-vision-instruct	131K	131K	text, image	10K neurons/day (shared)	Sep 25, 2024	Details
Cloudflare Workers AI	@cf/moonshotai/kimi-k2.5	256K	131K	text	10K neurons/day (shared)	—	Details
Groq	llama-4-maverick-17b-128e-instruct	131K	8K	text	15 RPM, 500 RPD	—	Details
Groq	kimi-k2-instruct	262K	262K	text	30 RPM, 14,400 RPD	Sep 5, 2025	Details
Groq	deepseek-r1-distill-70b	131K	8K	text, reasoning	30 RPM, 14,400 RPD	—	Details
Groq	whisper-large-v3	131K	131K	text	20 RPM, 2,000 RPD	—	Details
Groq	whisper-large-v3-turbo	131K	131K	text	20 RPM, 2,000 RPD	—	Details
ModelScope	Qwen/Qwen-Image	131K	131K	text	2,000 RPD total; model/AIGC-specific caps	—	Details
Nebius	Qwen3-235B-A22B	128K	32K	text	Tier-based	Apr 28, 2025	Details
Nscale	Llama-3.3-70B-Instruct	128K	8K	text	Fair-use	Dec 6, 2024	Details
Nscale	DeepSeek-R1-Distill-Llama-70B	128K	32K	text, reasoning	Fair-use	Jan 20, 2025	Details
OVHcloud AI Endpoints	Qwen3Guard-Gen-8B	32K	4K	text	2 RPM (anonymous)	—	Details
OVHcloud AI Endpoints	Qwen3Guard-Gen-0.6B	32K	4K	text	2 RPM (anonymous)	—	Details
SiliconFlow	deepseek-ai/DeepSeek-OCR	131K	8K	text	30 RPM, 60K TPM	—	Details
OpenRouter	Baidu Qianfan: CoBuddy	131K	65K	text, code	200 req/day (free tier)	—	Details
OpenRouter	NVIDIA: Llama Nemotron Embed VL 1B V2 (free)	131K	8K	text, image, embedding	200 req/day (free tier)	Feb 25, 2026	Details
OpenRouter	NVIDIA: Llama Nemotron Rerank VL 1B V2 (free)	10K	8K	text, image, rerank	200 req/day (free tier)	Jun 9, 2026	Details
OpenCode Zen	big-pickle	N/A	N/A			—	Details
OpenCode Zen	DeepSeek V4 Flash	1.0M	384K	reasoning		—	Details
OpenCode Zen	MiMo-V2.5	1.0M	131K	vision, audio, reasoning		—	Details
OpenCode Zen	Nemotron 3 Ultra 550B A55B	1.0M	128K	reasoning		—	Details
OpenCode Zen	North Mini Code	256K	64K	reasoning		—	Details
ModelScope	deepseek-ai/DeepSeek-V3.2	8K	4K			Dec 1, 2025	Details
LLM7.io	Codestral (latest)	256K	4K	text		—	Details
ModelScope	deepseek-ai/DeepSeek-V4-Flash	8K	4K			Apr 24, 2026	Details
Cloudflare Workers AI	@cf/openai/gpt-oss-120b	8K	4K			—	Details
ModelScope	deepseek-ai/DeepSeek-V4-Pro	8K	4K			Apr 24, 2026	Details
Cloudflare Workers AI	@cf/baai/bge-m3	8K	4K			—	Details
Google Gemini	Gemini 2.5 Flash	1.0M	66K	vision, audio, reasoning		Jun 17, 2025	Details
LLM7.io	devstral-small-2:24b	8K	4K			—	Details
Cloudflare Workers AI	@cf/google/gemma-2b-it-lora	8K	4K			—	Details
ModelScope	LLM-Research/c4ai-command-r-plus-08-2024	8K	4K			—	Details
ModelScope	LLM-Research/Llama-4-Maverick-17B-128E-Instruct	8K	4K			—	Details
Cloudflare Workers AI	@cf/meta/llama-3.2-3b-instruct	8K	4K			Sep 25, 2024	Details
ModelScope	MedAIBase/AntAngelMed	8K	4K			—	Details
Cloudflare Workers AI	@cf/meta/llama-guard-3-8b	8K	4K			—	Details
ModelScope	meituan-longcat/LongCat-Flash-Lite	8K	4K			—	Details
Cloudflare Workers AI	@cf/qwen/qwen3-embedding-0.6b	8K	4K	embedding		—	Details
ModelScope	MiniMax/MiniMax-M1-80k	8K	4K			—	Details
ModelScope	MiniMax-M2.5-highspeed	205K	131K	reasoning		—	Details
Cloudflare Workers AI	@cf/mistral/mistral-7b-instruct-v0.2-lora	8K	4K			—	Details
ModelScope	MiniMax-M3	512K	128K	vision, reasoning		—	Details
ModelScope	mistralai/Ministral-8B-Instruct-2410	8K	4K			—	Details
ModelScope	mistralai/Mistral-Large-Instruct-2407	8K	4K			Nov 19, 2024	Details
ModelScope	mistralai/Mistral-Small-Instruct-2409	8K	4K			—	Details
ModelScope	Kimi K2.5	262K	262K	vision, reasoning		—	Details
ModelScope	MusePublic/Qwen-Image-Edit	8K	4K			—	Details
ModelScope	opencompass/CompassJudger-1-32B-Instruct	8K	4K			—	Details
ModelScope	OpenGVLab/InternVL3_5-241B-A28B	8K	4K			—	Details
ModelScope	PaddlePaddle/ERNIE-4.5-0.3B-PT	8K	4K			—	Details
ModelScope	PaddlePaddle/ERNIE-4.5-21B-A3B-PT	8K	4K			—	Details
ModelScope	PaddlePaddle/ERNIE-4.5-300B-A47B-PT	8K	4K			—	Details
ModelScope	PaddlePaddle/ERNIE-4.5-VL-28B-A3B-PT	8K	4K			—	Details
ModelScope	Qwen/Qwen-Image-Edit	8K	4K			—	Details
ModelScope	Qwen/Qwen3-14B	8K	4K			Apr 28, 2025	Details
ModelScope	Qwen/Qwen3-235B-A22B	8K	4K			Apr 28, 2025	Details
Cloudflare Workers AI	@cf/moonshotai/kimi-k2.7-code	8K	4K			—	Details
ModelScope	Qwen/Qwen3-235B-A22B-Instruct-2507	8K	4K			Jul 21, 2025	Details
Cloudflare Workers AI	@cf/pfnet/plamo-embedding-1b	8K	4K	embedding		—	Details
ModelScope	Qwen/Qwen3-235B-A22B-Thinking-2507	8K	4K	reasoning		Jul 25, 2025	Details
Cloudflare Workers AI	@cf/deepseek-ai/deepseek-r1-distill-qwen-32b	8K	4K	reasoning		Jan 29, 2025	Details
ModelScope	Qwen/Qwen3-30B-A3B	8K	4K			Apr 28, 2025	Details
ModelScope	Qwen/Qwen3-30B-A3B-Thinking-2507	8K	4K	reasoning		Aug 28, 2025	Details
ModelScope	Qwen/Qwen3-32B	8K	4K			Apr 28, 2025	Details
ModelScope	Qwen/Qwen3-4B	8K	4K			—	Details
Cloudflare Workers AI	@cf/meta/llama-3.1-8b-instruct-fp8	8K	4K			Jul 23, 2024	Details
ModelScope	Qwen/Qwen3-8B	8K	4K			Apr 28, 2025	Details
Cloudflare Workers AI	@cf/meta/llama-3.2-1b-instruct	8K	4K			Sep 25, 2024	Details
ModelScope	Qwen/Qwen3-Coder-30B-A3B-Instruct	8K	4K			Jul 31, 2025	Details
Cloudflare Workers AI	@cf/moonshotai/kimi-k2.6	8K	4K			—	Details
ModelScope	Qwen/Qwen3-Next-80B-A3B-Instruct	8K	4K			Sep 11, 2025	Details
Cloudflare Workers AI	@cf/zai-org/glm-4.7-flash	8K	4K			—	Details
ModelScope	Qwen/Qwen3-Next-80B-A3B-Thinking	8K	4K	reasoning		Sep 11, 2025	Details
ModelScope	Qwen/Qwen3-VL-235B-A22B-Instruct	8K	4K			Sep 23, 2025	Details
ModelScope	Qwen/Qwen3-VL-8B-Instruct	8K	4K			Oct 14, 2025	Details
Cloudflare Workers AI	@cf/meta-llama/llama-2-7b-chat-hf-lora	8K	4K			—	Details
Cloudflare Workers AI	@cf/meta/llama-3.3-70b-instruct-fp8-fast	8K	4K			Dec 6, 2024	Details
ModelScope	Qwen/Qwen3-VL-8B-Thinking	8K	4K	reasoning		Oct 14, 2025	Details
Cloudflare Workers AI	@cf/ibm-granite/granite-4.0-h-micro	8K	4K			—	Details
ModelScope	Qwen/Qwen3.5-122B-A10B	8K	4K			Feb 25, 2026	Details
ModelScope	Qwen/Qwen3.5-27B	8K	4K			Feb 25, 2026	Details
Z AI (Zhipu AI)	GLM-4.5-Air	131K	98K	reasoning		Jul 25, 2025	Details
ModelScope	Qwen/Qwen3.5-35B-A3B	8K	4K			Feb 25, 2026	Details
Cloudflare Workers AI	@cf/baai/bge-small-en-v1.5	8K	4K			—	Details
Cloudflare Workers AI	@cf/qwen/qwen2.5-coder-32b-instruct	8K	4K			Nov 11, 2024	Details
ModelScope	Qwen/Qwen3.5-397B-A17B	8K	4K			Feb 16, 2026	Details
ModelScope	Shanghai_AI_Laboratory/Intern-S1	8K	4K			—	Details
ModelScope	Shanghai_AI_Laboratory/Intern-S1-mini	8K	4K			—	Details
ModelScope	Shanghai_AI_Laboratory/Intern-S2-Preview	8K	4K			—	Details
ModelScope	stepfun-ai/Step-3.5-Flash	8K	4K			—	Details
ModelScope	stepfun-ai/Step-3.7-Flash	8K	4K			—	Details
ModelScope	XGenerationLab/XiYanSQL-QwenCoder-32B-2412	8K	4K			—	Details
ModelScope	XGenerationLab/XiYanSQL-QwenCoder-32B-2504	8K	4K			—	Details
ModelScope	GLM-4.7-FlashX	200K	131K	reasoning		—	Details
ModelScope	GLM-5.1	200K	131K	reasoning		—	Details
ModelScope	GLM-5.2	1.0M	131K	reasoning		—	Details
Cloudflare Workers AI	@cf/zai-org/glm-5.2	8K	4K			—	Details
Cloudflare Workers AI	@cf/nvidia/nemotron-3-120b-a12b	8K	4K			—	Details
Cloudflare Workers AI	@cf/baai/bge-base-en-v1.5	8K	4K			—	Details
Cloudflare Workers AI	@cf/aisingapore/gemma-sea-lion-v4-27b-it	8K	4K			—	Details
Cloudflare Workers AI	@cf/qwen/qwen3-30b-a3b-fp8	8K	4K			Apr 28, 2025	Details
Cloudflare Workers AI	@cf/google/gemma-7b-it-lora	8K	4K			—	Details
Cloudflare Workers AI	@cf/google/gemma-4-26b-a4b-it	8K	4K			Apr 3, 2026	Details
Cloudflare Workers AI	@cf/mistralai/mistral-small-3.1-24b-instruct	8K	4K			Mar 17, 2025	Details
Cloudflare Workers AI	@cf/openai/gpt-oss-20b	8K	4K			—	Details
Cloudflare Workers AI	@cf/google/embeddinggemma-300m	8K	4K	embedding		—	Details
Cloudflare Workers AI	@cf/meta/llama-4-scout-17b-16e-instruct	8K	4K			—	Details
Cloudflare Workers AI	@cf/qwen/qwq-32b	8K	4K			—	Details
Cloudflare Workers AI	@cf/baai/bge-large-en-v1.5	8K	4K			—	Details
NVIDIA NIM	abacusai/dracarys-llama-3.1-70b-instruct	8K	4K			—	Details
NVIDIA NIM	bytedance/seed-oss-36b-instruct	8K	4K			—	Details
NVIDIA NIM	google/diffusiongemma-26b-a4b-it	8K	4K			—	Details
NVIDIA NIM	google/gemma-2-2b-it	8K	4K			—	Details
NVIDIA NIM	google/gemma-3n-e2b-it	8K	4K			—	Details
OpenRouter	Gemma 4 31B IT	262K	33K	vision, reasoning	200 req/day (free tier)	Apr 2, 2026	Details
NVIDIA NIM	meta/llama-3.1-8b-instruct	8K	4K			Jul 23, 2024	Details
NVIDIA NIM	meta/llama-3.2-90b-vision-instruct	8K	4K			—	Details
NVIDIA NIM	Llama-3.3-70B-Instruct	128K	4K	text		Dec 6, 2024	Details
NVIDIA NIM	meta/llama-4-maverick-17b-128e-instruct	8K	4K			—	Details
NVIDIA NIM	microsoft/phi-4-mini-instruct	8K	4K			—	Details
NVIDIA NIM	mistralai/ministral-14b-instruct-2512	8K	4K			Dec 2, 2025	Details
NVIDIA NIM	mistralai/mistral-large-3-675b-instruct-2512	8K	4K			—	Details
NVIDIA NIM	mistralai/mistral-medium-3.5-128b	8K	4K			—	Details
NVIDIA NIM	mistralai/mistral-nemotron	8K	4K			—	Details
NVIDIA NIM	mistralai/mistral-small-4-119b-2603	8K	4K			—	Details
NVIDIA NIM	mistralai/mixtral-8x7b-instruct-v0.1	8K	4K			—	Details
NVIDIA NIM	nvidia/gliner-pii	8K	4K			—	Details
NVIDIA NIM	nvidia/ising-calibration-1-35b-a3b	8K	4K			—	Details
NVIDIA NIM	nvidia/llama-3.1-nemoguard-8b-content-safety	8K	4K			—	Details
NVIDIA NIM	nvidia/llama-3.1-nemoguard-8b-topic-control	8K	4K			—	Details
NVIDIA NIM	nvidia/llama-3.1-nemotron-nano-vl-8b-v1	8K	4K			—	Details
NVIDIA NIM	Llama 3.1 Nemotron Safety Guard 8B v3	128K	4K	text		—	Details
NVIDIA NIM	Llama 3.3 Nemotron Super 49B v1	131K	131K	reasoning		—	Details
NVIDIA NIM	Nemotron 3 Content Safety	128K	4K	text		—	Details
OpenRouter	Nemotron 3 Nano 30B A3B	262K	262K	reasoning	200 req/day (free tier)	Dec 14, 2025	Details
NVIDIA NIM	Nemotron 3 Nano Omni 30B A3B Reasoning	256K	66K	vision, audio, reasoning		Apr 28, 2026	Details
OpenRouter	Nemotron 3 Super 120B A12B	262K	262K	reasoning	200 req/day (free tier)	Mar 11, 2026	Details
OpenRouter	Nemotron 3 Ultra 550B A55B	1.0M	128K	reasoning	200 req/day (free tier)	Jun 4, 2026	Details
NVIDIA NIM	Nemotron Content Safety Reasoning 4B	128K	4K	reasoning		—	Details
NVIDIA NIM	Nemotron Mini 4B Instruct	128K	8K	text		—	Details
NVIDIA NIM	Nemotron Nano 12B v2 VL	128K	128K	vision, reasoning		Oct 28, 2025	Details
NVIDIA NIM	nvidia/nvidia-nemotron-nano-9b-v2	8K	4K			—	Details
NVIDIA NIM	nvidia/riva-translate-4b-instruct-v1.1	8K	4K			—	Details
OpenRouter	GPT OSS 120B	131K	33K	reasoning	200 req/day (free tier)	—	Details
OpenRouter	openai/gpt-oss-20b	8K	4K		200 req/day (free tier)	—	Details
OpenRouter	qwen/qwen3-next-80b-a3b-instruct	8K	4K		200 req/day (free tier)	Sep 11, 2025	Details
NVIDIA NIM	sarvamai/sarvam-m	8K	4K			—	Details
NVIDIA NIM	stockmark/stockmark-2-100b-instruct	8K	4K			—	Details
NVIDIA NIM	upstage/solar-10.7b-instruct	8K	4K			—	Details
Google Gemini	Gemma 4 26B A4B IT	262K	33K	vision, reasoning		Apr 3, 2026	Details
Google Gemini	Gemma 4 31B IT	262K	33K	vision, reasoning		Apr 2, 2026	Details
Google Gemini	Gemini Flash-Lite Latest	1.0M	66K	vision, audio, reasoning		—	Details
Google Gemini	Gemini 2.5 Flash-Lite	1.0M	66K	vision, audio, reasoning		Jul 22, 2025	Details
OpenRouter	poolside/laguna-xs.2	8K	4K		200 req/day (free tier)	—	Details
OpenRouter	poolside/laguna-m.1	8K	4K		200 req/day (free tier)	—	Details
OpenRouter	Gemma 4 26B A4B IT	262K	33K	vision, reasoning	200 req/day (free tier)	Apr 3, 2026	Details
Google Gemini	Gemini 3.1 Flash Lite	1.0M	66K	vision, audio, reasoning		May 7, 2026	Details
OpenRouter	qwen/qwen3-coder	8K	4K		200 req/day (free tier)	Jul 23, 2025	Details
OpenRouter	meta-llama/llama-3.3-70b-instruct	8K	4K		200 req/day (free tier)	Dec 6, 2024	Details
OpenRouter	meta-llama/llama-3.2-3b-instruct	8K	4K		200 req/day (free tier)	Sep 25, 2024	Details
OpenRouter	nousresearch/hermes-3-llama-3.1-405b	8K	4K		200 req/day (free tier)	—	Details
Google Gemini	gemini-robotics-er-1.6-preview	8K	4K			—	Details
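
To shortlist models from a table like the one above, a short script can parse each row and filter on context size. This is a sketch under stated assumptions: rows are tab-separated with the fields in the order shown in the header, "K" and "M" are read as rounded decimal multipliers (1,000 and 1,000,000), and the column keys and helper names are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed field order, taken from the table header (trailing fields are ignored).
COLUMNS = ["provider", "model", "context", "max_output",
           "modality", "rate_limit", "released"]


def parse_size(value: str) -> Optional[int]:
    """Convert '262K' or '1.0M' to a token count; None if not a number."""
    value = value.strip().upper()
    multipliers = {"K": 1_000, "M": 1_000_000}
    if value and value[-1] in multipliers:
        try:
            return round(float(value[:-1]) * multipliers[value[-1]])
        except ValueError:
            return None
    return int(value) if value.isdigit() else None


def parse_row(line: str) -> Dict[str, object]:
    """Split one tab-separated row into a dict with numeric sizes."""
    row: Dict[str, object] = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    row["context"] = parse_size(str(row.get("context", "")))
    row["max_output"] = parse_size(str(row.get("max_output", "")))
    return row


def long_context(rows: List[Dict[str, object]], min_tokens: int) -> List[object]:
    """Names of models whose context window is at least min_tokens."""
    return [r["model"] for r in rows if (r["context"] or 0) >= min_tokens]


sample = [
    "Groq\tllama-3.3-70b-versatile\t131K\t32K\ttext\t30 RPM, 1,000 RPD\tDec 6, 2024\tDetails",
    "xAI\tgrok-4.1-fast\t2.0M\t32K\ttext\tCredit-based\tNov 19, 2025\tDetails",
    "OpenCode Zen\tbig-pickle\tN/A\tN/A\t\t\t\u2014\tDetails",
]
rows = [parse_row(line) for line in sample]
print(long_context(rows, 1_000_000))  # ['grok-4.1-fast']
```

Rows with a missing context value (such as "N/A") are treated as zero for filtering, so they never appear in a long-context shortlist.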
