Best Free LLM APIs for Chat

138 free models available for chat.

| Provider | Model | Context | Max Output | Modality | Rate Limit |
| --- | --- | --- | --- | --- | --- |
| OpenRouter | Owl Alpha | 1.0M | 262K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Nano Omni (free) | 256K | 66K | text, image, audio | See provider page |
| OpenRouter | Poolside: Laguna XS.2 (free) | 131K | 8K | text | See provider page |
| OpenRouter | Poolside: Laguna M.1 (free) | 131K | 8K | text | See provider page |
| OpenRouter | inclusionAI: Ling-2.6-1T (free) | 262K | 33K | text | See provider page |
| OpenRouter | Tencent: Hy3 preview (free) | 262K | 262K | text | See provider page |
| OpenRouter | Baidu: Qianfan-OCR-Fast (free) | 66K | 29K | text, image | See provider page |
| OpenRouter | Google: Gemma 4 26B A4B (free) | 262K | 33K | text, image | See provider page |
| OpenRouter | Google: Gemma 4 31B (free) | 262K | 33K | text, image | See provider page |
| OpenRouter | Google: Lyria 3 Pro Preview | 1.0M | 66K | text, image | See provider page |
| OpenRouter | Google: Lyria 3 Clip Preview | 1.0M | 66K | text, image | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Super (free) | 262K | 262K | text | See provider page |
| OpenRouter | MiniMax: MiniMax M2.5 (free) | 197K | 8K | text | See provider page |
| OpenRouter | Free Models Router | 200K | 8K | text, image | See provider page |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Thinking (free) | 33K | 8K | text, reasoning | See provider page |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Instruct (free) | 33K | 8K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Nano 30B A3B (free) | 256K | 8K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron Nano 12B 2 VL (free) | 128K | 128K | text, image | See provider page |
| OpenRouter | Qwen: Qwen3 Next 80B A3B Instruct (free) | 262K | 8K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron Nano 9B V2 (free) | 128K | 8K | text | See provider page |
| OpenRouter | OpenAI: gpt-oss-120b (free) | 131K | 131K | text | See provider page |
| OpenRouter | OpenAI: gpt-oss-20b (free) | 131K | 8K | text | See provider page |
| OpenRouter | Z.ai: GLM 4.5 Air (free) | 131K | 96K | text | See provider page |
| OpenRouter | Qwen: Qwen3 Coder 480B A35B (free) | 262K | 262K | text, code | See provider page |
| OpenRouter | Venice: Uncensored (free) | 33K | 8K | text | See provider page |
| OpenRouter | Google: Gemma 3n 2B (free) | 8K | 2K | text | See provider page |
| OpenRouter | Google: Gemma 3n 4B (free) | 8K | 2K | text | See provider page |
| OpenRouter | Google: Gemma 3 4B (free) | 33K | 8K | text, image | See provider page |
| OpenRouter | Google: Gemma 3 12B (free) | 33K | 8K | text, image | See provider page |
| OpenRouter | Google: Gemma 3 27B (free) | 131K | 8K | text, image | See provider page |
| OpenRouter | Meta: Llama 3.3 70B Instruct (free) | 66K | 8K | text | See provider page |
| OpenRouter | Meta: Llama 3.2 3B Instruct (free) | 131K | 8K | text | See provider page |
| OpenRouter | Nous: Hermes 3 405B Instruct (free) | 131K | 8K | text | See provider page |
| NVIDIA NIM | Various open models | 131K | 8K | text | See provider page |
| Mistral (La Plateforme) | Open and proprietary Mistral models | 256K | 8K | text | See provider page |
| Cohere | Command A (111B) | 256K | 4K | text | 20 RPM |
| Cohere | Command R+ | 128K | 4K | text | 20 RPM |
| Cohere | Command R7B | 128K | 4K | text | 20 RPM |
| Cohere | Embed 4 | 131K | 131K | text | 2,000 inputs/min |
| Cohere | Rerank 3.5 | 131K | 131K | text | 10 RPM |
| Google Gemini | Gemini 2.5 Flash | 1.0M | 65K | text | 10 RPM, 250 RPD |
| Google Gemini | Gemini 2.5 Flash-Lite | 1.0M | 65K | text | 15 RPM, 1,000 RPD |
| Mistral AI | Mistral Small 4 | 256K | 256K | text | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Medium 3 | 128K | 128K | text | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Large 3 | 256K | 256K | text | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Nemo (12B) | 128K | 128K | text | ~1 RPS, 500K TPM |
| Mistral AI | Codestral | 256K | 256K | text, code | ~1 RPS, 500K TPM |
| Mistral AI | Pixtral Large | 128K | 128K | text, image | ~1 RPS, 500K TPM |
| Z AI (Zhipu AI) | GLM-4.7-Flash | 200K | 128K | text | 1 concurrent request |
| Z AI (Zhipu AI) | GLM-4.5-Flash | 128K | 8K | text | 1 concurrent request |
| Z AI (Zhipu AI) | GLM-4.6V-Flash | 128K | 4K | text | 1 concurrent request |
| Cerebras | llama3.1-8b | 128K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | gpt-oss-120b | 128K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | qwen-3-235b-a22b-instruct-2507 | 131K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | zai-glm-4.7 | 128K | 8K | text | 10 RPM, 100 RPD, 1M TPD |
| Cloudflare Workers AI | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 131K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-3.1-8b-instruct-fp8-fast | 131K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-3.2-11b-vision-instruct | 131K | 131K | text, image | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-4-scout-17b-16e-instruct | 10.0M | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/mistralai/mistral-small-3.1-24b-instruct | 128K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/google/gemma-4-26b-a4b-it | 256K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/qwen/qwq-32b | 32K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | + 42 more models | 131K | 131K | text | 10K neurons/day (shared) |
| GitHub Models | gpt-4.1 | 1.0M | 32K | text | 10 RPM, 50 RPD |
| GitHub Models | gpt-4.1-mini | 1.0M | 32K | text | 15 RPM, 150 RPD |
| GitHub Models | gpt-4o | 128K | 16K | text | 10 RPM, 50 RPD |
| GitHub Models | o3-mini | 200K | 100K | text | 10 RPM, 50 RPD |
| GitHub Models | o4-mini | 200K | 100K | text | 10 RPM, 50 RPD |
| GitHub Models | Llama-4-Scout-17B-16E | 512K | 4K | text | 15 RPM, 150 RPD |
| GitHub Models | Llama-4-Maverick-17B-128E | 256K | 4K | text | 10 RPM, 50 RPD |
| GitHub Models | Meta-Llama-3.3-70B | 131K | 4K | text | 15 RPM, 150 RPD |
| GitHub Models | DeepSeek-R1 | 64K | 8K | text | 15 RPM, 150 RPD |
| GitHub Models | Mistral-Small-3.1 | 128K | 4K | text | 15 RPM, 150 RPD |
| GitHub Models | + 35 more models | 131K | 131K | text | Varies by tier |
| Groq | llama-3.3-70b-versatile | 131K | 32K | text | 30 RPM, 14,400 RPD |
| Groq | llama-3.1-8b-instant | 131K | 131K | text | 30 RPM, 14,400 RPD |
| Groq | llama-4-scout-17b-16e-instruct | 131K | 8K | text | 30 RPM, 14,400 RPD |
| Groq | llama-4-maverick-17b-128e-instruct | 131K | 8K | text | 15 RPM, 500 RPD |
| Groq | qwen3-32b | 131K | 131K | text | 30 RPM, 14,400 RPD |
| Groq | kimi-k2-instruct | 262K | 262K | text | 30 RPM, 14,400 RPD |
| Groq | deepseek-r1-distill-70b | 131K | 8K | text | 30 RPM, 14,400 RPD |
| Groq | whisper-large-v3 | 131K | 131K | text | 20 RPM, 2,000 RPD |
| Groq | whisper-large-v3-turbo | 131K | 131K | text | 20 RPM, 2,000 RPD |
| Hugging Face | Meta-Llama-3.1-8B-Instruct | 128K | 4K | text | ~1,000 RPD |
| Hugging Face | Mistral-7B-Instruct-v0.3 | 32K | 4K | text | ~1,000 RPD |
| Hugging Face | Mixtral-8x7B-Instruct-v0.1 | 32K | 4K | text | ~1,000 RPD |
| Hugging Face | Phi-3.5-mini-instruct | 128K | 4K | text | ~1,000 RPD |
| Hugging Face | Qwen2.5-7B-Instruct | 131K | 4K | text | ~1,000 RPD |
| Hugging Face | + thousands of community models | 131K | 131K | text | ~$0.10/month free credits |
| Kilo Code | bytedance-seed/dola-seed-2.0-pro:free | 131K | 131K | text | ~200 req/hr |
| Kilo Code | x-ai/grok-code-fast-1:optimized:free | 131K | 131K | text, code | ~200 req/hr |
| Kilo Code | nvidia/nemotron-3-super-120b-a12b:free | 262K | 32K | text | ~200 req/hr |
| Kilo Code | arcee-ai/trinity-large-thinking:free | 131K | 131K | text | ~200 req/hr |
| Kilo Code | openrouter/free | 131K | 131K | text | ~200 req/hr |
| LLM7.io | deepseek-r1-0528 | 131K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | deepseek-v3-0324 | 131K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | gpt-4o-mini | 131K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | mistral-small-3.1-24b | 32K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | qwen2.5-coder-32b | 131K | 131K | text, code | 30 RPM (120 with token) |
| LLM7.io | + ~24 more models | 131K | 131K | text | 30 RPM (120 with token) |
| ModelScope | Qwen/Qwen3.5-35B-A3B | 131K | 131K | text | 2,000 RPD total; <=500 RPD/model (dynamic) |
| ModelScope | Qwen/Qwen3.5-27B | 131K | 131K | text | 2,000 RPD total; <=500 RPD/model (dynamic) |
| ModelScope | Qwen/Qwen-Image | 131K | 131K | text | 2,000 RPD total; model/AIGC-specific caps |
| ModelScope | + API-Inference-enabled models | 131K | 131K | text | Dynamic quotas + dynamic concurrency |
| NVIDIA NIM | deepseek-ai/deepseek-r1 | 128K | 163K | text | ~40 RPM |
| NVIDIA NIM | nvidia/llama-3.1-nemotron-ultra-253b-v1 | 128K | 4K | text | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-3-super-120b-a12b | 262K | 262K | text | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-3-nano-30b-a3b | 128K | 32K | text | ~40 RPM |
| NVIDIA NIM | meta/llama-3.1-405b-instruct | 128K | 4K | text | ~40 RPM |
| NVIDIA NIM | qwen/qwen2.5-72b-instruct | 128K | 8K | text | ~40 RPM |
| NVIDIA NIM | google/gemma-4-31b | 128K | 8K | text | ~40 RPM |
| NVIDIA NIM | mistralai/mistral-large-2-instruct | 128K | 4K | text | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-nano-2-vl | 128K | 8K | text, image | ~40 RPM |
| NVIDIA NIM | minimax/minimax-m2.7 | 128K | 8K | text | ~40 RPM |
| NVIDIA NIM | + 90 more models | 131K | 131K | text | ~40 RPM |
| Ollama Cloud | llama3.1:cloud | 128K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | deepseek-r1:cloud | 128K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | qwen2.5:cloud | 128K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | gemma2:cloud | 8K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | mistral:cloud | 32K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | + 400 more models | 131K | 131K | text | Session/weekly limits (unpublished) |
| OVHcloud AI Endpoints | Meta-Llama-3_3-70B-Instruct | 131K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | DeepSeek-R1-Distill-Llama-70B | 131K | 32K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3-Coder-30B-A3B-Instruct | 262K | 32K | text, code | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen2.5-VL-72B-Instruct | 128K | 8K | text, image | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Mistral-Nemo-Instruct-2407 | 128K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-8B | 32K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-0.6B | 32K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | + 30 more models | 131K | 131K | text | 2 RPM (anonymous) |
| SiliconFlow | Qwen/Qwen3-8B | 131K | 131K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | 33K | 16K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | 131K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | THUDM/glm-4-9b-chat | 32K | 32K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | THUDM/GLM-4.1V-9B-Thinking | 66K | 66K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-OCR | 131K | 8K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | + embedding/speech models | 131K | 131K | text, audio | 1,000 RPM, 50K TPM |
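Many of the providers in the table (OpenRouter, Groq, Cerebras, Mistral, NVIDIA NIM, among others) expose OpenAI-compatible chat completions endpoints, so one request shape works across them. Below is a minimal stdlib-only sketch, not a definitive client: the base URL shown is OpenRouter's, the model slug matches the free Llama 3.3 70B row above, and the `OPENROUTER_API_KEY` environment variable name is an assumption you would adjust per provider.

```python
import json
import os
import urllib.request

# Assumed base URL for OpenRouter's OpenAI-compatible API;
# swap this (and the model slug and API key) for other providers.
BASE_URL = "https://openrouter.ai/api/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble a POST to /chat/completions with one user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,  # stay well under the provider's max-output column
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")  # assumed env var name
    req = build_chat_request(
        "meta-llama/llama-3.3-70b-instruct:free", "Say hello.", key or "dummy"
    )
    if key:  # only hit the network when a real key is configured
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
            print(body["choices"][0]["message"]["content"])
    else:
        print(req.full_url)
```

Because only the base URL, model slug, and key differ, the same helper can target, say, Groq or Cerebras by changing `BASE_URL`; the rate-limit column above tells you how aggressively you can call each endpoint before backing off.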