Best Free LLM APIs for Chat

138 free models available for chat.

| Provider | Model | Context | Max Output | Modality | Rate Limit |
| --- | --- | --- | --- | --- | --- |
| OpenRouter | Owl Alpha | 1.0M | 262K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Nano Omni (free) | 256K | 66K | text, image, audio | See provider page |
| OpenRouter | Poolside: Laguna XS.2 (free) | 131K | 8K | text | See provider page |
| OpenRouter | Poolside: Laguna M.1 (free) | 131K | 8K | text | See provider page |
| OpenRouter | inclusionAI: Ling-2.6-1T (free) | 262K | 33K | text | See provider page |
| OpenRouter | Tencent: Hy3 preview (free) | 262K | 262K | text | See provider page |
| OpenRouter | Baidu: Qianfan-OCR-Fast (free) | 66K | 29K | text, image | See provider page |
| OpenRouter | Google: Gemma 4 26B A4B (free) | 262K | 33K | text, image | See provider page |
| OpenRouter | Google: Gemma 4 31B (free) | 262K | 33K | text, image | See provider page |
| OpenRouter | Google: Lyria 3 Pro Preview | 1.0M | 66K | text, image | See provider page |
| OpenRouter | Google: Lyria 3 Clip Preview | 1.0M | 66K | text, image | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Super (free) | 262K | 262K | text | See provider page |
| OpenRouter | MiniMax: MiniMax M2.5 (free) | 197K | 8K | text | See provider page |
| OpenRouter | Free Models Router | 200K | 8K | text, image | See provider page |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Thinking (free) | 33K | 8K | text, reasoning | See provider page |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Instruct (free) | 33K | 8K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Nano 30B A3B (free) | 256K | 8K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron Nano 12B 2 VL (free) | 128K | 128K | text, image | See provider page |
| OpenRouter | Qwen: Qwen3 Next 80B A3B Instruct (free) | 262K | 8K | text | See provider page |
| OpenRouter | NVIDIA: Nemotron Nano 9B V2 (free) | 128K | 8K | text | See provider page |
| OpenRouter | OpenAI: gpt-oss-120b (free) | 131K | 131K | text | See provider page |
| OpenRouter | OpenAI: gpt-oss-20b (free) | 131K | 8K | text | See provider page |
| OpenRouter | Z.ai: GLM 4.5 Air (free) | 131K | 96K | text | See provider page |
| OpenRouter | Qwen: Qwen3 Coder 480B A35B (free) | 262K | 262K | text, code | See provider page |
| OpenRouter | Venice: Uncensored (free) | 33K | 8K | text | See provider page |
| OpenRouter | Google: Gemma 3n 2B (free) | 8K | 2K | text | See provider page |
| OpenRouter | Google: Gemma 3n 4B (free) | 8K | 2K | text | See provider page |
| OpenRouter | Google: Gemma 3 4B (free) | 33K | 8K | text, image | See provider page |
| OpenRouter | Google: Gemma 3 12B (free) | 33K | 8K | text, image | See provider page |
| OpenRouter | Google: Gemma 3 27B (free) | 131K | 8K | text, image | See provider page |
| OpenRouter | Meta: Llama 3.3 70B Instruct (free) | 66K | 8K | text | See provider page |
| OpenRouter | Meta: Llama 3.2 3B Instruct (free) | 131K | 8K | text | See provider page |
| OpenRouter | Nous: Hermes 3 405B Instruct (free) | 131K | 8K | text | See provider page |
| NVIDIA NIM | Various open models | 131K | 8K | text | See provider page |
| Mistral (La Plateforme) | Open and proprietary Mistral models | 256K | 8K | text | See provider page |
| Cohere | Command A (111B) | 256K | 4K | text | 20 RPM |
| Cohere | Command R+ | 128K | 4K | text | 20 RPM |
| Cohere | Command R7B | 128K | 4K | text | 20 RPM |
| Cohere | Embed 4 | 131K | 131K | text | 2,000 inputs/min |
| Cohere | Rerank 3.5 | 131K | 131K | text | 10 RPM |
| Google Gemini | Gemini 2.5 Flash | 1.0M | 65K | text | 10 RPM, 250 RPD |
| Google Gemini | Gemini 2.5 Flash-Lite | 1.0M | 65K | text | 15 RPM, 1,000 RPD |
| Mistral AI | Mistral Small 4 | 256K | 256K | text | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Medium 3 | 128K | 128K | text | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Large 3 | 256K | 256K | text | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Nemo (12B) | 128K | 128K | text | ~1 RPS, 500K TPM |
| Mistral AI | Codestral | 256K | 256K | text, code | ~1 RPS, 500K TPM |
| Mistral AI | Pixtral Large | 128K | 128K | text, image | ~1 RPS, 500K TPM |
| Z AI (Zhipu AI) | GLM-4.7-Flash | 200K | 128K | text | 1 concurrent request |
| Z AI (Zhipu AI) | GLM-4.5-Flash | 128K | 8K | text | 1 concurrent request |
| Z AI (Zhipu AI) | GLM-4.6V-Flash | 128K | 4K | text | 1 concurrent request |
| Cerebras | llama3.1-8b | 128K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | gpt-oss-120b | 128K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | qwen-3-235b-a22b-instruct-2507 | 131K | 8K | text | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | zai-glm-4.7 | 128K | 8K | text | 10 RPM, 100 RPD, 1M TPD |
| Cloudflare Workers AI | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 131K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-3.1-8b-instruct-fp8-fast | 131K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-3.2-11b-vision-instruct | 131K | 131K | text, image | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-4-scout-17b-16e-instruct | 10.0M | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/mistralai/mistral-small-3.1-24b-instruct | 128K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/google/gemma-4-26b-a4b-it | 256K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/qwen/qwq-32b | 32K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | 131K | text | 10K neurons/day (shared) |
| Cloudflare Workers AI | + 42 more models | 131K | 131K | text | 10K neurons/day (shared) |
| GitHub Models | gpt-4.1 | 1.0M | 32K | text | 10 RPM, 50 RPD |
| GitHub Models | gpt-4.1-mini | 1.0M | 32K | text | 15 RPM, 150 RPD |
| GitHub Models | gpt-4o | 128K | 16K | text | 10 RPM, 50 RPD |
| GitHub Models | o3-mini | 200K | 100K | text | 10 RPM, 50 RPD |
| GitHub Models | o4-mini | 200K | 100K | text | 10 RPM, 50 RPD |
| GitHub Models | Llama-4-Scout-17B-16E | 512K | 4K | text | 15 RPM, 150 RPD |
| GitHub Models | Llama-4-Maverick-17B-128E | 256K | 4K | text | 10 RPM, 50 RPD |
| GitHub Models | Meta-Llama-3.3-70B | 131K | 4K | text | 15 RPM, 150 RPD |
| GitHub Models | DeepSeek-R1 | 64K | 8K | text | 15 RPM, 150 RPD |
| GitHub Models | Mistral-Small-3.1 | 128K | 4K | text | 15 RPM, 150 RPD |
| GitHub Models | + 35 more models | 131K | 131K | text | Varies by tier |
| Groq | llama-3.3-70b-versatile | 131K | 32K | text | 30 RPM, 14,400 RPD |
| Groq | llama-3.1-8b-instant | 131K | 131K | text | 30 RPM, 14,400 RPD |
| Groq | llama-4-scout-17b-16e-instruct | 131K | 8K | text | 30 RPM, 14,400 RPD |
| Groq | llama-4-maverick-17b-128e-instruct | 131K | 8K | text | 15 RPM, 500 RPD |
| Groq | qwen3-32b | 131K | 131K | text | 30 RPM, 14,400 RPD |
| Groq | kimi-k2-instruct | 262K | 262K | text | 30 RPM, 14,400 RPD |
| Groq | deepseek-r1-distill-70b | 131K | 8K | text | 30 RPM, 14,400 RPD |
| Groq | whisper-large-v3 | 131K | 131K | text | 20 RPM, 2,000 RPD |
| Groq | whisper-large-v3-turbo | 131K | 131K | text | 20 RPM, 2,000 RPD |
| Hugging Face | Meta-Llama-3.1-8B-Instruct | 128K | 4K | text | ~1,000 RPD |
| Hugging Face | Mistral-7B-Instruct-v0.3 | 32K | 4K | text | ~1,000 RPD |
| Hugging Face | Mixtral-8x7B-Instruct-v0.1 | 32K | 4K | text | ~1,000 RPD |
| Hugging Face | Phi-3.5-mini-instruct | 128K | 4K | text | ~1,000 RPD |
| Hugging Face | Qwen2.5-7B-Instruct | 131K | 4K | text | ~1,000 RPD |
| Hugging Face | + thousands of community models | 131K | 131K | text | ~$0.10/month free credits |
| Kilo Code | bytedance-seed/dola-seed-2.0-pro:free | 131K | 131K | text | ~200 req/hr |
| Kilo Code | x-ai/grok-code-fast-1:optimized:free | 131K | 131K | text, code | ~200 req/hr |
| Kilo Code | nvidia/nemotron-3-super-120b-a12b:free | 262K | 32K | text | ~200 req/hr |
| Kilo Code | arcee-ai/trinity-large-thinking:free | 131K | 131K | text | ~200 req/hr |
| Kilo Code | openrouter/free | 131K | 131K | text | ~200 req/hr |
| LLM7.io | deepseek-r1-0528 | 131K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | deepseek-v3-0324 | 131K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | gpt-4o-mini | 131K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | mistral-small-3.1-24b | 32K | 131K | text | 30 RPM (120 with token) |
| LLM7.io | qwen2.5-coder-32b | 131K | 131K | text, code | 30 RPM (120 with token) |
| LLM7.io | + ~24 more models | 131K | 131K | text | 30 RPM (120 with token) |
| ModelScope | Qwen/Qwen3.5-35B-A3B | 131K | 131K | text | 2,000 RPD total; <=500 RPD/model (dynamic) |
| ModelScope | Qwen/Qwen3.5-27B | 131K | 131K | text | 2,000 RPD total; <=500 RPD/model (dynamic) |
| ModelScope | Qwen/Qwen-Image | 131K | 131K | text | 2,000 RPD total; model/AIGC-specific caps |
| ModelScope | + API-Inference-enabled models | 131K | 131K | text | Dynamic quotas + dynamic concurrency |
| NVIDIA NIM | deepseek-ai/deepseek-r1 | 128K | 163K | text | ~40 RPM |
| NVIDIA NIM | nvidia/llama-3.1-nemotron-ultra-253b-v1 | 128K | 4K | text | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-3-super-120b-a12b | 262K | 262K | text | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-3-nano-30b-a3b | 128K | 32K | text | ~40 RPM |
| NVIDIA NIM | meta/llama-3.1-405b-instruct | 128K | 4K | text | ~40 RPM |
| NVIDIA NIM | qwen/qwen2.5-72b-instruct | 128K | 8K | text | ~40 RPM |
| NVIDIA NIM | google/gemma-4-31b | 128K | 8K | text | ~40 RPM |
| NVIDIA NIM | mistralai/mistral-large-2-instruct | 128K | 4K | text | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-nano-2-vl | 128K | 8K | text, image | ~40 RPM |
| NVIDIA NIM | minimax/minimax-m2.7 | 128K | 8K | text | ~40 RPM |
| NVIDIA NIM | + 90 more models | 131K | 131K | text | ~40 RPM |
| Ollama Cloud | llama3.1:cloud | 128K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | deepseek-r1:cloud | 128K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | qwen2.5:cloud | 128K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | gemma2:cloud | 8K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | mistral:cloud | 32K | 131K | text | Session/weekly limits (unpublished) |
| Ollama Cloud | + 400 more models | 131K | 131K | text | Session/weekly limits (unpublished) |
| OVHcloud AI Endpoints | Meta-Llama-3_3-70B-Instruct | 131K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | DeepSeek-R1-Distill-Llama-70B | 131K | 32K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3-Coder-30B-A3B-Instruct | 262K | 32K | text, code | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen2.5-VL-72B-Instruct | 128K | 8K | text, image | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Mistral-Nemo-Instruct-2407 | 128K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-8B | 32K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-0.6B | 32K | 4K | text | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | + 30 more models | 131K | 131K | text | 2 RPM (anonymous) |
| SiliconFlow | Qwen/Qwen3-8B | 131K | 131K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | 33K | 16K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | 131K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | THUDM/glm-4-9b-chat | 32K | 32K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | THUDM/GLM-4.1V-9B-Thinking | 66K | 66K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-OCR | 131K | 8K | text | 1,000 RPM, 50K TPM |
| SiliconFlow | + embedding/speech models | 131K | 131K | text, audio | 1,000 RPM, 50K TPM |
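Many of the providers in the table (OpenRouter, Groq, Cerebras, Mistral, NVIDIA NIM, among others) expose OpenAI-compatible chat completions endpoints, so one request shape works across them. Below is a minimal stdlib-only sketch, not a definitive client: the base URL shown is OpenRouter's, the model slug matches the free Llama 3.3 70B row above, and the `OPENROUTER_API_KEY` environment variable name is an assumption you would adjust per provider.

```python
import json
import os
import urllib.request

# Assumed base URL for OpenRouter's OpenAI-compatible API;
# swap this (and the model slug and API key) for other providers.
BASE_URL = "https://openrouter.ai/api/v1"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble a POST to /chat/completions with one user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,  # stay well under the provider's max-output column
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")  # assumed env var name
    req = build_chat_request(
        "meta-llama/llama-3.3-70b-instruct:free", "Say hello.", key or "dummy"
    )
    if key:  # only hit the network when a real key is configured
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
            print(body["choices"][0]["message"]["content"])
    else:
        print(req.full_url)
```

Because only the base URL, model slug, and key differ, the same helper can target, say, Groq or Cerebras by changing `BASE_URL`; the rate-limit column above tells you how aggressively you can call each endpoint before backing off.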