Free LLM API Models — Browse & Filter 138+ Models

A directory of free-tier LLM API models by provider. Context and Max Output are token counts. Rate-limit abbreviations: RPM = requests per minute, RPD = requests per day, RPS = requests per second, TPM = tokens per minute, TPD = tokens per day.
| Provider | Model | Context | Max Output | Rate Limit |
|---|---|---|---|---|
| OpenRouter | Owl Alpha | 1.0M | 262K | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Nano Omni (free) | 256K | 66K | See provider page |
| OpenRouter | Poolside: Laguna XS.2 (free) | 131K | 8K | See provider page |
| OpenRouter | Poolside: Laguna M.1 (free) | 131K | 8K | See provider page |
| OpenRouter | inclusionAI: Ling-2.6-1T (free) | 262K | 33K | See provider page |
| OpenRouter | Tencent: Hy3 preview (free) | 262K | 262K | See provider page |
| OpenRouter | Baidu: Qianfan-OCR-Fast (free) | 66K | 29K | See provider page |
| OpenRouter | Google: Gemma 4 26B A4B (free) | 262K | 33K | See provider page |
| OpenRouter | Google: Gemma 4 31B (free) | 262K | 33K | See provider page |
| OpenRouter | Google: Lyria 3 Pro Preview | 1.0M | 66K | See provider page |
| OpenRouter | Google: Lyria 3 Clip Preview | 1.0M | 66K | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Super (free) | 262K | 262K | See provider page |
| OpenRouter | MiniMax: MiniMax M2.5 (free) | 197K | 8K | See provider page |
| OpenRouter | Free Models Router | 200K | 8K | See provider page |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Thinking (free) | 33K | 8K | See provider page |
| OpenRouter | LiquidAI: LFM2.5-1.2B-Instruct (free) | 33K | 8K | See provider page |
| OpenRouter | NVIDIA: Nemotron 3 Nano 30B A3B (free) | 256K | 8K | See provider page |
| OpenRouter | NVIDIA: Nemotron Nano 12B 2 VL (free) | 128K | 128K | See provider page |
| OpenRouter | Qwen: Qwen3 Next 80B A3B Instruct (free) | 262K | 8K | See provider page |
| OpenRouter | NVIDIA: Nemotron Nano 9B V2 (free) | 128K | 8K | See provider page |
| OpenRouter | OpenAI: gpt-oss-120b (free) | 131K | 131K | See provider page |
| OpenRouter | OpenAI: gpt-oss-20b (free) | 131K | 8K | See provider page |
| OpenRouter | Z.ai: GLM 4.5 Air (free) | 131K | 96K | See provider page |
| OpenRouter | Qwen: Qwen3 Coder 480B A35B (free) | 262K | 262K | See provider page |
| OpenRouter | Venice: Uncensored (free) | 33K | 8K | See provider page |
| OpenRouter | Google: Gemma 3n 2B (free) | 8K | 2K | See provider page |
| OpenRouter | Google: Gemma 3n 4B (free) | 8K | 2K | See provider page |
| OpenRouter | Google: Gemma 3 4B (free) | 33K | 8K | See provider page |
| OpenRouter | Google: Gemma 3 12B (free) | 33K | 8K | See provider page |
| OpenRouter | Google: Gemma 3 27B (free) | 131K | 8K | See provider page |
| OpenRouter | Meta: Llama 3.3 70B Instruct (free) | 66K | 8K | See provider page |
| OpenRouter | Meta: Llama 3.2 3B Instruct (free) | 131K | 8K | See provider page |
| OpenRouter | Nous: Hermes 3 405B Instruct (free) | 131K | 8K | See provider page |
| NVIDIA NIM | Various open models | 131K | 8K | See provider page |
| Mistral (La Plateforme) | Open and proprietary Mistral models | 256K | 8K | See provider page |
| Cohere | Command A (111B) | 256K | 4K | 20 RPM |
| Cohere | Command R+ | 128K | 4K | 20 RPM |
| Cohere | Command R7B | 128K | 4K | 20 RPM |
| Cohere | Embed 4 | 131K | 131K | 2,000 inputs/min |
| Cohere | Rerank 3.5 | 131K | 131K | 10 RPM |
| Google Gemini | Gemini 2.5 Flash | 1.0M | 65K | 10 RPM, 250 RPD |
| Google Gemini | Gemini 2.5 Flash-Lite | 1.0M | 65K | 15 RPM, 1,000 RPD |
| Mistral AI | Mistral Small 4 | 256K | 256K | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Medium 3 | 128K | 128K | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Large 3 | 256K | 256K | ~1 RPS, 500K TPM |
| Mistral AI | Mistral Nemo (12B) | 128K | 128K | ~1 RPS, 500K TPM |
| Mistral AI | Codestral | 256K | 256K | ~1 RPS, 500K TPM |
| Mistral AI | Pixtral Large | 128K | 128K | ~1 RPS, 500K TPM |
| Z AI (Zhipu AI) | GLM-4.7-Flash | 200K | 128K | 1 concurrent request |
| Z AI (Zhipu AI) | GLM-4.5-Flash | 128K | 8K | 1 concurrent request |
| Z AI (Zhipu AI) | GLM-4.6V-Flash | 128K | 4K | 1 concurrent request |
| Cerebras | llama3.1-8b | 128K | 8K | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | gpt-oss-120b | 128K | 8K | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | qwen-3-235b-a22b-instruct-2507 | 131K | 8K | 30 RPM, 14,400 RPD, 1M TPD |
| Cerebras | zai-glm-4.7 | 128K | 8K | 10 RPM, 100 RPD, 1M TPD |
| Cloudflare Workers AI | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 131K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-3.1-8b-instruct-fp8-fast | 131K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-3.2-11b-vision-instruct | 131K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/meta/llama-4-scout-17b-16e-instruct | 10.0M | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/mistralai/mistral-small-3.1-24b-instruct | 128K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/google/gemma-4-26b-a4b-it | 256K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/qwen/qwq-32b | 32K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 32K | 131K | 10K neurons/day (shared) |
| Cloudflare Workers AI | + 42 more models | 131K | 131K | 10K neurons/day (shared) |
| GitHub Models | gpt-4.1 | 1.0M | 32K | 10 RPM, 50 RPD |
| GitHub Models | gpt-4.1-mini | 1.0M | 32K | 15 RPM, 150 RPD |
| GitHub Models | gpt-4o | 128K | 16K | 10 RPM, 50 RPD |
| GitHub Models | o3-mini | 200K | 100K | 10 RPM, 50 RPD |
| GitHub Models | o4-mini | 200K | 100K | 10 RPM, 50 RPD |
| GitHub Models | Llama-4-Scout-17B-16E | 512K | 4K | 15 RPM, 150 RPD |
| GitHub Models | Llama-4-Maverick-17B-128E | 256K | 4K | 10 RPM, 50 RPD |
| GitHub Models | Meta-Llama-3.3-70B | 131K | 4K | 15 RPM, 150 RPD |
| GitHub Models | DeepSeek-R1 | 64K | 8K | 15 RPM, 150 RPD |
| GitHub Models | Mistral-Small-3.1 | 128K | 4K | 15 RPM, 150 RPD |
| GitHub Models | + 35 more models | 131K | 131K | Varies by tier |
| Groq | llama-3.3-70b-versatile | 131K | 32K | 30 RPM, 14,400 RPD |
| Groq | llama-3.1-8b-instant | 131K | 131K | 30 RPM, 14,400 RPD |
| Groq | llama-4-scout-17b-16e-instruct | 131K | 8K | 30 RPM, 14,400 RPD |
| Groq | llama-4-maverick-17b-128e-instruct | 131K | 8K | 15 RPM, 500 RPD |
| Groq | qwen3-32b | 131K | 131K | 30 RPM, 14,400 RPD |
| Groq | kimi-k2-instruct | 262K | 262K | 30 RPM, 14,400 RPD |
| Groq | deepseek-r1-distill-70b | 131K | 8K | 30 RPM, 14,400 RPD |
| Groq | whisper-large-v3 | 131K | 131K | 20 RPM, 2,000 RPD |
| Groq | whisper-large-v3-turbo | 131K | 131K | 20 RPM, 2,000 RPD |
| Hugging Face | Meta-Llama-3.1-8B-Instruct | 128K | 4K | ~1,000 RPD |
| Hugging Face | Mistral-7B-Instruct-v0.3 | 32K | 4K | ~1,000 RPD |
| Hugging Face | Mixtral-8x7B-Instruct-v0.1 | 32K | 4K | ~1,000 RPD |
| Hugging Face | Phi-3.5-mini-instruct | 128K | 4K | ~1,000 RPD |
| Hugging Face | Qwen2.5-7B-Instruct | 131K | 4K | ~1,000 RPD |
| Hugging Face | + thousands of community models | 131K | 131K | ~$0.10/month free credits |
| Kilo Code | bytedance-seed/dola-seed-2.0-pro:free | 131K | 131K | ~200 req/hr |
| Kilo Code | x-ai/grok-code-fast-1:optimized:free | 131K | 131K | ~200 req/hr |
| Kilo Code | nvidia/nemotron-3-super-120b-a12b:free | 262K | 32K | ~200 req/hr |
| Kilo Code | arcee-ai/trinity-large-thinking:free | 131K | 131K | ~200 req/hr |
| Kilo Code | openrouter/free | 131K | 131K | ~200 req/hr |
| LLM7.io | deepseek-r1-0528 | 131K | 131K | 30 RPM (120 with token) |
| LLM7.io | deepseek-v3-0324 | 131K | 131K | 30 RPM (120 with token) |
| LLM7.io | gpt-4o-mini | 131K | 131K | 30 RPM (120 with token) |
| LLM7.io | mistral-small-3.1-24b | 32K | 131K | 30 RPM (120 with token) |
| LLM7.io | qwen2.5-coder-32b | 131K | 131K | 30 RPM (120 with token) |
| LLM7.io | + ~24 more models | 131K | 131K | 30 RPM (120 with token) |
| ModelScope | Qwen/Qwen3.5-35B-A3B | 131K | 131K | 2,000 RPD total; ≤500 RPD/model (dynamic) |
| ModelScope | Qwen/Qwen3.5-27B | 131K | 131K | 2,000 RPD total; ≤500 RPD/model (dynamic) |
| ModelScope | Qwen/Qwen-Image | 131K | 131K | 2,000 RPD total; model/AIGC-specific caps |
| ModelScope | + API-Inference-enabled models | 131K | 131K | Dynamic quotas and concurrency |
| NVIDIA NIM | deepseek-ai/deepseek-r1 | 128K | 163K | ~40 RPM |
| NVIDIA NIM | nvidia/llama-3.1-nemotron-ultra-253b-v1 | 128K | 4K | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-3-super-120b-a12b | 262K | 262K | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-3-nano-30b-a3b | 128K | 32K | ~40 RPM |
| NVIDIA NIM | meta/llama-3.1-405b-instruct | 128K | 4K | ~40 RPM |
| NVIDIA NIM | qwen/qwen2.5-72b-instruct | 128K | 8K | ~40 RPM |
| NVIDIA NIM | google/gemma-4-31b | 128K | 8K | ~40 RPM |
| NVIDIA NIM | mistralai/mistral-large-2-instruct | 128K | 4K | ~40 RPM |
| NVIDIA NIM | nvidia/nemotron-nano-2-vl | 128K | 8K | ~40 RPM |
| NVIDIA NIM | minimax/minimax-m2.7 | 128K | 8K | ~40 RPM |
| NVIDIA NIM | + 90 more models | 131K | 131K | ~40 RPM |
| Ollama Cloud | llama3.1:cloud | 128K | 131K | Session/weekly limits (unpublished) |
| Ollama Cloud | deepseek-r1:cloud | 128K | 131K | Session/weekly limits (unpublished) |
| Ollama Cloud | qwen2.5:cloud | 128K | 131K | Session/weekly limits (unpublished) |
| Ollama Cloud | gemma2:cloud | 8K | 131K | Session/weekly limits (unpublished) |
| Ollama Cloud | mistral:cloud | 32K | 131K | Session/weekly limits (unpublished) |
| Ollama Cloud | + 400 more models | 131K | 131K | Session/weekly limits (unpublished) |
| OVHcloud AI Endpoints | Meta-Llama-3_3-70B-Instruct | 131K | 4K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | DeepSeek-R1-Distill-Llama-70B | 131K | 32K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3-Coder-30B-A3B-Instruct | 262K | 32K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen2.5-VL-72B-Instruct | 128K | 8K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Mistral-Nemo-Instruct-2407 | 128K | 4K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-8B | 32K | 4K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | Qwen3Guard-Gen-0.6B | 32K | 4K | 2 RPM (anonymous) |
| OVHcloud AI Endpoints | + 30 more models | 131K | 131K | 2 RPM (anonymous) |
| SiliconFlow | Qwen/Qwen3-8B | 131K | 131K | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | 33K | 16K | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 131K | 131K | 1,000 RPM, 50K TPM |
| SiliconFlow | THUDM/glm-4-9b-chat | 32K | 32K | 1,000 RPM, 50K TPM |
| SiliconFlow | THUDM/GLM-4.1V-9B-Thinking | 66K | 66K | 1,000 RPM, 50K TPM |
| SiliconFlow | deepseek-ai/DeepSeek-OCR | 131K | 8K | 1,000 RPM, 50K TPM |
| SiliconFlow | + embedding/speech models | 131K | 131K | 1,000 RPM, 50K TPM |
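Most of the providers above expose an OpenAI-compatible `/chat/completions` endpoint, though you should confirm this against each provider's docs. The sketch below builds such a request for an OpenRouter free model from the table and adds a simple client-side throttle for the listed RPM limits. The base URL is OpenRouter's documented endpoint; the `:free` model-id suffix follows OpenRouter's naming convention, and the API key and `max_tokens` value are placeholder assumptions.

```python
import json
import time

# Assumption: OpenRouter's OpenAI-compatible chat-completions endpoint.
BASE_URL = "https://openrouter.ai/api/v1/chat/completions"
# Free-tier model from the table; ':free' suffix per OpenRouter convention.
MODEL = "openai/gpt-oss-20b:free"


def build_request(prompt: str, api_key: str) -> tuple[dict, str]:
    """Return (headers, JSON body) for an OpenAI-style chat completion."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,  # stay well under the 8K output cap in the table
    })
    return headers, body


class RpmThrottle:
    """Client-side pacing for the per-minute limits listed in the table."""

    def __init__(self, rpm: int):
        self.interval = 60.0 / rpm  # minimum seconds between requests
        self.last = 0.0

    def wait(self) -> float:
        """Sleep until the next request is allowed; return seconds slept."""
        now = time.monotonic()
        delay = max(0.0, self.last + self.interval - now)
        if delay:
            time.sleep(delay)
        self.last = time.monotonic()
        return delay


# Usage (no network call here; send with requests/httpx yourself,
# e.g. requests.post(BASE_URL, headers=headers, data=body)):
headers, body = build_request("Hello", api_key="sk-...")  # placeholder key
throttle = RpmThrottle(rpm=30)  # e.g. Groq/Cerebras free-tier RPM above
```

Pacing requests client-side is worthwhile even though providers enforce limits server-side: it avoids burning daily quota (RPD) on requests that would only come back as 429s.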