Cloudflare Workers AI logo How to Get a Free Cloudflare Workers AI API Key (2026)

40 free models available — no credit card required. Get your Cloudflare Workers AI API key → Test free models →

💡
Need help setting up Cloudflare Workers AI?
Read our step-by-step tutorial on getting your free API key and 10,000 daily Neurons →

Cloudflare Workers AI FreeLLM Score

✅ 67/100 Solid Choice — Strong in wide model selection How we score →
🎁 Generosity 70 🌍 Access 75 📚 Breadth 85 ⚡ Reliability 70 🔌 Compat 35 🧠 Quality 65

All Free Cloudflare Workers AI Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Released Status
@cf/google/gemma-4-26b-a4b-it 256K 131K textimagereasoning 10K neurons/day (shared) Apr 2, 2026 Online
@cf/moonshotai/kimi-k2.7-code 262K 131K textcodeimagevideoreasoning 10K neurons/day (shared) Jun 12, 2026 Online
@cf/openai/gpt-oss-120b 128K 131K text 10K neurons/day (shared) Online
@cf/google/gemma-4-26b-a4b-it 8K 4K Apr 3, 2026 Online
@cf/zhipuai/glm-4.7-flash 131K 131K textreasoning 10K neurons/day (shared) Jan 19, 2026 Online
@cf/meta/llama-4-scout-17b-16e-instruct 10.0M 131K text 10K neurons/day (shared) Online
@cf/openai/gpt-oss-120b 8K 4K Online
@cf/nvidia/nemotron-3-120b-a12b 8K 4K Online
@cf/baai/bge-large-en-v1.5 8K 4K Online
@cf/mistralai/mistral-small-3.1-24b-instruct 128K 131K text 10K neurons/day (shared) Mar 17, 2025 Online
@cf/zai-org/glm-4.7-flash 8K 4K Online
@cf/qwen/qwq-32b 8K 4K Online
@cf/meta/llama-3.3-70b-instruct-fp8-fast 131K 131K text 10K neurons/day (shared) Dec 6, 2024 Online
@cf/deepseek-ai/deepseek-r1-distill-qwen-32b 32K 131K textreasoning 10K neurons/day (shared) Jan 20, 2025 Online
@cf/baai/bge-m3 8K 4K Online
@cf/google/gemma-2b-it-lora 8K 4K Online
@cf/moonshotai/kimi-k2.7-code 8K 4K Online
@cf/moonshotai/kimi-k2.6 8K 4K Online
@cf/ibm-granite/granite-4.0-h-micro 8K 4K Online
@cf/baai/bge-small-en-v1.5 8K 4K Online
@cf/zai-org/glm-5.2 8K 4K Online
@cf/baai/bge-base-en-v1.5 8K 4K Online
@cf/aisingapore/gemma-sea-lion-v4-27b-it 8K 4K Online
@cf/openai/gpt-oss-20b 8K 4K Online
@cf/meta/llama-4-scout-17b-16e-instruct 8K 4K Online
@cf/deepseek-ai/deepseek-r1-distill-qwen-32b 8K 4K reasoning Jan 29, 2025 Online
@cf/qwen/qwen2.5-coder-32b-instruct 8K 4K Nov 11, 2024 Online
@cf/mistral/mistral-7b-instruct-v0.2-lora 8K 4K Online
@cf/meta-llama/llama-2-7b-chat-hf-lora 8K 4K Online
@cf/google/gemma-7b-it-lora 8K 4K Online
@cf/mistralai/mistral-small-3.1-24b-instruct 8K 4K Mar 17, 2025 Online
@cf/qwen/qwen3-30b-a3b-fp8 8K 4K Apr 28, 2025 Online
@cf/meta/llama-3.3-70b-instruct-fp8-fast 8K 4K Dec 6, 2024 Online
@cf/meta/llama-3.2-3b-instruct 8K 4K Sep 25, 2024 Online
@cf/meta/llama-3.1-8b-instruct-fp8 8K 4K Jul 23, 2024 Online
@cf/meta/llama-3.2-1b-instruct 8K 4K Sep 25, 2024 Online
@cf/meta/llama-guard-3-8b 8K 4K Online
@cf/qwen/qwen3-embedding-0.6b 8K 4K embedding Online
@cf/pfnet/plamo-embedding-1b 8K 4K embedding Online
@cf/google/embeddinggemma-300m 8K 4K embedding Online

What is Cloudflare Workers AI?

Edge AI inference — 10,000 neurons/day, 50+ models.

Cloudflare Workers AI runs open-weight models directly on Cloudflare's global edge network. The free tier allocates 10,000 Neurons (compute units) per day, supporting 50+ models including Llama, Mistral, Gemma, DeepSeek, and Qwen. Unlike other providers, billing is based on Neurons rather than tokens, making it hard to predict exact request counts. Ideal for low-latency edge deployments.

  • 50+ models on the free tier
  • 10,000 Neurons/day
  • Global edge network for low latency
  • Text, image, audio, and embedding models

API Compatibility: OpenAI SDK-compatible (via REST)

How to Get a Cloudflare Workers AI API Key

  1. 1
    Sign up at dash.cloudflare.com Free account. No credit card.
  2. 2
    Go to Workers & Pages → AI
  3. 3
    Create an API token with Workers AI permissions
  4. 4
    Pick a model Llama 3.2 3B and Mistral 7B are reliable choices.
  5. 5
    Configure OpenAI client Base URL: https://api.cloudflare.com/client/v4/accounts/YOUR_ACCOUNT_ID/ai/run

Cloudflare Workers AI Free Tier Limits & Pricing

Credit Card Not required
Free Tier Permanently free
Context Range 8K – 10.0M
Total Models 40 free
Rate Limits 10K neurons/day (shared)
API Compatibility OpenAI SDK-compatible (via REST)

Cloudflare Workers AI API Setup Tutorial & Tools

Cloudflare Workers AI is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →

Use Cases

What Cloudflare Workers AI's free models are best for, based on aggregated model capabilities:

Chat 40 models Coding 2 models Reasoning 1 model

Limitations & Caveats

  • Neurons billing is opaque — hard to predict exact request counts
  • Model availability varies by Cloudflare region
  • 10,000 Neurons/day shared across all models

Frequently Asked Questions

How many requests is 10,000 Neurons on Cloudflare Workers AI?

It depends on the model and prompt length. For Llama 3.2 3B with a 500-token prompt, ~10,000 Neurons ≈ 200-400 requests/day. Larger models consume more Neurons per request. Monitor your usage in the Cloudflare dashboard.

Do I need a Cloudflare Workers plan to use Workers AI?

No — the free Workers plan includes 10,000 Neurons/day for AI inference. You don't need to deploy any Workers code; just use the AI API endpoint directly.

Is Cloudflare Workers AI good for production?

The free tier is great for prototyping and low-volume apps. For production, the paid tier offers higher limits and SLAs. The edge network provides low global latency, which is a unique advantage.

See our FAQ for common questions about free LLM APIs