@cf/meta/llama-3.1-8b-instruct-fp8-fast — Free API
Created by Metacloudflare-workers-ai/cf-meta-llama-3-1-8b-instruct-fp8-fast What is @cf/meta/llama-3.1-8b-instruct-fp8-fast?
Llama 3.1 8B Instruct on Cloudflare Workers AI is a lightweight, edge-deployed version of Meta's 8B model optimized with FP8 quantization for fast inference. It is the fastest and most cost-efficient option in Cloudflare's free catalog — ideal for high-volume, low-latency tasks like chat routing, text classification, summarization, or simple Q&A where a 70B+ model would be overkill. Like all Cloudflare Workers AI models, it shares the 10,000 Neurons/day free pool and uses a Cloudflare-specific API format. For developers already on Cloudflare's ecosystem, it integrates directly with Workers and Pages with minimal cold start.
@cf/meta/llama-3.1-8b-instruct-fp8-fast API Code Example
Paste your API key and run. See the config generator for Claude Code, Cursor, and more tools.
Other Free Models from Cloudflare Workers AI
More About Cloudflare Workers AI
How to get an API key, rate limits, platform limitations, and tool configuration — everything you need to set up Cloudflare Workers AI as a free LLM API backend.
View Cloudflare Workers AI full guide →