# Groq
Autobot supports Groq as an LLM provider via its OpenAI-compatible API. Groq is known for ultra-fast inference powered by custom LPU hardware, offering models like Llama, Mixtral, and Gemma with extremely low latency.
## Setup
### 1. Get an API key
Create an API key at console.groq.com.
### 2. Configure credentials
Add your API key to the `.env` file:
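The variable name below is an assumption; adjust it to whatever name Autobot's `.env` loader expects.

```bash
# assumed variable name; replace the placeholder with your real key
GROQ_API_KEY=gsk_your_key_here
```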
Alternatively, Autobot's interactive setup can store the key for you.
### 3. Configure the provider
In `config.yml`:
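A minimal sketch, assuming the provider settings live under a `groq` block; the exact key layout may differ in your Autobot version (only `api_key` is required, see the configuration reference below):

```yaml
# model uses the groq/ prefix (see "Model naming" below)
model: "groq/llama-3.3-70b-versatile"

# assumed block name for provider settings
groq:
  api_key: "gsk_your_key_here"
```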
### 4. Verify
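One way to confirm the provider is active is to send a test message with debug logging enabled (see Troubleshooting below) and check that requests hit the Groq endpoint:

```
POST https://api.groq.com/openai/v1/chat/completions model=...
Response 200 (N bytes)
```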
## Model naming
Models use the `groq/` prefix followed by the Groq model ID:
```yaml
# Llama models
model: "groq/llama-3.3-70b-versatile"
model: "groq/llama-3.1-8b-instant"

# Mixtral
model: "groq/mixtral-8x7b-32768"

# Gemma
model: "groq/gemma2-9b-it"
```
The `groq/` prefix tells Autobot to route the request to the Groq API; it is stripped before the model ID is sent to the API.
See the full model list in the Groq docs.
## Configuration reference

| Field | Required | Default | Description |
|---|---|---|---|
| `api_key` | Yes | — | Groq API key (`gsk_...`) |
| `api_base` | No | `https://api.groq.com/openai/v1/chat/completions` | Custom API endpoint |
| `extra_headers` | No | — | Additional HTTP headers sent with every request |
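For illustration, the optional fields might be set like this; the surrounding `groq` block and the map format for `extra_headers` are assumptions, not guaranteed by the table above:

```yaml
groq:
  api_key: "gsk_your_key_here"
  # optional: point at a proxy or alternative endpoint
  api_base: "https://my-proxy.example.com/openai/v1/chat/completions"
  # optional: extra HTTP headers sent with every request
  extra_headers:
    X-Custom-Header: "value"
```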
## How it works
Groq uses the OpenAI-compatible Chat Completions API format:
- `Authorization: Bearer` header for authentication
- Standard message format with `role` and `content` fields
- Function calling via a `tools` array
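For illustration, a raw request in this format (Autobot sends it for you; shown here only to clarify the wire format) looks roughly like:

```bash
# direct call to Groq's OpenAI-compatible endpoint, not an Autobot command
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Note that the `groq/` prefix has already been stripped from the model ID at this point.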
Autobot detects Groq models by the groq keyword and routes to the correct endpoint automatically. Tools, MCP servers, plugins, and all other features work the same as with other providers.
## Voice transcription
Groq provides the Whisper API for voice transcription. When Groq is configured, voice messages are automatically transcribed using `whisper-large-v3-turbo`. No extra configuration is needed; the API key is reused from the provider config.
Groq is the preferred transcription provider when available (faster than OpenAI, free tier included). If both Groq and OpenAI are configured, Groq takes priority for transcription.
## Known limitations
- No streaming: responses are returned in full after the model finishes generating.
- Tool choice is always `auto`: there is no configuration to force a specific tool or disable tool use per request.
- Rate limits: Groq's free tier has token and request limits. Check console.groq.com for current limits.
## Troubleshooting
Enable debug logging to see request/response details, and look for:
- `POST https://api.groq.com/openai/v1/chat/completions model=...` confirms the provider is active
- `Response 200 (N bytes)` confirms the API responded
- `HTTP 4xx/5xx: ...` shows API errors with details
### Common issues
"No LLM provider configured" — Check that api_key is set and non-empty in config.yml.
"API error: Invalid API Key" — Invalid or revoked API key. Verify at console.groq.com.
"API error: Rate limit reached" — Too many requests or tokens per minute. Groq's free tier has lower limits — consider upgrading or spacing out requests.
"API error: model_not_found" — Model ID is wrong or has been deprecated. Check the Groq models page for current availability.