# Groq

FlowKit supports Groq's ultra-fast inference via the `GroqAdapter`.
## Setup

```bash
export GROQ_API_KEY=gsk_...
```

Get your API key from the Groq Console.
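The examples below read the key from `process.env`, so a quick guard in your start script can catch a missing key before the adapter fails at runtime. This is a generic shell sketch, not part of FlowKit:

```shell
# Generic sanity check (not part of FlowKit): report whether the key is set.
if [ -z "${GROQ_API_KEY:-}" ]; then
  echo "GROQ_API_KEY is not set"
else
  echo "GROQ_API_KEY is set"
fi
```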
## Usage

```typescript
import { FlowEngine, GroqAdapter } from "@andresaya/flowkit";

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: "llama-3.3-70b-versatile",
});

const engine = new FlowEngine(flow, { llm: adapter, storage });
```

## Configuration
```typescript
interface GroqConfig {
  /** Groq API key (required) */
  apiKey: string;
  /** Model to use (default: llama-3.3-70b-versatile) */
  model?: string;
  /** Temperature (default: 0) */
  temperature?: number;
  /** Max tokens (default: 4096) */
  maxTokens?: number;
  /** Timeout in ms (default: 60000) */
  timeout?: number;
  /** Enable streaming (default: false) */
  streaming?: boolean;
}
```

## Available Models
| Model | Best For | Speed |
|---|---|---|
| `llama-3.3-70b-versatile` | Best quality (recommended) | Fast |
| `llama-3.1-8b-instant` | Ultra fast | Fastest |
| `mixtral-8x7b-32768` | Good balance | Fast |
| `gemma2-9b-it` | Compact and fast | Fast |
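To illustrate the configuration surface, here is an options object matching the `GroqConfig` interface with every optional field spelled out. The values are arbitrary examples, not recommended settings:

```typescript
// GroqConfig-shaped object; values are illustrative, not recommended defaults.
const groqOptions = {
  apiKey: "gsk_placeholder",     // required; in practice: process.env.GROQ_API_KEY!
  model: "llama-3.1-8b-instant", // override the default llama-3.3-70b-versatile
  temperature: 0.2,              // default: 0
  maxTokens: 2048,               // default: 4096
  timeout: 30_000,               // 30 s instead of the 60 s default
  streaming: true,               // default: false
};

console.log(groqOptions.model); // → "llama-3.1-8b-instant"
```

Every field except `apiKey` can be omitted, in which case the defaults documented in the interface apply.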
## Example

```typescript
import {
  agent, flow, FlowEngine, MemoryStorage, GroqAdapter,
  name, yesNo
} from "@andresaya/flowkit";

const bot = agent("Alex")
  .personality("friendly")
  .build();

const myFlow = flow("greeting", bot)
  .ask("name", "What's your name?", name(), "user_name")
    .then("confirm")
  .ask("confirm", "Nice to meet you {{user_name}}! Need help?", yesNo(), "needs_help")
    .when({ yes: "help", no: "bye" })
  .say("help", "How can I help?")
    .done()
  .say("bye", "Goodbye!")
    .done()
  .build();

const engine = new FlowEngine(myFlow, {
  llm: new GroqAdapter({
    apiKey: process.env.GROQ_API_KEY!,
    model: "llama-3.3-70b-versatile",
  }),
  storage: new MemoryStorage(),
});

const result = await engine.start("session-1");
console.log(result.message);
```

## Features
| Feature | Supported |
|---|---|
| Streaming | Yes |
| Tool Calling | Yes |
| JSON Mode | Yes |
| Free Tier | Yes |
## Why Groq?

- **Ultra-fast inference** - 100+ tokens/second
- **Generous free tier** - Great for development
- **Open-source models** - Llama, Mixtral, Gemma
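The throughput figure translates directly into response latency. As a rough back-of-the-envelope calculation (the function and numbers here are illustrative, not a FlowKit API):

```typescript
// Rough time-to-last-token estimate at a given decode speed.
function estimatedSeconds(outputTokens: number, tokensPerSecond: number): number {
  return outputTokens / tokensPerSecond;
}

// A 500-token reply at 100 tokens/second streams out in about 5 seconds.
console.log(estimatedSeconds(500, 100)); // → 5
```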
## Tips

- **Use for high-volume applications** - Extremely fast
- **Great for development** - Generous free tier
- **Consider for production** - Consistent low latency
- **Check rate limits** - Free tier has request limits
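Because the free tier enforces request limits, wrapping engine calls in a retry with exponential backoff can smooth over occasional rate-limit errors. This is a generic pattern, not a FlowKit API; the `withRetry` helper and its parameters are made up for illustration:

```typescript
// Generic exponential-backoff retry helper (not part of FlowKit).
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;
      // Back off exponentially: 500 ms, 1 s, 2 s, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Hypothetical usage around an engine call:
// const result = await withRetry(() => engine.start("session-1"));
```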