
# Groq

FlowKit supports Groq's ultra-fast inference via the GroqAdapter.

## Setup

```bash
export GROQ_API_KEY=gsk_...
```

Get your API key from the [Groq Console](https://console.groq.com).
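Before constructing the adapter it can help to fail fast when the key is missing or malformed. The `assertGroqKey` helper below is not part of FlowKit; it is a small sketch that assumes Groq keys use the `gsk_` prefix shown above.

```typescript
// Hypothetical helper (not part of FlowKit): validates that a Groq API key
// is present and has the expected "gsk_" prefix before the app starts.
function assertGroqKey(key: string | undefined): string {
  if (!key) {
    throw new Error("GROQ_API_KEY is not set");
  }
  if (!key.startsWith("gsk_")) {
    throw new Error("GROQ_API_KEY does not look like a Groq key (expected gsk_ prefix)");
  }
  return key;
}

// Usage at startup:
// const apiKey = assertGroqKey(process.env.GROQ_API_KEY);
```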

## Usage

```typescript
import { FlowEngine, GroqAdapter } from "@andresaya/flowkit";

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: "llama-3.3-70b-versatile",
});

// `flow` and `storage` are your flow definition and storage backend
const engine = new FlowEngine(flow, { llm: adapter, storage });
```

## Configuration

```typescript
interface GroqConfig {
  /** Groq API key (required) */
  apiKey: string;
  /** Model to use (default: llama-3.3-70b-versatile) */
  model?: string;
  /** Temperature (default: 0) */
  temperature?: number;
  /** Max tokens (default: 4096) */
  maxTokens?: number;
  /** Timeout in ms (default: 60000) */
  timeout?: number;
  /** Enable streaming (default: false) */
  streaming?: boolean;
}
```
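Putting the options together, a fully specified adapter might look like the sketch below. Only `apiKey` is required; the other values are shown with non-default settings purely for illustration.

```typescript
import { GroqAdapter } from "@andresaya/flowkit";

// Every optional field overridden; the defaults are fine for most cases.
const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: "llama-3.1-8b-instant", // trade some quality for speed
  temperature: 0.3,              // slightly more varied replies
  maxTokens: 2048,               // cap response length
  timeout: 30_000,               // fail faster than the 60 s default
  streaming: true,               // stream tokens as they arrive
});
```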

## Available Models

| Model | Best For | Speed |
| --- | --- | --- |
| `llama-3.3-70b-versatile` | Best quality (recommended) | Fast |
| `llama-3.1-8b-instant` | Ultra fast | Fastest |
| `mixtral-8x7b-32768` | Good balance | Fast |
| `gemma2-9b-it` | Compact and fast | Fast |

## Example

```typescript
import {
  agent, flow, FlowEngine, MemoryStorage, GroqAdapter,
  name, yesNo
} from "@andresaya/flowkit";

const bot = agent("Alex")
  .personality("friendly")
  .build();

const myFlow = flow("greeting", bot)
  .ask("name", "What's your name?", name(), "user_name")
  .then("confirm")
  .ask("confirm", "Nice to meet you {{user_name}}! Need help?", yesNo(), "needs_help")
  .when({ yes: "help", no: "bye" })
  .say("help", "How can I help?")
  .done()
  .say("bye", "Goodbye!")
  .done()
  .build();

const engine = new FlowEngine(myFlow, {
  llm: new GroqAdapter({
    apiKey: process.env.GROQ_API_KEY!,
    model: "llama-3.3-70b-versatile",
  }),
  storage: new MemoryStorage(),
});

const result = await engine.start("session-1");
console.log(result.message);
```

## Features

| Feature | Supported |
| --- | --- |
| Streaming | Yes |
| Tool Calling | Yes |
| JSON Mode | Yes |
| Free Tier | Yes |

## Why Groq?

- **Ultra-fast inference** - 100+ tokens/second
- **Generous free tier** - great for development
- **Open source models** - Llama, Mixtral, Gemma

## Tips

1. **Use for high-volume applications** - extremely fast
2. **Great for development** - generous free tier
3. **Consider for production** - consistent low latency
4. **Check rate limits** - the free tier has request limits
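Tip 4 matters in practice: when a free-tier request limit is hit, retrying with a short backoff usually resolves it. The `withRetry` helper below is a generic sketch (not a FlowKit API) that retries a failed async call with exponential backoff.

```typescript
// Generic retry helper with exponential backoff; not part of FlowKit.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // Wait 500 ms, 1 s, 2 s, ... before the next attempt.
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage (hypothetical):
// const result = await withRetry(() => engine.start("session-1"));
```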

Released under the MIT License.