# LLM Providers

FlowKit supports multiple LLM providers through adapters. Each adapter implements the `LLMAdapter` interface.

## Supported Providers
| Provider | Local | Streaming | Tool Calling | Cost |
|---|---|---|---|---|
| Ollama | Yes | No | Yes | Free |
| OpenAI | No | Yes | Yes | Paid |
| OpenRouter | No | Yes | Yes | Free/Paid |
| Anthropic (Claude) | No | Yes | Yes | Paid |
| Google Gemini | No | Yes | Yes | Free/Paid |
| Groq | No | Yes | Yes | Free* |
Notes:
- Ollama Tool Calling: Requires models that support tools (Llama 3.1+, Qwen 3, Mistral Nemo, Command-R+, etc.)
- OpenRouter Free/Paid: OpenRouter offers many free models (DeepSeek, Gemma, some Llama variants) with rate limits, plus paid models.
- Gemini Free/Paid: Generous free tier with rate limits, paid for higher usage.
- Groq Free*: Free tier with rate limits (ultra-fast inference).
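If you need to select a provider programmatically, the capability matrix above can be mirrored in code. A minimal sketch (the data simply restates the table with the cost column omitted; the helper is our own, not part of FlowKit):

```typescript
// Capability data mirroring the provider table above (cost column omitted).
interface ProviderCaps {
  name: string;
  local: boolean;
  streaming: boolean;
  tools: boolean;
}

const providerCaps: ProviderCaps[] = [
  { name: "Ollama", local: true, streaming: false, tools: true },
  { name: "OpenAI", local: false, streaming: true, tools: true },
  { name: "OpenRouter", local: false, streaming: true, tools: true },
  { name: "Anthropic", local: false, streaming: true, tools: true },
  { name: "Google Gemini", local: false, streaming: true, tools: true },
  { name: "Groq", local: false, streaming: true, tools: true },
];

// Return the names of providers that satisfy every required capability.
function withCapabilities(
  required: Partial<Omit<ProviderCaps, "name">>
): string[] {
  return providerCaps
    .filter((p) =>
      Object.entries(required).every(
        ([key, value]) => p[key as keyof ProviderCaps] === value
      )
    )
    .map((p) => p.name);
}
```

For example, `withCapabilities({ streaming: true, tools: true })` excludes Ollama, while `withCapabilities({ local: true })` returns only Ollama.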
## Ollama (Local)
Run LLMs locally with Ollama.
### Setup

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull qwen3:4b
ollama pull llama3.2
```

### Usage

```typescript
import { FlowEngine, OllamaAdapter } from "@andresaya/flowkit";

const adapter = new OllamaAdapter({
  model: "qwen3:4b",
  baseUrl: "http://localhost:11434", // Optional; this is the default
});
const engine = new FlowEngine(flow, { llm: adapter, storage });
```

### Configuration

```typescript
interface OllamaConfig {
  model: string;        // Model name (required)
  baseUrl?: string;     // Ollama base URL
  temperature?: number; // 0-1, default varies by model
  timeout?: number;     // Request timeout in ms
}
```

### Recommended Models
| Model | Size | Best For |
|---|---|---|
| qwen3:4b | 4B | Best balance of speed and JSON extraction |
| qwen3:8b | 8B | Better reasoning |
| llama3.2 | 3B | Fast, but weaker extraction |
| llama3.1:8b | 8B | Good general purpose |
Tip: For strict mode with JSON extraction, qwen3:4b performs better than llama3.2.
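Before pointing the adapter at a model, you can check that it has actually been pulled by querying Ollama's `GET /api/tags` endpoint, which lists installed models. A sketch of the check (the `TagsResponse` interface is our own, covering only the field used here):

```typescript
// Subset of Ollama's GET /api/tags response: the installed model names.
interface TagsResponse {
  models: Array<{ name: string }>;
}

// Ollama reports fully tagged names (e.g. "qwen3:4b"); a bare name
// implies the ":latest" tag.
function hasModel(tags: TagsResponse, model: string): boolean {
  const want = model.includes(":") ? model : `${model}:latest`;
  return tags.models.some((m) => m.name === want);
}

// Usage against a running Ollama instance:
// const tags: TagsResponse = await (await fetch("http://localhost:11434/api/tags")).json();
// if (!hasModel(tags, "qwen3:4b")) console.warn("run: ollama pull qwen3:4b");
```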
## OpenAI
Use OpenAI's GPT models with streaming and tool calling.
### Setup

```bash
export OPENAI_API_KEY=sk-...
```

### Usage

```typescript
import { OpenAIAdapter } from "@andresaya/flowkit";

const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o-mini",
});
```

### Configuration

```typescript
interface OpenAIConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Model name, default "gpt-4o-mini"
  baseUrl?: string;     // Custom API URL
  temperature?: number; // 0-2, default 0
  timeout?: number;     // Request timeout in ms
  streaming?: boolean;  // Enable streaming
}
```

### Available Models
| Model | Context | Best For |
|---|---|---|
| gpt-4o-mini | 128K | Fast, cheap, good quality |
| gpt-4o | 128K | Best quality |
| gpt-5.2 | 400K | Complex tasks, coding, agentic workflows |
| gpt-5-mini | 400K | Faster, cost-efficient, well-defined tasks |
| gpt-5-nano | 400K | Fastest, most cost-efficient, basic tasks |
## Anthropic (Claude)

Use Anthropic's Claude models, which excel at reasoning and safety.
### Setup

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

### Usage

```typescript
import { AnthropicAdapter } from "@andresaya/flowkit";

const adapter = new AnthropicAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: "claude-3-5-sonnet-20241022",
});
```

### Configuration

```typescript
interface AnthropicConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Default: "claude-3-5-sonnet-20241022"
  maxTokens?: number;   // Default: 4096
  temperature?: number; // 0-1, default: 0
  timeout?: number;     // Request timeout in ms
  baseUrl?: string;     // Custom API URL
  streaming?: boolean;  // Enable streaming
}
```

### Available Models
| Model | Best For |
|---|---|
| claude-3-5-sonnet-20241022 | Best balance (recommended) |
| claude-3-5-haiku-20241022 | Fast and cheap |
| claude-3-opus-20240229 | Most capable |
## Google Gemini

Use Google's Gemini models with a generous free tier.
### Setup

```bash
export GOOGLE_AI_API_KEY=...
```

### Usage

```typescript
import { GeminiAdapter } from "@andresaya/flowkit";

const adapter = new GeminiAdapter({
  apiKey: process.env.GOOGLE_AI_API_KEY!,
  model: "gemini-1.5-flash",
});
```

### Configuration

```typescript
interface GeminiConfig {
  apiKey: string;           // API key (required)
  model?: string;           // Default: "gemini-1.5-flash"
  temperature?: number;     // 0-1, default: 0
  maxOutputTokens?: number; // Default: 4096
  timeout?: number;         // Request timeout in ms
  baseUrl?: string;         // Custom API URL
  streaming?: boolean;      // Enable streaming
}
```

### Available Models
| Model | Best For |
|---|---|
| gemini-1.5-flash | Fast and free-tier friendly |
| gemini-1.5-pro | More capable |
| gemini-2.0-flash-exp | Latest experimental |
## Groq
Ultra-fast inference with Groq's LPU technology. Free tier available!
### Setup

```bash
export GROQ_API_KEY=gsk_...
```

### Usage

```typescript
import { GroqAdapter } from "@andresaya/flowkit";

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: "llama-3.3-70b-versatile",
});
```

### Configuration

```typescript
interface GroqConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Default: "llama-3.3-70b-versatile"
  temperature?: number; // 0-2, default: 0
  maxTokens?: number;   // Default: 4096
  timeout?: number;     // Request timeout in ms
  baseUrl?: string;     // Custom API URL
  streaming?: boolean;  // Enable streaming
}
```

### Available Models
| Model | Best For |
|---|---|
| llama-3.3-70b-versatile | Best quality (recommended) |
| llama-3.1-8b-instant | Ultra fast |
| mixtral-8x7b-32768 | Good balance |
| gemma2-9b-it | Compact and fast |
Note: Groq offers extremely fast inference (100+ tokens/sec) with a generous free tier!
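Free tiers enforce rate limits, so longer flows can hit HTTP 429 mid-run. One hedged pattern is to wrap adapter calls in an exponential-backoff retry. The `isRateLimit` check below is an assumption about how errors surface; match it to how your adapter actually reports 429 responses:

```typescript
// Retry an async call on rate-limit errors with exponential backoff.
// `isRateLimit` is deliberately simple: adapt it to your adapter's
// actual error type for HTTP 429.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500,
  isRateLimit: (err: unknown) => boolean = (err) =>
    err instanceof Error && err.message.includes("429")
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRateLimit(err) || attempt >= retries) throw err;
      // Waits 500ms, 1s, 2s, ... between attempts by default.
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** attempt)
      );
    }
  }
}
```

Wrap whichever call ultimately hits the provider; anything past `retries` failed attempts rethrows the original error unchanged.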
## OpenRouter
Access 100+ models through a single API with OpenRouter.
### Setup

```bash
# Set API key
export OPENROUTER_API_KEY=sk-or-...
```

### Usage

```typescript
import { FlowEngine, OpenRouterAdapter } from "@andresaya/flowkit";

const adapter = new OpenRouterAdapter({
  apiKey: process.env.OPENROUTER_API_KEY!,
  model: "anthropic/claude-3-haiku",
});

const engine = new FlowEngine(flow, { llm: adapter, storage });
```

### Configuration

```typescript
interface OpenRouterConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Model ID, default "openai/gpt-4o-mini"
  appName?: string;     // Your app name for rankings
  siteUrl?: string;     // Your site URL for rankings
  temperature?: number; // Model-specific range
  timeout?: number;     // Request timeout in ms
  streaming?: boolean;  // Enable streaming
}
```

### Popular Models
| Model ID | Provider | Cost |
|---|---|---|
| openai/gpt-4o-mini | OpenAI | $ |
| openai/gpt-4o | OpenAI | $$$ |
| anthropic/claude-3-haiku | Anthropic | $ |
| anthropic/claude-3-sonnet | Anthropic | $$ |
| anthropic/claude-3-opus | Anthropic | $$$ |
| google/gemini-pro | Google | $ |
| meta-llama/llama-3-70b-instruct | Meta | $$ |
| mistralai/mistral-large | Mistral | $$ |
See the OpenRouter models page for the full list.
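OpenRouter model IDs are namespaced as `provider/model`, which is handy for things like per-provider logging or cost buckets. A small helper for splitting them (our own convention, not part of FlowKit):

```typescript
// Split an OpenRouter model ID ("provider/model") into its two parts.
function parseModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`not a namespaced OpenRouter model ID: ${id}`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

For example, `parseModelId("anthropic/claude-3-haiku")` yields `{ provider: "anthropic", model: "claude-3-haiku" }`.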
## Creating Custom Adapters

Implement the `LLMAdapter` interface:

```typescript
import type { LLMAdapter, LLMResponse, JsonValue, StreamHandler, ToolDefinition } from "@andresaya/flowkit";

class CustomAdapter implements LLMAdapter {
  readonly supportsStreaming = true;
  readonly supportsTools = true;

  async chat(args: {
    systemPrompt: string;
    messages: Array<{ role: "user" | "assistant"; content: string }>;
    userMessage: string;
    responseSchema: JsonValue;
    tools?: ToolDefinition[];
  }): Promise<LLMResponse> {
    // Call your provider and return a structured response
    return {
      understood: true,
      extracted: null,
      confidence: 1,
      response: "OK",
      nextStep: null,
      actions: [],
    };
  }

  async chatStream(
    args: {
      systemPrompt: string;
      messages: Array<{ role: "user" | "assistant"; content: string }>;
      userMessage: string;
      responseSchema: JsonValue;
      tools?: ToolDefinition[];
    },
    onChunk: StreamHandler
  ): Promise<LLMResponse> {
    onChunk({ type: "text", content: "Hello" });
    onChunk({ type: "done" });
    return {
      understood: true,
      extracted: null,
      confidence: 1,
      response: "Hello",
      nextStep: null,
      actions: [],
    };
  }
}
```

## Environment Variables

```bash
# .env file
# OpenAI
OPENAI_API_KEY=sk-...
# OpenRouter
OPENROUTER_API_KEY=sk-or-...
# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
# Google Gemini
GOOGLE_AI_API_KEY=...
# Groq
GROQ_API_KEY=gsk_...
# Ollama (optional, defaults to localhost)
OLLAMA_HOST=http://localhost:11434
```

## Choosing a Provider
| Use Case | Recommended |
|---|---|
| Development/Testing | Ollama (free, local) |
| Production (budget) | OpenAI gpt-4o-mini |
| Production (quality) | OpenAI gpt-4o or Claude |
| Model experimentation | OpenRouter |
| Privacy-sensitive | Ollama (local) |
| Streaming UX | OpenAI or OpenRouter |
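The table above can also drive a simple runtime default: pick whichever provider has credentials configured and fall back to local Ollama, which needs no key. A sketch (the env var names match the setup sections above; the precedence order and default models are just one reasonable choice, not a FlowKit API):

```typescript
interface ProviderChoice {
  provider: "openai" | "anthropic" | "groq" | "openrouter" | "ollama";
  model: string;
}

// Prefer hosted providers when a key is present; otherwise fall back to
// local Ollama.
function chooseProvider(
  env: Record<string, string | undefined>
): ProviderChoice {
  if (env.OPENAI_API_KEY) return { provider: "openai", model: "gpt-4o-mini" };
  if (env.ANTHROPIC_API_KEY)
    return { provider: "anthropic", model: "claude-3-5-sonnet-20241022" };
  if (env.GROQ_API_KEY)
    return { provider: "groq", model: "llama-3.3-70b-versatile" };
  if (env.OPENROUTER_API_KEY)
    return { provider: "openrouter", model: "openai/gpt-4o-mini" };
  return { provider: "ollama", model: "qwen3:4b" };
}

// Typical usage: chooseProvider(process.env), then construct the
// matching adapter from the sections above.
```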