
# LLM Providers

FlowKit supports multiple LLM providers through adapters. Each adapter implements the LLMAdapter interface.

## Supported Providers

| Provider | Local | Streaming | Tool Calling | Cost |
| --- | --- | --- | --- | --- |
| Ollama | Yes | No | Yes | Free |
| OpenAI | No | Yes | Yes | Paid |
| OpenRouter | No | Yes | Yes | Free/Paid |
| Anthropic (Claude) | No | Yes | Yes | Paid |
| Google Gemini | No | Yes | Yes | Free/Paid |
| Groq | No | Yes | Yes | Free* |

Notes:

- Ollama Tool Calling: Requires models that support tools (Llama 3.1+, Qwen 3, Mistral Nemo, Command-R+, etc.)
- OpenRouter Free/Paid: OpenRouter offers many free models (DeepSeek, Gemma, some Llama variants) with rate limits, plus paid models.
- Gemini Free/Paid: Generous free tier with rate limits, paid tiers for higher usage.
- Groq Free*: Free tier with rate limits (ultra-fast inference).
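Because every adapter implements the same `LLMAdapter` interface, application code can branch on capability flags rather than provider names. A minimal sketch, assuming only the `readonly` flags shown later under Creating Custom Adapters (`AdapterLike` and `pickStreaming` are illustrative, not part of FlowKit):

```typescript
// Minimal shape mirroring the capability flags declared on LLMAdapter.
interface AdapterLike {
  readonly supportsStreaming: boolean;
  readonly supportsTools: boolean;
}

// Illustrative helper: prefer the first adapter that can stream,
// falling back to the first adapter in the list (or undefined if empty).
function pickStreaming<T extends AdapterLike>(adapters: T[]): T | undefined {
  return adapters.find((a) => a.supportsStreaming) ?? adapters[0];
}
```

This keeps provider-specific knowledge out of flow code: swapping OpenAI for Groq requires no changes where the adapter is consumed.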

## Ollama (Local)

Run LLMs locally with Ollama.

### Setup

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull qwen3:4b
ollama pull llama3.2
```

### Usage

```typescript
import { OllamaAdapter } from "@andresaya/flowkit";

const adapter = new OllamaAdapter({
  model: "qwen3:4b",
  baseUrl: "http://localhost:11434", // Optional; this is the default
});

const engine = new FlowEngine(flow, { llm: adapter, storage });
```

### Configuration

```typescript
interface OllamaConfig {
  model: string;           // Model name (required)
  baseUrl?: string;        // Ollama base URL
  temperature?: number;    // 0-1, default varies by model
  timeout?: number;        // Request timeout in ms
}
```

### Recommended Models

| Model | Size | Best For |
| --- | --- | --- |
| qwen3:4b | 4B | Best balance of speed and JSON extraction |
| qwen3:8b | 8B | Better reasoning |
| llama3.2 | 3B | Fast, but weaker extraction |
| llama3.1:8b | 8B | Good general purpose |

Tip: For strict mode with JSON extraction, qwen3:4b performs better than llama3.2.
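Before constructing the adapter, you can confirm the model is already pulled by querying Ollama's `GET /api/tags` endpoint, which lists local models. A minimal sketch written against the documented Ollama HTTP API (`OllamaTags` and `hasModel` are illustrative, not part of FlowKit):

```typescript
// Shape of Ollama's GET /api/tags response (list of local models).
interface OllamaTags {
  models: Array<{ name: string }>;
}

// Returns true if the given model tag is already pulled locally.
function hasModel(tags: OllamaTags, model: string): boolean {
  return tags.models.some((m) => m.name === model);
}

// Usage (assumes Ollama is running on the default port):
// const tags: OllamaTags =
//   await (await fetch("http://localhost:11434/api/tags")).json();
// if (!hasModel(tags, "qwen3:4b")) {
//   // run: ollama pull qwen3:4b
// }
```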


## OpenAI

Use OpenAI's GPT models with streaming and tool calling.

### Setup

```bash
export OPENAI_API_KEY=sk-...
```

### Usage

```typescript
import { OpenAIAdapter } from "@andresaya/flowkit";

const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o-mini",
});
```

### Configuration

```typescript
interface OpenAIConfig {
  apiKey: string;          // API key (required)
  model?: string;          // Model name, default "gpt-4o-mini"
  baseUrl?: string;        // Custom API URL
  temperature?: number;    // 0-2, default 0
  timeout?: number;        // Request timeout in ms
  streaming?: boolean;     // Enable streaming
}
```

### Available Models

| Model | Context | Best For |
| --- | --- | --- |
| gpt-4o-mini | 128K | Fast, cheap, good quality |
| gpt-4o | 128K | Best quality |
| gpt-5.2 | 400K | Complex tasks, coding, agentic workflows |
| gpt-5-mini | 400K | Faster, cost-efficient, well-defined tasks |
| gpt-5-nano | 400K | Fastest, most cost-efficient, basic tasks |
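The context windows above can be checked before sending a large prompt. A minimal sketch built from the table's numbers (`contextWindow` and `fits` are illustrative helpers, not part of FlowKit):

```typescript
// Context windows from the table above, in tokens.
const contextWindow: Record<string, number> = {
  "gpt-4o-mini": 128_000,
  "gpt-4o": 128_000,
  "gpt-5.2": 400_000,
  "gpt-5-mini": 400_000,
  "gpt-5-nano": 400_000,
};

// Returns true if a prompt of the given token count fits the model's window.
// Unknown models are treated as not fitting.
function fits(model: string, tokens: number): boolean {
  return (contextWindow[model] ?? 0) >= tokens;
}
```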

## Anthropic (Claude)

Use Anthropic's Claude models, which excel at reasoning and safety.

### Setup

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

### Usage

```typescript
import { AnthropicAdapter } from "@andresaya/flowkit";

const adapter = new AnthropicAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: "claude-3-5-sonnet-20241022",
});
```

### Configuration

```typescript
interface AnthropicConfig {
  apiKey: string;          // API key (required)
  model?: string;          // Default: "claude-3-5-sonnet-20241022"
  maxTokens?: number;      // Default: 4096
  temperature?: number;    // 0-1, default: 0
  timeout?: number;        // Request timeout in ms
  baseUrl?: string;        // Custom API URL
  streaming?: boolean;     // Enable streaming
}
```

### Available Models

| Model | Best For |
| --- | --- |
| claude-3-5-sonnet-20241022 | Best balance (recommended) |
| claude-3-5-haiku-20241022 | Fast and cheap |
| claude-3-opus-20240229 | Most capable |

## Google Gemini

Use Google's Gemini models with a generous free tier.

### Setup

```bash
export GOOGLE_AI_API_KEY=...
```

### Usage

```typescript
import { GeminiAdapter } from "@andresaya/flowkit";

const adapter = new GeminiAdapter({
  apiKey: process.env.GOOGLE_AI_API_KEY!,
  model: "gemini-1.5-flash",
});
```

### Configuration

```typescript
interface GeminiConfig {
  apiKey: string;           // API key (required)
  model?: string;           // Default: "gemini-1.5-flash"
  temperature?: number;     // 0-1, default: 0
  maxOutputTokens?: number; // Default: 4096
  timeout?: number;         // Request timeout in ms
  baseUrl?: string;         // Custom API URL
  streaming?: boolean;      // Enable streaming
}
```

### Available Models

| Model | Best For |
| --- | --- |
| gemini-1.5-flash | Fast and free tier friendly |
| gemini-1.5-pro | More capable |
| gemini-2.0-flash-exp | Latest experimental |

## Groq

Ultra-fast inference with Groq's LPU technology. Free tier available!

### Setup

```bash
export GROQ_API_KEY=gsk_...
```

### Usage

```typescript
import { GroqAdapter } from "@andresaya/flowkit";

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: "llama-3.3-70b-versatile",
});
```

### Configuration

```typescript
interface GroqConfig {
  apiKey: string;          // API key (required)
  model?: string;          // Default: "llama-3.3-70b-versatile"
  temperature?: number;    // 0-2, default: 0
  maxTokens?: number;      // Default: 4096
  timeout?: number;        // Request timeout in ms
  baseUrl?: string;        // Custom API URL
  streaming?: boolean;     // Enable streaming
}
```

### Available Models

| Model | Best For |
| --- | --- |
| llama-3.3-70b-versatile | Best quality (recommended) |
| llama-3.1-8b-instant | Ultra fast |
| mixtral-8x7b-32768 | Good balance |
| gemma2-9b-it | Compact and fast |

Note: Groq offers extremely fast inference (100+ tokens/sec) with a generous free tier!


## OpenRouter

Access 100+ models through a single API with OpenRouter.

### Setup

```bash
# Set API key
export OPENROUTER_API_KEY=sk-or-...
```

### Usage

```typescript
import { OpenRouterAdapter } from "@andresaya/flowkit";

const adapter = new OpenRouterAdapter({
  apiKey: process.env.OPENROUTER_API_KEY!,
  model: "anthropic/claude-3-haiku",
});

const engine = new FlowEngine(flow, { llm: adapter, storage });
```

### Configuration

```typescript
interface OpenRouterConfig {
  apiKey: string;          // API key (required)
  model?: string;          // Model ID, default "openai/gpt-4o-mini"
  appName?: string;        // Your app name for rankings
  siteUrl?: string;        // Your site URL for rankings
  temperature?: number;    // Model-specific range
  timeout?: number;        // Request timeout in ms
  streaming?: boolean;     // Enable streaming
}
```

### Popular Models

| Model ID | Provider | Cost |
| --- | --- | --- |
| openai/gpt-4o-mini | OpenAI | $ |
| openai/gpt-4o | OpenAI | $$$ |
| anthropic/claude-3-haiku | Anthropic | $ |
| anthropic/claude-3-sonnet | Anthropic | $$ |
| anthropic/claude-3-opus | Anthropic | $$$ |
| google/gemini-pro | Google | $ |
| meta-llama/llama-3-70b-instruct | Meta | $$ |
| mistralai/mistral-large | Mistral | $$ |

See OpenRouter Models for the full list.


## Creating Custom Adapters

Implement the LLMAdapter interface:

```typescript
import type { LLMAdapter, LLMResponse, JsonValue, StreamHandler, ToolDefinition } from "@andresaya/flowkit";

class CustomAdapter implements LLMAdapter {
  readonly supportsStreaming = true;
  readonly supportsTools = true;

  async chat(args: {
    systemPrompt: string;
    messages: Array<{ role: "user" | "assistant"; content: string }>;
    userMessage: string;
    responseSchema: JsonValue;
    tools?: ToolDefinition[];
  }): Promise<LLMResponse> {
    // Call your provider and return a structured response
    return {
      understood: true,
      extracted: null,
      confidence: 1,
      response: "OK",
      nextStep: null,
      actions: [],
    };
  }

  async chatStream(
    args: {
      systemPrompt: string;
      messages: Array<{ role: "user" | "assistant"; content: string }>;
      userMessage: string;
      responseSchema: JsonValue;
      tools?: ToolDefinition[];
    },
    onChunk: StreamHandler
  ): Promise<LLMResponse> {
    onChunk({ type: "text", content: "Hello" });
    onChunk({ type: "done" });
    return {
      understood: true,
      extracted: null,
      confidence: 1,
      response: "Hello",
      nextStep: null,
      actions: [],
    };
  }
}
```
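When testing a streaming adapter, it helps to collect the emitted chunks into a final string. A minimal sketch (the `StreamChunk` shape mirrors the chunks emitted by `chatStream` above; `makeCollector` is illustrative, not part of FlowKit):

```typescript
// Chunk shape mirroring what chatStream emits above.
type StreamChunk = { type: "text"; content: string } | { type: "done" };

// Illustrative collector: accumulates text chunks until a "done" chunk arrives.
function makeCollector() {
  let text = "";
  let done = false;
  const onChunk = (chunk: StreamChunk): void => {
    if (chunk.type === "text") text += chunk.content;
    else done = true;
  };
  return { onChunk, result: () => ({ text, done }) };
}
```

Passing `makeCollector().onChunk` as the stream handler lets a test assert on the accumulated text independently of the final `LLMResponse`.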

## Environment Variables

```bash
# .env file

# OpenAI
OPENAI_API_KEY=sk-...

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Google Gemini
GOOGLE_AI_API_KEY=...

# Groq
GROQ_API_KEY=gsk_...

# OpenRouter
OPENROUTER_API_KEY=sk-or-...

# Ollama (optional, defaults to localhost)
OLLAMA_HOST=http://localhost:11434
```
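It is worth failing fast at startup if a required key is absent, rather than letting the first LLM call error out. A minimal sketch (`missingKeys` is an illustrative helper, not part of FlowKit):

```typescript
// Returns the names of required env vars that are unset or empty.
function missingKeys(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((k) => !env[k]);
}

// Usage at startup:
// const missing = missingKeys(process.env, ["OPENAI_API_KEY"]);
// if (missing.length) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```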

## Choosing a Provider

| Use Case | Recommended |
| --- | --- |
| Development/Testing | Ollama (free, local) |
| Production (budget) | OpenAI gpt-4o-mini |
| Production (quality) | OpenAI gpt-4o or Claude |
| Model experimentation | OpenRouter |
| Privacy-sensitive | Ollama (local) |
| Streaming UX | OpenAI or OpenRouter |
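When the provider choice is driven by configuration, the table above can be encoded as a simple lookup. A minimal sketch (`recommended`, the use-case keys, and `providerFor` are illustrative, not part of FlowKit):

```typescript
// Recommendations from the table above, keyed by (hypothetical) use-case names.
const recommended: Record<string, { provider: string; model?: string }> = {
  development: { provider: "ollama", model: "qwen3:4b" },
  "production-budget": { provider: "openai", model: "gpt-4o-mini" },
  "production-quality": { provider: "openai", model: "gpt-4o" },
  experimentation: { provider: "openrouter" },
  "privacy-sensitive": { provider: "ollama" },
};

// Returns the recommended provider for a use case, or undefined if unknown.
function providerFor(useCase: string): string | undefined {
  return recommended[useCase]?.provider;
}
```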

Released under the MIT License.