# LLM Providers

FlowKit supports multiple LLM providers through adapters. Each adapter implements the `LLMAdapter` interface.

## Supported Providers
| Provider | Local | Streaming | Tool Calling | Cost |
|---|---|---|---|---|
| Ollama | Yes | No | Yes | Free |
| OpenAI | No | Yes | Yes | Paid |
| OpenRouter | No | Yes | Yes | Free/Paid |
| Anthropic (Claude) | No | Yes | Yes | Paid |
| Google Gemini | No | Yes | Yes | Free/Paid |
| Groq | No | Yes | Yes | Free* |
Notes:
- Ollama Tool Calling: Requires models that support tools (Llama 3.1+, Qwen 3, Mistral Nemo, Command-R+, etc.)
- OpenRouter Free/Paid: OpenRouter offers many free models (DeepSeek, Gemma, some Llama variants) with rate limits, plus paid models.
- Gemini Free/Paid: Generous free tier with rate limits, paid for higher usage.
- Groq Free*: Free tier with rate limits (ultra-fast inference).
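If you need to select a provider programmatically, the capability matrix above can be mirrored in code. A minimal sketch (the data simply restates the table with the cost column omitted; the helper is our own, not part of FlowKit):

```typescript
// Capability data mirroring the provider table above (cost column omitted).
interface ProviderCaps {
  name: string;
  local: boolean;
  streaming: boolean;
  tools: boolean;
}

const providerCaps: ProviderCaps[] = [
  { name: "Ollama", local: true, streaming: false, tools: true },
  { name: "OpenAI", local: false, streaming: true, tools: true },
  { name: "OpenRouter", local: false, streaming: true, tools: true },
  { name: "Anthropic", local: false, streaming: true, tools: true },
  { name: "Google Gemini", local: false, streaming: true, tools: true },
  { name: "Groq", local: false, streaming: true, tools: true },
];

// Return the names of providers that satisfy every required capability.
function withCapabilities(
  required: Partial<Omit<ProviderCaps, "name">>
): string[] {
  return providerCaps
    .filter((p) =>
      Object.entries(required).every(
        ([key, value]) => p[key as keyof ProviderCaps] === value
      )
    )
    .map((p) => p.name);
}
```

For example, `withCapabilities({ streaming: true, tools: true })` excludes Ollama, while `withCapabilities({ local: true })` returns only Ollama.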
## Ollama (Local)
Run LLMs locally with Ollama.
### Setup

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull qwen3:4b
ollama pull llama3.2
```

### Usage

```typescript
import { FlowEngine, OllamaAdapter } from "@andresaya/flowkit";

const adapter = new OllamaAdapter({
  model: "qwen3:4b",
  baseUrl: "http://localhost:11434", // Optional; this is the default
});
const engine = new FlowEngine(flow, { llm: adapter, storage });
```

### Configuration

```typescript
interface OllamaConfig {
  model: string;        // Model name (required)
  baseUrl?: string;     // Ollama base URL
  temperature?: number; // 0-1, default varies by model
  timeout?: number;     // Request timeout in ms
}
```

### Recommended Models
| Model | Size | Best For |
|---|---|---|
| qwen3:4b | 4B | Best balance of speed and JSON extraction |
| qwen3:8b | 8B | Better reasoning |
| llama3.2 | 3B | Fast, but weaker extraction |
| llama3.1:8b | 8B | Good general purpose |
Tip: For strict mode with JSON extraction, qwen3:4b performs better than llama3.2.
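Before pointing the adapter at a model, you can check that it has actually been pulled by querying Ollama's `GET /api/tags` endpoint, which lists installed models. A sketch of the check (the `TagsResponse` interface is our own, covering only the field used here):

```typescript
// Subset of Ollama's GET /api/tags response: the installed model names.
interface TagsResponse {
  models: Array<{ name: string }>;
}

// Ollama reports fully tagged names (e.g. "qwen3:4b"); a bare name
// implies the ":latest" tag.
function hasModel(tags: TagsResponse, model: string): boolean {
  const want = model.includes(":") ? model : `${model}:latest`;
  return tags.models.some((m) => m.name === want);
}

// Usage against a running Ollama instance:
// const tags: TagsResponse = await (await fetch("http://localhost:11434/api/tags")).json();
// if (!hasModel(tags, "qwen3:4b")) console.warn("run: ollama pull qwen3:4b");
```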
## OpenAI
Use OpenAI's GPT models with streaming and tool calling.
### Setup

```bash
export OPENAI_API_KEY=sk-...
```

### Usage

```typescript
import { OpenAIAdapter } from "@andresaya/flowkit";

const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-4o-mini",
});
```

### Configuration

```typescript
interface OpenAIConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Model name, default "gpt-4o-mini"
  baseUrl?: string;     // Custom API URL
  temperature?: number; // 0-2, default 0
  timeout?: number;     // Request timeout in ms
  streaming?: boolean;  // Enable streaming
}
```

### Available Models
| Model | Context | Best For |
|---|---|---|
| gpt-4o-mini | 128K | Fast, cheap, good quality |
| gpt-4o | 128K | Best quality |
| gpt-5.2 | 400K | Complex tasks, coding, agentic workflows |
| gpt-5-mini | 400K | Faster, cost-efficient, well-defined tasks |
| gpt-5-nano | 400K | Fastest, most cost-efficient, basic tasks |
## Anthropic (Claude)

Use Anthropic's Claude models, which excel at reasoning and safety.
### Setup

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

### Usage

```typescript
import { AnthropicAdapter } from "@andresaya/flowkit";

const adapter = new AnthropicAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: "claude-3-5-sonnet-20241022",
});
```

### Configuration

```typescript
interface AnthropicConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Default: "claude-3-5-sonnet-20241022"
  maxTokens?: number;   // Default: 4096
  temperature?: number; // 0-1, default: 0
  timeout?: number;     // Request timeout in ms
  baseUrl?: string;     // Custom API URL
  streaming?: boolean;  // Enable streaming
}
```

### Available Models
| Model | Best For |
|---|---|
| claude-3-5-sonnet-20241022 | Best balance (recommended) |
| claude-3-5-haiku-20241022 | Fast and cheap |
| claude-3-opus-20240229 | Most capable |
## Google Gemini

Use Google's Gemini models with a generous free tier.
### Setup

```bash
export GOOGLE_AI_API_KEY=...
```

### Usage

```typescript
import { GeminiAdapter } from "@andresaya/flowkit";

const adapter = new GeminiAdapter({
  apiKey: process.env.GOOGLE_AI_API_KEY!,
  model: "gemini-1.5-flash",
});
```

### Configuration

```typescript
interface GeminiConfig {
  apiKey: string;           // API key (required)
  model?: string;           // Default: "gemini-1.5-flash"
  temperature?: number;     // 0-1, default: 0
  maxOutputTokens?: number; // Default: 4096
  timeout?: number;         // Request timeout in ms
  baseUrl?: string;         // Custom API URL
  streaming?: boolean;      // Enable streaming
}
```

### Available Models
| Model | Best For |
|---|---|
| gemini-1.5-flash | Fast and free-tier friendly |
| gemini-1.5-pro | More capable |
| gemini-2.0-flash-exp | Latest experimental |
## Groq
Ultra-fast inference with Groq's LPU technology. Free tier available!
### Setup

```bash
export GROQ_API_KEY=gsk_...
```

### Usage

```typescript
import { GroqAdapter } from "@andresaya/flowkit";

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: "llama-3.3-70b-versatile",
});
```

### Configuration

```typescript
interface GroqConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Default: "llama-3.3-70b-versatile"
  temperature?: number; // 0-2, default: 0
  maxTokens?: number;   // Default: 4096
  timeout?: number;     // Request timeout in ms
  baseUrl?: string;     // Custom API URL
  streaming?: boolean;  // Enable streaming
}
```

### Available Models
| Model | Best For |
|---|---|
| llama-3.3-70b-versatile | Best quality (recommended) |
| llama-3.1-8b-instant | Ultra fast |
| mixtral-8x7b-32768 | Good balance |
| gemma2-9b-it | Compact and fast |
Note: Groq offers extremely fast inference (100+ tokens/sec) with a generous free tier!
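Free tiers enforce rate limits, so longer flows can hit HTTP 429 mid-run. One hedged pattern is to wrap adapter calls in an exponential-backoff retry. The `isRateLimit` check below is an assumption about how errors surface; match it to how your adapter actually reports 429 responses:

```typescript
// Retry an async call on rate-limit errors with exponential backoff.
// `isRateLimit` is deliberately simple: adapt it to your adapter's
// actual error type for HTTP 429.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500,
  isRateLimit: (err: unknown) => boolean = (err) =>
    err instanceof Error && err.message.includes("429")
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRateLimit(err) || attempt >= retries) throw err;
      // Waits 500ms, 1s, 2s, ... between attempts by default.
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** attempt)
      );
    }
  }
}
```

Wrap whichever call ultimately hits the provider; anything past `retries` failed attempts rethrows the original error unchanged.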
## OpenRouter
Access 100+ models through a single API with OpenRouter.
### Setup

```bash
# Set API key
export OPENROUTER_API_KEY=sk-or-...
```

### Usage

```typescript
import { FlowEngine, OpenRouterAdapter } from "@andresaya/flowkit";

const adapter = new OpenRouterAdapter({
  apiKey: process.env.OPENROUTER_API_KEY!,
  model: "anthropic/claude-3-haiku",
});

const engine = new FlowEngine(flow, { llm: adapter, storage });
```

### Configuration

```typescript
interface OpenRouterConfig {
  apiKey: string;       // API key (required)
  model?: string;       // Model ID, default "openai/gpt-4o-mini"
  appName?: string;     // Your app name for rankings
  siteUrl?: string;     // Your site URL for rankings
  temperature?: number; // Model-specific range
  timeout?: number;     // Request timeout in ms
  streaming?: boolean;  // Enable streaming
}
```

### Popular Models
| Model ID | Provider | Cost |
|---|---|---|
| openai/gpt-4o-mini | OpenAI | $ |
| openai/gpt-4o | OpenAI | $$$ |
| anthropic/claude-3-haiku | Anthropic | $ |
| anthropic/claude-3-sonnet | Anthropic | $$ |
| anthropic/claude-3-opus | Anthropic | $$$ |
| google/gemini-pro | Google | $ |
| meta-llama/llama-3-70b-instruct | Meta | $$ |
| mistralai/mistral-large | Mistral | $$ |
See the OpenRouter models page for the full list.
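OpenRouter model IDs are namespaced as `provider/model`, which is handy for things like per-provider logging or cost buckets. A small helper for splitting them (our own convention, not part of FlowKit):

```typescript
// Split an OpenRouter model ID ("provider/model") into its two parts.
function parseModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`not a namespaced OpenRouter model ID: ${id}`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

For example, `parseModelId("anthropic/claude-3-haiku")` yields `{ provider: "anthropic", model: "claude-3-haiku" }`.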
## Creating Custom Adapters

Implement the `LLMAdapter` interface:

```typescript
import type { LLMAdapter, LLMResponse, JsonValue, StreamHandler, ToolDefinition } from "@andresaya/flowkit";

class CustomAdapter implements LLMAdapter {
  readonly supportsStreaming = true;
  readonly supportsTools = true;

  async chat(args: {
    systemPrompt: string;
    messages: Array<{ role: "user" | "assistant"; content: string }>;
    userMessage: string;
    responseSchema: JsonValue;
    tools?: ToolDefinition[];
  }): Promise<LLMResponse> {
    // Call your provider and return a structured response
    return {
      understood: true,
      extracted: null,
      confidence: 1,
      response: "OK",
      nextStep: null,
      actions: [],
    };
  }

  async chatStream(
    args: {
      systemPrompt: string;
      messages: Array<{ role: "user" | "assistant"; content: string }>;
      userMessage: string;
      responseSchema: JsonValue;
      tools?: ToolDefinition[];
    },
    onChunk: StreamHandler
  ): Promise<LLMResponse> {
    onChunk({ type: "text", content: "Hello" });
    onChunk({ type: "done" });
    return {
      understood: true,
      extracted: null,
      confidence: 1,
      response: "Hello",
      nextStep: null,
      actions: [],
    };
  }
}
```

## Environment Variables

```bash
# .env file
# OpenAI
OPENAI_API_KEY=sk-...
# OpenRouter
OPENROUTER_API_KEY=sk-or-...
# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
# Google Gemini
GOOGLE_AI_API_KEY=...
# Groq
GROQ_API_KEY=gsk_...
# Ollama (optional, defaults to localhost)
OLLAMA_HOST=http://localhost:11434
```

## Choosing a Provider
| Use Case | Recommended |
|---|---|
| Development/Testing | Ollama (free, local) |
| Production (budget) | OpenAI gpt-4o-mini |
| Production (quality) | OpenAI gpt-4o or Claude |
| Model experimentation | OpenRouter |
| Privacy-sensitive | Ollama (local) |
| Streaming UX | OpenAI or OpenRouter |
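The table above can also drive a simple runtime default: pick whichever provider has credentials configured and fall back to local Ollama, which needs no key. A sketch (the env var names match the setup sections above; the precedence order and default models are just one reasonable choice, not a FlowKit API):

```typescript
interface ProviderChoice {
  provider: "openai" | "anthropic" | "groq" | "openrouter" | "ollama";
  model: string;
}

// Prefer hosted providers when a key is present; otherwise fall back to
// local Ollama.
function chooseProvider(
  env: Record<string, string | undefined>
): ProviderChoice {
  if (env.OPENAI_API_KEY) return { provider: "openai", model: "gpt-4o-mini" };
  if (env.ANTHROPIC_API_KEY)
    return { provider: "anthropic", model: "claude-3-5-sonnet-20241022" };
  if (env.GROQ_API_KEY)
    return { provider: "groq", model: "llama-3.3-70b-versatile" };
  if (env.OPENROUTER_API_KEY)
    return { provider: "openrouter", model: "openai/gpt-4o-mini" };
  return { provider: "ollama", model: "qwen3:4b" };
}

// Typical usage: chooseProvider(process.env), then construct the
// matching adapter from the sections above.
```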