Testing Utilities

FlowKit provides testing utilities to unit test your conversational flows without needing a real LLM.

Overview

The testing module includes:

  • MockLLMAdapter - Simulate LLM responses
  • MockStorage - In-memory storage with tracking
  • FlowTester - Scenario-based test runner
  • validateFlow() - Static flow validation

MockLLMAdapter

Simulate LLM responses for predictable tests.

typescript
import { MockLLMAdapter } from "@andresaya/flowkit";

const mockLLM = new MockLLMAdapter();

// Set response for specific steps
mockLLM.onStep("get_name", {
    extracted: "John Doe",
    message: "Nice to meet you, John!",
});

mockLLM.onStep("confirm", {
    extracted: "yes",
    message: "Great! Let me process that.",
});

// Set default response for unmatched steps
mockLLM.setDefault({
    message: "I understand.",
    extracted: true,
});

Dynamic Responses

typescript
// Respond based on user input
mockLLM.onStep("get_name", (userMessage, stepId, slots) => {
    const name = userMessage.match(/my name is (\w+)/i)?.[1] || "Unknown";
    return {
        extracted: name,
        message: `Hello ${name}!`,
    };
});

// Simulate failures
mockLLM.onStep("get_email", {
    fail: true,
    message: "I didn't catch that. Could you repeat your email?",
});

// Add delays for timing tests
mockLLM.onStep("slow_step", {
    extracted: "value",
    delay: 2000,  // 2 second delay
});
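
Because a dynamic response is an ordinary function, its extraction logic can be pulled out and checked on its own before it is registered. The following is a minimal sketch that assumes the `(userMessage, stepId, slots)` callback signature shown above. `nameHandler` and the `MockResponse` type are illustrative names, not FlowKit exports, and the exact parameter types FlowKit expects may differ.

```typescript
// Illustrative response shape; mirrors the fields used in the examples above.
type MockResponse = { extracted?: unknown; message?: string };

// Hypothetical standalone handler with the same parameters as the
// onStep callback above: (userMessage, stepId, slots).
function nameHandler(
    userMessage: string,
    _stepId: string,
    _slots: Record<string, unknown>,
): MockResponse {
    // Fall back to "Unknown" when the phrase "my name is ..." is absent.
    const name = userMessage.match(/my name is (\w+)/i)?.[1] ?? "Unknown";
    return { extracted: name, message: `Hello ${name}!` };
}

// The handler can be exercised without any mock or flow.
const reply = nameHandler("Hi, my name is Alice", "get_name", {});
console.log(reply.message); // Hello Alice!
```

Once the function behaves as expected, it can be passed to `mockLLM.onStep("get_name", ...)` in place of the inline arrow function.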

Inspecting Calls

typescript
// Check if step was called
expect(mockLLM.wasStepCalled("get_name")).toBe(true);

// Get call count
expect(mockLLM.getCallCount("confirm")).toBe(1);

// Get full call history
const history = mockLLM.getHistory();
console.log(history);
// [
//   { step: "greeting", userMessage: "hi", timestamp: 1234567890 },
//   { step: "get_name", userMessage: "John", timestamp: 1234567900 },
// ]

// Clear history between tests
mockLLM.clearHistory();
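
The call history is plain data, so small helpers can turn it into something easier to assert on. This sketch assumes each entry has the `step`, `userMessage`, and `timestamp` fields shown in the example output above. `CallRecord` and `stepPath` are hypothetical names introduced for illustration.

```typescript
// Assumed shape of one history entry, taken from the example output above.
interface CallRecord {
    step: string;
    userMessage: string;
    timestamp: number;
}

// Hypothetical helper: returns the order in which steps were called,
// collapsing consecutive repeats (for example, retries) into one entry.
function stepPath(history: CallRecord[]): string[] {
    const path: string[] = [];
    for (const call of history) {
        if (path[path.length - 1] !== call.step) {
            path.push(call.step);
        }
    }
    return path;
}

const sampleHistory: CallRecord[] = [
    { step: "greeting", userMessage: "hi", timestamp: 1 },
    { step: "get_email", userMessage: "oops", timestamp: 2 },
    { step: "get_email", userMessage: "a@b.co", timestamp: 3 },
];
console.log(stepPath(sampleHistory).join(" -> ")); // greeting -> get_email
```

In a test, the same helper could be applied to `mockLLM.getHistory()` to compare the visited steps against an expected path in a single assertion.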

MockStorage

In-memory storage that tracks all operations.

typescript
import { MockStorage } from "@andresaya/flowkit";

const mockStorage = new MockStorage();

// Pre-set state for a conversation
mockStorage.setState("user-123", {
    currentStep: "confirm",
    slots: { name: "John", email: "john@example.com" },
    history: [],
    ended: false,
});

// Inspect operations after test
const ops = mockStorage.getOperations();
console.log(ops);
// [
//   { type: "load", conversationId: "user-123", timestamp: 1234567890 },
//   { type: "save", conversationId: "user-123", timestamp: 1234567900 },
// ]

// Clear between tests
mockStorage.clear();
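
The tracked operations can be summarized the same way. The sketch below assumes each operation has the `type`, `conversationId`, and `timestamp` fields shown in the example output above. `StorageOp` and `countOps` are hypothetical names, and operation types other than `load` and `save` may exist.

```typescript
// Assumed shape of one tracked operation, based on the example output above.
interface StorageOp {
    type: string; // "load" and "save" appear above; other values may exist
    conversationId: string;
    timestamp: number;
}

// Hypothetical helper: counts operations of each type per conversation.
function countOps(ops: StorageOp[]): Map<string, Map<string, number>> {
    const counts = new Map<string, Map<string, number>>();
    for (const op of ops) {
        const perType = counts.get(op.conversationId) ?? new Map<string, number>();
        perType.set(op.type, (perType.get(op.type) ?? 0) + 1);
        counts.set(op.conversationId, perType);
    }
    return counts;
}

const sampleOps: StorageOp[] = [
    { type: "load", conversationId: "user-123", timestamp: 1 },
    { type: "save", conversationId: "user-123", timestamp: 2 },
    { type: "save", conversationId: "user-123", timestamp: 3 },
];
console.log(countOps(sampleOps).get("user-123")?.get("save")); // 2
```

A test could then assert, for instance, that a single turn produced exactly one `load` and one `save` for the conversation under test.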

FlowTester

Run scenario-based tests against your flows.

typescript
import { FlowTester } from "@andresaya/flowkit";

// Create test runner
const runner = new FlowTester(myFlow);

// Configure mock responses
runner.mock()
    .onStep("greeting", { message: "Welcome!", extracted: true })
    .onStep("get_name", { extracted: "Alice", message: "Hi Alice!" })
    .onStep("confirm", { extracted: "yes", message: "Confirmed!" });

Running Test Scenarios

typescript
const result = await runner.runScenario({
    name: "Happy path test",
    
    // Initial state (optional)
    initialSlots: {
        source: "web",
    },
    
    // Test steps
    steps: [
        {
            say: "Hello",
            expectStep: "get_name",
            expectResponse: /welcome|hello/i,
        },
        {
            say: "My name is Alice",
            expectStep: "confirm",
            expectSlots: { user_name: "Alice" },
        },
        {
            say: "Yes, confirmed",
            expectStep: "done",
            expectDone: true,
        },
    ],
    
    // Final assertions
    expectFinalSlots: {
        user_name: "Alice",
        confirmed: "yes",
    },
});

console.log(result);
// {
//   passed: true,
//   errors: [],
//   results: [ ... test results ... ],
// }

Custom Assertions

typescript
const result = await runner.runScenario({
    name: "Custom validation test",
    steps: [
        {
            say: "I want to order",
            assert: async (result) => {
                // Custom assertion logic
                if (!result.slots.order_started) {
                    throw new Error("Order should be started");
                }
                if (result.message.length < 10) {
                    throw new Error("Response too short");
                }
            },
        },
    ],
});
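
When several steps need the same check, the inline logic can be moved into a reusable factory. This is a sketch only: it assumes the value passed to `assert` exposes `slots` and `message` as in the example above, and `TurnResult`, `missingSlots`, and `requireSlots` are hypothetical names rather than FlowKit exports.

```typescript
// Assumed minimal shape of the value passed to `assert`, based on the
// fields used in the example above.
interface TurnResult {
    slots: Record<string, unknown>;
    message: string;
}

// Hypothetical helper: lists which of the required slots are missing or falsy.
function missingSlots(result: TurnResult, required: string[]): string[] {
    return required.filter((name) => !result.slots[name]);
}

// Hypothetical factory: builds an `assert` callback that throws when any
// required slot is missing, mirroring the inline check above.
function requireSlots(...required: string[]) {
    return async (result: TurnResult): Promise<void> => {
        const missing = missingSlots(result, required);
        if (missing.length > 0) {
            throw new Error(`Missing slots: ${missing.join(", ")}`);
        }
    };
}

const check = missingSlots(
    { slots: { order_started: true }, message: "Order started." },
    ["order_started", "total"],
);
console.log(check.join(", ")); // total
```

A scenario step could then read `{ say: "I want to order", assert: requireSlots("order_started") }`, which keeps each step on one line.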

Quick Simulation

For simple tests without full scenarios:

typescript
const results = await runner.simulate([
    "Hello",
    "My name is Bob",
    "Yes please",
    "Thanks!",
]);

console.log(results);
// [
//   { message: "Welcome!", step: "greeting", slots: {}, done: false },
//   { message: "Hi Bob!", step: "confirm", slots: { name: "Bob" }, done: false },
//   { message: "Great!", step: "done", slots: { name: "Bob", confirmed: "yes" }, done: true },
// ]
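
Since `simulate()` returns one result per turn, the results can be post-processed to see when each slot was filled. The sketch below assumes the result shape shown in the example output above. `SimResult` and `newSlotsPerTurn` are hypothetical names used for illustration.

```typescript
// Assumed shape of one simulate() result, taken from the example output above.
interface SimResult {
    message: string;
    step: string;
    slots: Record<string, unknown>;
    done: boolean;
}

// Hypothetical helper: for each turn, lists the slot names that were not
// present on the previous turn.
function newSlotsPerTurn(results: SimResult[]): string[][] {
    let previous: Record<string, unknown> = {};
    return results.map((result) => {
        const added = Object.keys(result.slots).filter(
            (key) => !Object.prototype.hasOwnProperty.call(previous, key),
        );
        previous = result.slots;
        return added;
    });
}

const sampleResults: SimResult[] = [
    { message: "Welcome!", step: "greeting", slots: {}, done: false },
    { message: "Hi Bob!", step: "confirm", slots: { name: "Bob" }, done: false },
    { message: "Great!", step: "done", slots: { name: "Bob", confirmed: "yes" }, done: true },
];
console.log(JSON.stringify(newSlotsPerTurn(sampleResults))); // [[],["name"],["confirmed"]]
```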

Flow Validation

Validate flow configuration without running it.

typescript
import { validateFlow } from "@andresaya/flowkit";

const errors = validateFlow(myFlow);

if (errors.length > 0) {
    console.log("Flow validation errors:");
    for (const error of errors) {
        console.log(`  [${error.type}] ${error.message}`);
        if (error.step) {
            console.log(`    at step: ${error.step}`);
        }
    }
}
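
In a test suite it is often convenient to fail on errors while still allowing warnings. The following sketch assumes each validation result has the `type`, `message`, and optional `step` fields used above, with `type` being `"error"` or `"warning"` as in this guide's examples. `FlowIssue`, `splitIssues`, and `assertNoErrors` are hypothetical helpers, not part of FlowKit.

```typescript
// Assumed shape of one validation result, based on the fields used above.
interface FlowIssue {
    type: string; // "error" and "warning" appear in this guide's examples
    message: string;
    step?: string;
}

// Hypothetical helper: separates blocking errors from everything else.
function splitIssues(issues: FlowIssue[]): { errors: FlowIssue[]; warnings: FlowIssue[] } {
    return {
        errors: issues.filter((issue) => issue.type === "error"),
        warnings: issues.filter((issue) => issue.type !== "error"),
    };
}

// Hypothetical helper: throws only when at least one blocking error exists.
function assertNoErrors(issues: FlowIssue[]): void {
    const { errors } = splitIssues(issues);
    if (errors.length > 0) {
        const lines = errors.map((issue) => `[${issue.type}] ${issue.message}`);
        throw new Error(`Flow validation failed:\n${lines.join("\n")}`);
    }
}

// Warnings alone do not throw.
assertNoErrors([{ type: "warning", message: "Step 'old_promo' is not reachable", step: "old_promo" }]);
```

Calling `assertNoErrors(validateFlow(myFlow))` at the top of a test file would then stop the suite early when the flow itself is misconfigured.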

Validation Checks

  • Missing steps: References to non-existent steps
  • Orphan steps: Steps not reachable from the start
  • Missing slots: Steps that extract a value but define no target slot (warning)
  • Dead ends: Steps with no next, branches, or end

typescript
// Example output
// [
//   { type: "error", message: "Step 'checkout' references missing step 'payment'" },
//   { type: "warning", message: "Step 'old_promo' is not reachable", step: "old_promo" },
//   { type: "error", message: "Branch 'other' references missing step 'unknown'" },
// ]
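
To make the graph-based checks above concrete, here is a minimal sketch of how missing-step, dead-end, and orphan detection could work on a deliberately simplified flow shape. `SimpleFlow`, `SimpleStep`, and `checkFlow` are hypothetical and exist only for this illustration. FlowKit's real flow definition and the internals of `validateFlow()` are not shown in this guide and will differ, and the severity assigned to each check here is an assumption.

```typescript
// Hypothetical, simplified flow shape for illustration only.
interface SimpleStep {
    next?: string;
    branches?: Record<string, string>; // branch name -> target step id
    end?: boolean;
}
interface SimpleFlow {
    start: string;
    steps: Record<string, SimpleStep>;
}
interface Issue {
    type: "error" | "warning";
    message: string;
    step?: string;
}

// Collects every step id that a step can transition to.
function targetsOf(step: SimpleStep): string[] {
    const branches = step.branches ?? {};
    const targets = Object.keys(branches).map((name) => branches[name] as string);
    if (step.next !== undefined) targets.unshift(step.next);
    return targets;
}

function checkFlow(flow: SimpleFlow): Issue[] {
    const issues: Issue[] = [];
    const defined = (id: string) => Object.prototype.hasOwnProperty.call(flow.steps, id);

    for (const id of Object.keys(flow.steps)) {
        const targets = targetsOf(flow.steps[id] as SimpleStep);
        // Dead end: no outgoing transition and not marked as an end step.
        if (targets.length === 0 && !(flow.steps[id] as SimpleStep).end) {
            issues.push({ type: "error", message: `Step '${id}' is a dead end`, step: id });
        }
        // Missing step: a transition points at an id that is not defined.
        for (const target of targets) {
            if (!defined(target)) {
                issues.push({ type: "error", message: `Step '${id}' references missing step '${target}'`, step: id });
            }
        }
    }

    // Orphan steps: depth-first walk from the start, then report the rest.
    const seen = new Set<string>();
    const stack: string[] = [flow.start];
    while (stack.length > 0) {
        const id = stack.pop() as string;
        if (seen.has(id) || !defined(id)) continue;
        seen.add(id);
        stack.push(...targetsOf(flow.steps[id] as SimpleStep));
    }
    for (const id of Object.keys(flow.steps)) {
        if (!seen.has(id)) {
            issues.push({ type: "warning", message: `Step '${id}' is not reachable`, step: id });
        }
    }
    return issues;
}

const demo = checkFlow({
    start: "start",
    steps: {
        start: { next: "checkout" },
        checkout: { branches: { pay: "payment", cancel: "done" } },
        done: { end: true },
        old_promo: { end: true },
    },
});
console.log(demo.map((issue) => issue.message).join("\n"));
```

For the sample flow, the sketch reports that `checkout` references the undefined step `payment` and that `old_promo` cannot be reached from `start`.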

Integration with Test Frameworks

Jest

typescript
import { FlowTester } from "@andresaya/flowkit";
import { myFlow } from "../src/flows/my-flow";

describe("MyFlow", () => {
    let runner: FlowTester;

    beforeEach(() => {
        runner = new FlowTester(myFlow);
        runner.mock()
            .onStep("greeting", { message: "Welcome!", extracted: true })
            .onStep("get_name", { extracted: "Test User" });
    });

    it("should complete happy path", async () => {
        const result = await runner.runScenario({
            name: "happy path",
            steps: [
                { say: "Hi", expectStep: "get_name" },
                { say: "Test User", expectStep: "done" },
            ],
        });

        expect(result.passed).toBe(true);
        expect(result.errors).toHaveLength(0);
    });

    it("should handle invalid input", async () => {
        runner.mock().onStep("get_email", { fail: true });

        const result = await runner.runScenario({
            name: "invalid email",
            steps: [
                { say: "Hi", expectStep: "get_name" },
                { say: "invalid", expectStep: "get_email" }, // Should retry
            ],
        });

        expect(runner.mock().getCallCount("get_email")).toBe(2);
    });
});

Vitest

typescript
import { describe, it, expect, beforeEach } from "vitest";
import { FlowTester } from "@andresaya/flowkit";

describe("OrderFlow", () => {
    // Same pattern as Jest
});

Node.js Test Runner

typescript
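import { test, describe } from "node:test";
import assert from "node:assert";
import { FlowTester } from "@andresaya/flowkit";
import { myFlow } from "../src/flows/my-flow";

describe("MyFlow", () => {
    test("should complete flow", async () => {
        const runner = new FlowTester(myFlow);
        runner.mock()
            .onStep("greeting", { message: "Welcome!", extracted: true })
            .onStep("get_name", { extracted: "Test User" });

        // Same scenario as the Jest example
        const result = await runner.runScenario({
            name: "happy path",
            steps: [
                { say: "Hi", expectStep: "get_name" },
                { say: "Test User", expectStep: "done" },
            ],
        });

        assert.strictEqual(result.passed, true);
    });
});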

Assertions

typescript
import { assertions } from "@andresaya/flowkit";

// `runner` is a FlowTester configured as in the examples above
const results = await runner.simulate(["Hi", "John", "Yes"]);

assertions.isStep(results[1], "confirm");
assertions.slotsContain(results[1].slots, { name: "John" });
assertions.isDone(results[2]);

Tips

  1. Test edge cases - Invalid input, empty responses, timeouts
  2. Test branching - Ensure all .when() paths are covered
  3. Use dynamic mocks - Test context-dependent behavior
  4. Validate first - Run validateFlow() before testing
  5. Reset between tests - Clear mock history and storage
  6. Test in isolation - Each test should be independent

Released under the MIT License.