Ensemble

Pro

Query multiple AI models simultaneously and synthesize their responses for better accuracy.

Overview

Ensemble mode is the core feature of Konnect.ai. It sends your query to multiple AI models in parallel and combines their responses using your chosen aggregation method. This approach reduces hallucinations (Konnect.ai cites roughly 40% fewer) and provides more reliable answers. By default, queries use GPT-5.2 and Claude Sonnet 4.5 with synthesis aggregation.

Models per query: 2-5
Aggregation methods: 4
View modes: 2
Fewer hallucinations: ~40%

How It Works

1. Parallel Queries

Your question is sent to all selected models simultaneously (e.g., GPT-5.2, Claude Sonnet 4.5, Gemini Pro).

2. Progressive Streaming

Results stream as each model completes: each model's response appears as soon as it is ready, so you see progress immediately rather than waiting for all models to finish.

3. Graceful Degradation

If some models fail (rate limits, errors), you still get results from the working models. Failed models are noted in the response metadata. The request fails completely only if every model fails.

4. Aggregation

Successful responses are combined using your chosen method: synthesis, best-of-n, consensus, or union.

5. Final Response

You receive the aggregated response with full transparency: which models succeeded, which failed, and why.
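
The flow above can be sketched locally in Python. This is only an illustration of the pattern (parallel calls, then partial results when some calls fail), not Konnect's implementation: call_model and run_ensemble are hypothetical names, and the model calls are simulated with short sleeps.

```python
import asyncio


async def call_model(name: str, delay: float, fail: bool = False) -> str:
    """Hypothetical stand-in for a real model call (not part of the Konnect API)."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError("rate limit exceeded")
    return f"answer from {name}"


async def run_ensemble(calls: dict) -> dict:
    """Run every call in parallel and separate successes from failures."""
    names = list(calls)
    # return_exceptions=True stops one failing model from aborting the others.
    results = await asyncio.gather(*calls.values(), return_exceptions=True)
    succeeded = {n: r for n, r in zip(names, results) if not isinstance(r, Exception)}
    failed = {n: str(r) for n, r in zip(names, results) if isinstance(r, Exception)}
    if not succeeded:
        # Complete failure only when no model produced a response.
        raise RuntimeError(f"all models failed: {failed}")
    return {"succeeded": succeeded, "failed": failed}


result = asyncio.run(run_ensemble({
    "gpt-5.2": call_model("gpt-5.2", 0.01, fail=True),
    "claude-sonnet-4.5": call_model("claude-sonnet-4.5", 0.02),
    "gemini-pro": call_model("gemini-pro", 0.01),
}))
print(sorted(result["succeeded"]), result["failed"])
```

Here one simulated model fails, and the other two still contribute to the result that would be passed on to aggregation.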

Using the Interface

Selecting Models

Choose which AI models to query using the model slots below the chat input. By default, Ensemble uses GPT-5.2 and Claude Sonnet 4.5.

Add/Remove Models

Click the + button to add models (up to 5), or click × to remove one. Adding models gives you more perspectives but increases cost.

Change Model

Click the model badge (e.g., "5.2") to switch between GPT-5.2, GPT-4o, Claude Sonnet, Claude Haiku, Gemini Flash, etc.

Choosing Aggregation

Select how responses should be combined using the aggregation dropdown (gear icon). This choice is locked after you send your first message.

Tip: Start a new chat to change the aggregation method. The dropdown shows "Start new chat to change aggregation" when locked.

Viewing Results

Ensemble responses can be viewed in two layouts:

Tabs View (Default)

First tab shows the synthesized result. Additional tabs show each model's individual response.

Side-by-Side View

All model responses displayed in a grid for easy comparison. Each card shows model name, latency, and response.

Consensus Indicators

The UI automatically calculates how closely the models agree and shows visual indicators:

High Consensus (≥80%)

Green badge appears with share options. Models strongly agree, which indicates high confidence in the answer.

Low Consensus (<50%)

Orange/red banner highlights key disagreements. Review individual responses to understand different perspectives.
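
As a rough sketch, the two thresholds above map an agreement score to an indicator as follows. The function name consensus_indicator and the 0.0 to 1.0 score scale are assumptions for illustration. How the UI actually computes the agreement score is not described here, and neither is what it shows between 50% and 80%, so that range returns "none" in this sketch.

```python
def consensus_indicator(agreement: float) -> str:
    """Map an agreement score in [0.0, 1.0] to the indicator described above."""
    if not 0.0 <= agreement <= 1.0:
        raise ValueError("agreement must be between 0.0 and 1.0")
    if agreement >= 0.80:
        return "high"  # green badge with share options
    if agreement < 0.50:
        return "low"   # orange/red banner highlighting disagreements
    # Assumption: no indicator is documented for the 50% to 80% range.
    return "none"


print(consensus_indicator(0.85), consensus_indicator(0.42), consensus_indicator(0.65))
```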

Graceful Degradation

Ensemble mode is designed for reliability. If one or more models encounter errors, you still get useful results.

Partial Success

If 2 of 3 models succeed, you get aggregated results from the working models with a note about which model failed.

*Note: 1 model(s) failed (gpt-5.2). Results are from 2 successful model(s).*

All Models Failed

If all models fail, you get a detailed error message with each model's specific error and troubleshooting suggestions (check API keys, billing status, rate limits).
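
A minimal sketch of the two outcomes above, assuming the per-model results have already been split into successes and failures. The helper name degradation_note is hypothetical. The note wording follows the example shown under Partial Success, while the all-failed error text is only a placeholder.

```python
def degradation_note(succeeded: list, failed: list) -> str:
    """Return the partial-success note, or raise when every model failed."""
    if not succeeded:
        # Placeholder message: the real error lists each model's specific error.
        raise RuntimeError(f"All models failed: {', '.join(failed)}")
    if not failed:
        return ""  # nothing to report when every model succeeded
    return (
        f"*Note: {len(failed)} model(s) failed ({', '.join(failed)}). "
        f"Results are from {len(succeeded)} successful model(s).*"
    )


print(degradation_note(["claude-sonnet-4.5", "gemini-pro"], ["gpt-5.2"]))
```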

Aggregation Methods

Synthesis (Recommended)

An AI synthesizer creates a unified response combining the best elements from all model responses. This produces the most coherent and comprehensive answer.

aggregation: "synthesis"

Best-of-N

All responses are evaluated for quality, accuracy, and relevance. The highest-scoring response is returned as the final answer.

aggregation: "best_of_n"

Consensus

Highlights points where models agree and flags disagreements. Perfect for fact-checking and identifying potential hallucinations.

aggregation: "consensus"

Union

Combines all unique points from every response, providing the most comprehensive coverage of the topic.

aggregation: "union"
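
To make the difference between the methods concrete, here is a toy sketch that treats each model response as a list of short points. This is a deliberate simplification and an assumption: the real service works on free-form text and uses an AI evaluator, so the functions and the scoring callback below are hypothetical. Synthesis is omitted because it requires a model to write the unified answer.

```python
# Toy responses: each model's answer reduced to a list of points.
responses = {
    "model_a": ["point 1", "point 2"],
    "model_b": ["point 1", "point 3"],
    "model_c": ["point 1", "point 2"],
}


def union(responses: dict) -> list:
    """All unique points from every response, in first-seen order."""
    seen = []
    for points in responses.values():
        for point in points:
            if point not in seen:
                seen.append(point)
    return seen


def consensus(responses: dict) -> dict:
    """Split points into those every model gave and those only some gave."""
    all_points = union(responses)
    agreed = [p for p in all_points if all(p in pts for pts in responses.values())]
    disputed = [p for p in all_points if p not in agreed]
    return {"agreed": agreed, "disputed": disputed}


def best_of_n(responses: dict, score) -> str:
    """Name of the model whose response gets the highest score."""
    return max(responses, key=lambda name: score(responses[name]))


print(union(responses))
print(consensus(responses))
```

In this toy data, "point 1" is the only point all three models agree on, so consensus flags the other two as disputed while union keeps all three.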

When to Use Ensemble

Important decisions that need verification
Factual questions where accuracy is critical
Technical questions with multiple valid approaches
Research tasks requiring comprehensive coverage
Queries where you want to reduce AI hallucinations

Streaming Events

Ensemble mode streams progress events via Server-Sent Events (SSE) as each model completes:

model_complete

Fired when each model finishes. Includes model name, success/failure status, content (if successful), and progress (e.g., "2/3 models completed").

aggregation

Final aggregated response from all successful models, streamed in chunks.

Example streaming metadata
{
  "konnect.metadata": {
    "event": "model_complete",
    "model_completed": "gpt-5.2",
    "model_success": true,
    "models_completed": 1,
    "models_total": 3,
    "models_successful": 1,
    "models_failed": 0
  }
}
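
A client could turn this metadata into a progress line along these lines. The sketch assumes the JSON object has already been extracted from the SSE stream, since the exact wire framing is not shown here. progress_line is a hypothetical helper name.

```python
import json

# The example metadata object from above, as a raw JSON string.
raw_event = """
{
  "konnect.metadata": {
    "event": "model_complete",
    "model_completed": "gpt-5.2",
    "model_success": true,
    "models_completed": 1,
    "models_total": 3,
    "models_successful": 1,
    "models_failed": 0
  }
}
"""


def progress_line(raw: str) -> str:
    """Format a model_complete event as a short progress message."""
    meta = json.loads(raw)["konnect.metadata"]
    status = "ok" if meta["model_success"] else "failed"
    return (
        f'{meta["model_completed"]} {status}: '
        f'{meta["models_completed"]}/{meta["models_total"]} models completed'
    )


print(progress_line(raw_event))
```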

API Usage

Konnect uses an OpenAI-compatible API with extensions for ensemble mode.

cURL
curl https://api.konnect.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "konnect-ensemble",
    "messages": [
      {"role": "user", "content": "What are the health benefits of intermittent fasting?"}
    ],
    "stream": true,
    "konnect.pattern": "ensemble",
    "konnect.models": ["gpt-5.2", "claude-sonnet-4-5-20250929", "gemini-2.0-flash"],
    "konnect.aggregation": "synthesis"
  }'

Results stream progressively as each model completes. The final response includes metadata about which models succeeded or failed.
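
The same request can be assembled in Python with only the standard library. This is a sketch that mirrors the cURL example: build_ensemble_request is a hypothetical helper, and its checks (2 to 5 models, one of the four aggregation values) are client-side assumptions based on the limits stated in this document, not confirmed server behavior. The block only builds the request. Sending it with urllib.request.urlopen and reading the streamed lines is left out.

```python
import json
import urllib.request

API_URL = "https://api.konnect.ai/v1/chat/completions"
AGGREGATIONS = {"synthesis", "best_of_n", "consensus", "union"}


def build_ensemble_request(api_key, question, models, aggregation="synthesis"):
    """Build (but do not send) an ensemble chat completion request."""
    if not 2 <= len(models) <= 5:
        raise ValueError("ensemble mode expects 2 to 5 models")
    if aggregation not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation method: {aggregation}")
    payload = {
        "model": "konnect-ensemble",
        "messages": [{"role": "user", "content": question}],
        "stream": True,
        "konnect.pattern": "ensemble",
        "konnect.models": list(models),
        "konnect.aggregation": aggregation,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_ensemble_request(
    "YOUR_API_KEY",
    "What are the health benefits of intermittent fasting?",
    ["gpt-5.2", "claude-sonnet-4-5-20250929", "gemini-2.0-flash"],
)
print(req.get_method(), req.full_url)
```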
