Chat Completions API
OpenAI-compatible API with Konnect extensions for multi-model patterns.
/v1/chat/completions

Creates a chat completion. Fully compatible with the OpenAI API format, with Konnect extensions for multi-model patterns like Ensemble, Debate, and Council.
Authentication
All API requests require authentication via Bearer token.
Authorization: Bearer YOUR_API_KEY

Request Body
model (string, required)
Model ID to use. Use konnect-ensemble, konnect-debate, or konnect-council for multi-model patterns.
messages (array, required)
Array of messages in the conversation. Each message has a role and content.
stream (boolean, optional)
Whether to stream the response via Server-Sent Events. Defaults to false.
Konnect Extensions
konnect.pattern
Pattern type: ensemble, debate, or council.
konnect.models
Array of model IDs to use in the pattern.
konnect.aggregation
Aggregation method for ensemble: synthesis, best_of_n, consensus, or union.
konnect.personas
Array of persona configurations for each model slot.
konnect.enable_tools
Enable web search for models. Defaults to true. When enabled, models can perform web searches and timeouts adjust dynamically.
Example Requests
Standard Chat
curl https://api.konnect.ai/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "What is the capital of France?"}
],
"stream": true
}'

Ensemble Mode
curl https://api.konnect.ai/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "konnect-ensemble",
"messages": [
{"role": "user", "content": "What are the pros and cons of microservices?"}
],
"stream": true,
"konnect.pattern": "ensemble",
"konnect.models": ["gpt-4o", "claude-sonnet-4-5-20250929", "gemini-2.0-flash"],
"konnect.aggregation": "synthesis"
}'

Debate Mode
curl https://api.konnect.ai/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "konnect-debate",
"messages": [
{"role": "user", "content": "Should AI development be regulated?"}
],
"stream": true,
"konnect.pattern": "debate",
"konnect.models": ["gpt-4o", "gemini-2.0-flash", "claude-sonnet-4-5-20250929"],
"konnect.personas": [
{"personaId": "advocate", "modelId": "gpt-4o"},
{"personaId": "critic", "modelId": "gemini-2.0-flash"},
{"personaId": "impartial_judge", "modelId": "claude-sonnet-4-5-20250929"}
]
}'

The three personas correspond to Pro (advocate), Con (critic), and Judge roles. Personas are optional; defaults are used if omitted.
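Building the persona array by hand is easy to get wrong when the model list changes. A minimal sketch of a helper that pairs each persona slot with a model, in order; the field names (konnect.pattern, konnect.models, konnect.personas, personaId, modelId) follow the request example above, and the default persona IDs are the ones shown there:

```python
def debate_extra_body(models, persona_ids=("advocate", "critic", "impartial_judge")):
    """Build the Konnect extension fields for Debate mode.

    Pairs each persona slot (Pro, Con, Judge) with a model, in order.
    Field names follow the curl example above.
    """
    if len(models) != len(persona_ids):
        raise ValueError("debate needs exactly one model per persona slot")
    return {
        "konnect.pattern": "debate",
        "konnect.models": list(models),
        "konnect.personas": [
            {"personaId": pid, "modelId": mid}
            for pid, mid in zip(persona_ids, models)
        ],
    }

extra = debate_extra_body(["gpt-4o", "gemini-2.0-flash", "claude-sonnet-4-5-20250929"])
```

The resulting dict can be passed as extra_body to the OpenAI Python SDK, as shown in the SDK Compatibility section below.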
Streaming Response
When stream: true, responses are sent via Server-Sent Events (SSE).
Standard Chunk
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

Konnect Metadata Chunk
For pattern responses, Konnect includes additional metadata in konnect.metadata:
data: {
"konnect.metadata": {
"event": "model_complete",
"model_completed": "gpt-5.2",
"model_success": true,
"models_completed": 1,
"models_total": 3,
"models_successful": 1,
"models_failed": 0,
"tool_calls_made": 2
}
}

data: {
"konnect.metadata": {
"pattern_used": "debate",
"debate_event": "round",
"debate_new_round": {
"round_number": 1,
"pro_argument": "...",
"con_argument": "..."
}
}
}

Ensemble Streaming Events
model_complete
Fired when each model finishes. Includes model name, success/failure status, content (if successful), and progress count.
aggregation
Final aggregated response from all successful models, streamed in chunks.
Debate Streaming Events
setup
Opening positions from Pro and Con are ready. Includes pro_success and con_success flags.
round
A debate round completed. Contains the round number and both arguments. Pro argues first, then Con responds.
judge
The judge's verdict is ready. Contains the winner and reasoning.
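The metadata chunks above can be picked out of a raw SSE stream with a small parser. A hedged sketch, assuming each event's JSON fits on a single data: line (as in the examples above) and that the stream terminates with the OpenAI-style [DONE] sentinel:

```python
import json

def konnect_events(sse_lines):
    """Yield konnect.metadata payloads from raw SSE 'data:' lines.

    Assumes one JSON object per data line, as in the examples above.
    A 'data: [DONE]' line (OpenAI convention) ends the stream.
    """
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        meta = chunk.get("konnect.metadata")
        if meta is not None:
            yield meta

# Sample stream mixing a metadata chunk, a content chunk, and the terminator:
stream = [
    'data: {"konnect.metadata": {"event": "model_complete", "models_completed": 1, "models_total": 3}}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "data: [DONE]",
]
events = list(konnect_events(stream))  # one metadata event; content chunks are skipped
```

In a real client you would dispatch on meta["event"] (or debate_event for Debate mode) to update progress UI while content deltas render separately.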
Response Format
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1705312345,
"model": "konnect-ensemble",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Based on analysis from multiple AI models..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 450,
"total_tokens": 475
},
"konnect.metadata": {
"pattern_used": "ensemble",
"aggregation_method": "synthesis",
"models_successful": 3,
"models_failed": 0,
"model_responses": [
{
"model": "gpt-4o",
"content": "...",
"latency_ms": 1250,
"success": true,
"tool_calls_made": 2,
"tool_call_details": [
{"tool": "web_search", "query": "..."},
{"tool": "web_search", "query": "..."}
]
},
{
"model": "claude-sonnet-4-5-20250929",
"content": "...",
"latency_ms": 980,
"success": true,
"tool_calls_made": 0
},
{
"model": "gemini-2.0-flash",
"content": "...",
"latency_ms": 1100,
"success": true,
"tool_calls_made": 1
}
],
"total_latency_ms": 1250,
"estimated_cost": 0.0045
}
}

The tool_calls_made field shows how many web searches each model performed. The tool_call_details array provides specifics about each tool call.
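For logging or dashboards, the per-model details are worth condensing into a flat report. A minimal sketch, using only field names that appear in the response example above:

```python
def summarize_metadata(meta):
    """Condense a konnect.metadata block into a per-model report."""
    rows = [
        {
            "model": r["model"],
            "ok": r["success"],
            "latency_ms": r["latency_ms"],
            "searches": r.get("tool_calls_made", 0),
        }
        for r in meta.get("model_responses", [])
    ]
    return {
        "pattern": meta.get("pattern_used"),
        "successful": meta.get("models_successful", 0),
        "failed": meta.get("models_failed", 0),
        "models": rows,
    }

# Trimmed version of the example response above:
meta = {
    "pattern_used": "ensemble",
    "models_successful": 2,
    "models_failed": 0,
    "model_responses": [
        {"model": "gpt-4o", "success": True, "latency_ms": 1250, "tool_calls_made": 2},
        {"model": "gemini-2.0-flash", "success": True, "latency_ms": 1100, "tool_calls_made": 1},
    ],
}
report = summarize_metadata(meta)
```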
Available Models
Provider Models
gpt-5.2 (OpenAI)
gpt-5.2-chat-latest (OpenAI)
gpt-4o (OpenAI)
gpt-4-turbo (OpenAI)
claude-sonnet-4-5-20250929 (Anthropic)
claude-3-haiku-20240307 (Anthropic)
gemini-2.0-flash (Google)

Konnect Virtual Models
konnect-ensemble: Ensemble mode
konnect-debate: Debate mode
konnect-council: Council mode

SDK Compatibility
Konnect is compatible with OpenAI SDKs. Use the baseURL parameter to point to Konnect.
Python
from openai import OpenAI
client = OpenAI(
base_url="https://api.konnect.ai/v1",
api_key="your-konnect-api-key"
)
# Ensemble mode with Konnect extensions
response = client.chat.completions.create(
model="konnect-ensemble",
messages=[{"role": "user", "content": "Explain AI"}],
extra_body={
"konnect.pattern": "ensemble",
"konnect.models": ["gpt-4o", "claude-sonnet-4-5-20250929"],
"konnect.aggregation": "synthesis"
}
)

Node.js
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'https://api.konnect.ai/v1',
apiKey: 'your-konnect-api-key'
});
const response = await client.chat.completions.create({
model: 'konnect-debate',
messages: [{ role: 'user', content: 'Is remote work better?' }],
stream: true,
// Konnect extensions
'konnect.pattern': 'debate',
'konnect.models': ['gpt-4o', 'gemini-2.0-flash', 'claude-sonnet-4-5-20250929']
});

Graceful Degradation
Multi-model patterns are designed for reliability. If some models fail (rate limits, timeouts, errors), you still get results from the working models.
Partial Success (200 OK)
If at least one model succeeds, the API returns 200 OK with aggregated results from working models.
*Note: 1 model(s) failed (claude-sonnet-4-5). Results are from 2 successful model(s).*

All Models Failed (200 OK with error message)
If all models fail, you get a 200 OK with a detailed error message listing each model's specific error and troubleshooting suggestions.
Dynamic Timeouts
When models use web search, timeouts adjust automatically to accommodate the extra time needed for tool calls.
Base timeout: 60s. Default timeout for model responses without tool calls.
Per tool call: +20s. Additional time allocated for each web search or tool call made.
Maximum timeout: 180s. Hard limit to prevent indefinite waiting.
Example: A model making 3 web searches gets 60 + (3 × 20) = 120s timeout.
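The timeout rule above reduces to one line of arithmetic; a sketch, with the base, increment, and ceiling taken from the table:

```python
def dynamic_timeout(tool_calls, base=60, per_call=20, ceiling=180):
    """Per the rules above: base + 20s per tool call, capped at 180s."""
    return min(base + per_call * tool_calls, ceiling)

dynamic_timeout(0)   # 60  — no tool calls, base timeout
dynamic_timeout(3)   # 120 — the worked example above
dynamic_timeout(10)  # 180 — capped at the hard limit
```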
Error Handling
{
"error": {
"message": "Invalid model specified",
"type": "invalid_request_error",
"code": "model_not_found"
}
}

401 Invalid or missing API key
400 Invalid request or model not found
429 Rate limit exceeded
500 Internal server error (unrecoverable)

Note: Multi-model patterns use graceful degradation. Partial model failures return 200 OK with results from working models, not 502.
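Since the error body follows the standard OpenAI shape, clients can decide whether to retry from the status code alone. A minimal sketch of that decision, following the table above (429 is retryable, 500 is marked unrecoverable, and 4xx client errors need a request fix rather than a retry):

```python
def classify_error(status, body):
    """Turn a Konnect error response into a retry decision.

    Per the table above: only 429 (rate limit) is worth retrying;
    500 is documented as unrecoverable, and partial multi-model
    failures arrive as 200 OK, never as errors.
    """
    err = body.get("error", {})
    return {
        "retry": status == 429,
        "code": err.get("code", "unknown"),
        "message": err.get("message", ""),
    }

decision = classify_error(400, {
    "error": {
        "message": "Invalid model specified",
        "type": "invalid_request_error",
        "code": "model_not_found",
    },
})
# decision["retry"] is False; decision["code"] is "model_not_found"
```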