Quick Start

TL;DR: Point your OpenAI SDK to Igris Overture. Get automatic routing, failover, and cost optimization in 5 minutes.


Get Started in 5 Minutes

Step 1: Start Free Trial

No credit card required. All features unlocked for 14 days.

Start Free Trial →

Step 2: Get Your API Key

After signup, copy your API key from the dashboard:

Dashboard → Settings → API Keys → Copy

Your API key looks like: sk-igris-abc123...

Step 3: Make Your First Request

Using Python (OpenAI SDK):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.igrisinertial.com/v1",
    api_key="sk-igris-YOUR_KEY"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
print(f"Cost: ${response.metadata['cost_usd']:.4f}")
print(f"Provider: {response.metadata['provider']}")

Using Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.igrisinertial.com/v1',
  apiKey: 'sk-igris-YOUR_KEY'
});

const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});

console.log(response.choices[0].message.content);
console.log(`Cost: $${response.metadata.cost_usd.toFixed(4)}`);
console.log(`Provider: ${response.metadata.provider}`);

Using cURL:

curl -X POST https://api.igrisinertial.com/v1/infer \
  -H "Authorization: Bearer sk-igris-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Explain Thompson Sampling in one sentence"}
    ]
  }'

Step 4: View in Dashboard

Check your dashboard to see:

  • Real-time cost tracking
  • Routing decisions (which provider was used)
  • Latency metrics
  • Token usage

View Dashboard →


What You Get in the Response

Every request includes metadata about routing:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Thompson Sampling is a Bayesian algorithm that balances exploration and exploitation..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 28,
    "total_tokens": 40
  },
  "metadata": {
    "provider": "anthropic",              // Which provider handled the request
    "routing_decision": "thompson-sampling", // How the decision was made
    "cost_usd": 0.00034,                  // Actual cost in USD
    "latency_ms": 187,                    // Request latency
    "trace_id": "550e8400-e29b..."        // Distributed trace ID
  }
}
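Because every response carries `cost_usd` and `provider`, you can aggregate spend and routing behavior yourself without waiting for the dashboard. The sketch below assumes only the metadata shape shown above; `summarize` is a hypothetical helper, not part of any SDK.

```python
def summarize(metadatas):
    """Aggregate a list of per-request metadata dicts (shape as shown above)."""
    total_cost = sum(m["cost_usd"] for m in metadatas)
    by_provider = {}
    for m in metadatas:
        by_provider[m["provider"]] = by_provider.get(m["provider"], 0) + 1
    return {
        "total_cost_usd": round(total_cost, 6),   # round away float noise
        "requests_by_provider": by_provider,
    }
```

Feed it the `metadata` dict from each response (e.g. `response.metadata` in the Python example above) to get a running cost total and a per-provider request count.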

Advanced Features

Cost Optimization

Request the cheapest provider that meets quality standards:

curl -X POST https://api.igrisinertial.com/v1/infer \
  -H "Authorization: Bearer sk-igris-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Simple question"}],
    "optimization": "cost"
  }'
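If you are calling through the OpenAI Python SDK rather than cURL, non-standard fields like `optimization` can be passed with the SDK's `extra_body` parameter. The helper below is a hypothetical sketch: it just validates the value ("cost" or "latency", matching the two documented modes) and builds the extra fields.

```python
def igris_extra_body(optimization):
    """Build the extra request fields Igris Overture accepts alongside
    the standard OpenAI arguments. "cost" and "latency" are the two
    documented optimization modes."""
    allowed = {"cost", "latency"}
    if optimization not in allowed:
        raise ValueError(f"unsupported optimization: {optimization!r}")
    return {"optimization": optimization}

# Usage with the OpenAI SDK (requires a live key):
# client.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": "Simple question"}],
#     extra_body=igris_extra_body("cost"),
# )
```

The same pattern works for the latency mode below: pass `igris_extra_body("latency")` instead.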

Latency Optimization

Request the fastest provider:

curl -X POST https://api.igrisinertial.com/v1/infer \
  -H "Authorization: Bearer sk-igris-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Urgent question"}],
    "optimization": "latency"
  }'

Speculative Execution (Growth+ Tier)

Race multiple providers for 60% faster time-to-first-token:

curl -X POST https://api.igrisinertial.com/v1/infer \
  -H "Authorization: Bearer sk-igris-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Fast response needed"}],
    "speculative_mode": "latency"
  }'
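To see whether speculative execution actually improves your time-to-first-token, measure it on a streamed request. The helper below is a hypothetical sketch that times the arrival of the first chunk from any iterable of text chunks; the 60% figure above is the vendor's claim, not something this code asserts.

```python
import time

def first_token_latency(chunks):
    """Return (seconds until the first chunk, full joined text) for an
    iterable of text chunks. Returns (None, "") for an empty stream."""
    start = time.monotonic()
    first = None
    parts = []
    for text in chunks:
        if first is None:
            first = time.monotonic() - start
        parts.append(text)
    return first, "".join(parts)

# Usage against a streamed request (requires a live key; speculative_mode
# passed via the SDK's extra_body, mirroring the cURL example above):
# stream = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": "Fast response needed"}],
#     stream=True,
#     extra_body={"speculative_mode": "latency"},
# )
# ttft, text = first_token_latency(
#     c.choices[0].delta.content for c in stream if c.choices[0].delta.content
# )
```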

Streaming Responses

Get tokens as they're generated:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.igrisinertial.com/v1",
    api_key="sk-igris-YOUR_KEY"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='', flush=True)

Next Steps

1. Configure Providers

Add your provider API keys with BYOK: Providers & Keys →

2. Set Up Routing Policies

Configure Thompson Sampling, semantic routing, or custom policies: Routing Policies →

3. Enable Multi-Tenancy

Set up tenant isolation and per-tenant budgets: Multi-Tenancy →

4. Monitor & Optimize

Set up dashboards and alerts: Observability →


Get Help