Sage Inference

High-performance LLM inference API. OpenAI-compatible, multi-model, streaming, and multi-modal. Base URL: https://sage-api.devblocktechnologies.com

Overview

Sage Inference provides a unified API for accessing language models from DeepSeek, Anthropic, OpenAI, and more. The API is compatible with the OpenAI client SDKs, so existing integrations can be pointed at the Sage base URL with a Sage API key.

OpenAI-compatible API

Drop-in replacement for the OpenAI API. Use any existing OpenAI SDK.

Multiple Models

Access 8+ models across providers from a single endpoint.

Streaming

Server-sent events streaming for real-time token delivery.

Multi-modal

Vision, audio, and image generation capabilities.
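
As a rough illustration of the streaming capability listed above, the sketch below extracts text from a server-sent events stream. It assumes the stream follows the OpenAI chunk format (JSON payloads on "data:" lines with text under choices[].delta.content, ending with a "data: [DONE]" line); this page does not confirm that shape. The function name iter_stream_text and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines in the assumed OpenAI-style chunk format."""
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blank lines and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Assumed end-of-stream sentinel, as used by the OpenAI API.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Some chunks carry only a role or an empty delta; skip those.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration, not captured from the service.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In a real client the lines would come from the HTTP response body of a request sent with "stream": true, which is the OpenAI convention and is assumed to apply here as well.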

Quick Start

Send your first chat completion request with curl:

curl https://sage-api.devblocktechnologies.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-sage-your-api-key" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
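
The same request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl command above; the helper names build_chat_request and send are hypothetical, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "https://sage-api.devblocktechnologies.com"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST request from the curl example without sending it."""
    body = {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling send(build_chat_request("sk-sage-your-api-key", "Hello, how are you?")) performs the request. If the response follows the OpenAI shape, which is an assumption here, the reply text would be under choices[0].message.content.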


Available Models

Model ID	Description	Cost
deepseek-v4-flash	Fast chat	0.0004 credits/token
deepseek-v4-pro	Advanced reasoning/coding	0.002 credits/token
deepseek-reasoner	Deep reasoning	0.002 credits/token
claude-sonnet-4-20250514	Anthropic's most balanced	0.006 credits/token
claude-haiku-3.5	Fast Claude	0.002 credits/token
o3-mini	OpenAI reasoning	0.004 credits/token
gpt-4o	OpenAI multimodal flagship	0.01 credits/token
gpt-4o-mini	Lightweight OpenAI	0.0008 credits/token
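
To make the per-token pricing concrete, the sketch below multiplies a token count by the rates in the table. It assumes the listed rate applies equally to every token in a request (prompt and completion alike), which this page does not state. The names CREDITS_PER_TOKEN and estimate_credits are hypothetical.

```python
from decimal import Decimal

# Rates copied from the table above, in credits per token.
CREDITS_PER_TOKEN = {
    "deepseek-v4-flash": Decimal("0.0004"),
    "deepseek-v4-pro": Decimal("0.002"),
    "deepseek-reasoner": Decimal("0.002"),
    "claude-sonnet-4-20250514": Decimal("0.006"),
    "claude-haiku-3.5": Decimal("0.002"),
    "o3-mini": Decimal("0.004"),
    "gpt-4o": Decimal("0.01"),
    "gpt-4o-mini": Decimal("0.0008"),
}


def estimate_credits(model: str, total_tokens: int) -> Decimal:
    """Estimate the credits for a request, assuming one flat rate for all tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    # Decimal avoids the rounding noise that binary floats add to small rates.
    return CREDITS_PER_TOKEN[model] * total_tokens


print(estimate_credits("deepseek-v4-flash", 1000))
print(estimate_credits("gpt-4o", 1000))
```

Under that assumption, 1,000 tokens cost 0.4 credits on deepseek-v4-flash and 10 credits on gpt-4o, a 25-fold difference.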

Authentication

All API requests require a Bearer token in the Authorization header. Pass your API key as the token:

Authorization: Bearer sk-sage-your-api-key

API keys can be managed through the DevBlock Console. Each request must include a valid API key with sufficient credits for the operation.
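
One way to keep the key out of source code is to read it from the environment when building the header. This is a minimal sketch: the environment variable name SAGE_API_KEY and the helper auth_headers are hypothetical conventions, not something the service defines.

```python
import os


def auth_headers(env_var: str = "SAGE_API_KEY") -> dict:
    """Return the Authorization header, reading the key from an environment variable."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError(f"Set the {env_var} environment variable to your Sage API key")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be merged into the headers of any request to the API.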