Sage Inference
High-performance LLM inference API. OpenAI-compatible, multi-model, streaming, and multi-modal. Base URL: https://sage-api.devblocktechnologies.com
Overview
Sage Inference provides a single API for language models from DeepSeek, Anthropic, OpenAI, and other providers. The API follows the OpenAI API format, so existing OpenAI client SDKs work once they are pointed at the Sage base URL.
OpenAI-compatible API
Drop-in replacement for the OpenAI API. Works with any existing OpenAI SDK.
Multiple Models
Access 8+ models across providers from a single endpoint.
Streaming
Server-sent events streaming for real-time token delivery.
Multi-modal
Vision, audio, and image generation capabilities.
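As a rough illustration of the streaming feature listed above, the sketch below extracts text from a server-sent events stream. It assumes the stream follows the OpenAI chat-completions chunk format (`data:` lines carrying JSON with `choices[].delta.content`, terminated by `data: [DONE]`); this page does not spell out the wire format, so treat that as an assumption. `iter_stream_text` is a hypothetical helper name, and the sample lines are made up for the demonstration.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        # Only "data:" lines carry payloads; skip blanks and other SSE fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Assumed end-of-stream sentinel, as used by the OpenAI API.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines standing in for a real response body.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body from a request sent with `"stream": true`, again assuming the OpenAI request convention applies.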
Quick Start
Send your first chat completion request with curl:
curl https://sage-api.devblocktechnologies.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-sage-your-api-key" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
Available Models
| Model ID | Description | Cost |
|---|---|---|
| deepseek-v4-flash | Fast chat | 0.0004 credits/token |
| deepseek-v4-pro | Advanced reasoning/coding | 0.002 credits/token |
| deepseek-reasoner | Deep reasoning | 0.002 credits/token |
| claude-sonnet-4-20250514 | Anthropic's most balanced model | 0.006 credits/token |
| claude-haiku-3.5 | Fast Claude | 0.002 credits/token |
| o3-mini | OpenAI reasoning | 0.004 credits/token |
| gpt-4o | OpenAI multimodal flagship | 0.01 credits/token |
| gpt-4o-mini | Lightweight OpenAI | 0.0008 credits/token |
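The per-token rates in the table can be turned into a quick budget estimate. The sketch below is only an approximation: it assumes the listed rate applies equally to prompt and completion tokens, which this page does not state. `CREDITS_PER_TOKEN` and `estimate_credits` are hypothetical names introduced for the example.

```python
from decimal import Decimal

# Rates copied from the table above, in credits per token.
CREDITS_PER_TOKEN = {
    "deepseek-v4-flash": Decimal("0.0004"),
    "deepseek-v4-pro": Decimal("0.002"),
    "deepseek-reasoner": Decimal("0.002"),
    "claude-sonnet-4-20250514": Decimal("0.006"),
    "claude-haiku-3.5": Decimal("0.002"),
    "o3-mini": Decimal("0.004"),
    "gpt-4o": Decimal("0.01"),
    "gpt-4o-mini": Decimal("0.0008"),
}


def estimate_credits(model: str, total_tokens: int) -> Decimal:
    """Estimate the credit cost of a request.

    Assumption: one flat rate covers both prompt and completion tokens.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    # Decimal avoids binary floating-point rounding on small rates.
    return CREDITS_PER_TOKEN[model] * total_tokens


print(estimate_credits("deepseek-v4-flash", 1000))
print(estimate_credits("gpt-4o", 256))
```

Under that assumption, 1,000 tokens on deepseek-v4-flash come to 0.4 credits, while the same volume on gpt-4o costs 10 credits, a 25x difference.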
Authentication
Every API request must carry your API key as a Bearer token in the Authorization header:
Authorization: Bearer sk-sage-your-api-key
API keys can be managed through the DevBlock Console. Each request must include a valid API key with sufficient credits for the operation.
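For readers not using curl, here is a minimal sketch of the same authenticated request built with Python's standard library. The request is constructed but not sent. `SAGE_API_KEY` is a hypothetical environment variable name and `build_chat_request` a hypothetical helper; neither comes from the Sage documentation.

```python
import json
import os
import urllib.request

BASE_URL = "https://sage-api.devblocktechnologies.com"


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with Bearer auth."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # The API key travels as a Bearer token on every request.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# SAGE_API_KEY is an assumed variable name; the fallback is the placeholder key.
api_key = os.environ.get("SAGE_API_KEY", "sk-sage-your-api-key")
request = build_chat_request(
    api_key,
    "deepseek-v4-flash",
    [{"role": "user", "content": "Hello, how are you?"}],
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. Reading the key from the environment rather than hard-coding it keeps it out of source control.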