Full API documentation with code examples. NLC ships four protocols: OpenAI Chat, OpenAI Completions, OpenAI Responses, and Anthropic Messages — all under our brand, all billed in credits.
Start here · Cursor / VS Code / OpenAI SDK
| Base URL | https://oddsforge.org/v1 |
| Endpoint | /chat/completions (or /messages for Anthropic SDK) |
| Auth header | Authorization: Bearer nlk_… (from Dashboard) |
| Recommended model | nlc-shiplow · budget multi-agent · live default |
| Other models | nlc-fast · nlc-vision · nlc-pro · nlc-ultra |
Cursor setup: Settings → Models → Override OpenAI Base URL → paste the Base URL above → paste your nlk_ key → Verify → add custom model id nlc-shiplow. See full guide.
Which endpoint does Cursor call? Cursor uses /chat/completions automatically — you don't pick the path, just the Base URL. If you switch the provider to Anthropic in Cursor's settings, it calls /messages instead. Both work with every NLC model.
| Endpoint | Method | Protocol | Use |
|---|---|---|---|
| /v1/chat/completions | POST | OpenAI | Chat (streaming + tools) |
| /v1/completions | POST | OpenAI | Raw text completion (custom prompt templates) |
| /v1/responses | POST | OpenAI Responses | Stateful conversations + tool use (previous_response_id) |
| /v1/messages | POST | Anthropic | Anthropic SDK compatible |
| /v1/embeddings | POST | OpenAI | Embeddings for RAG / semantic search |
| /v1/rerank | POST | NLC | Document reranking for second-stage RAG |
| /v1/models | GET | OpenAI | List chat models (brand ids only) |
| /v1/available-models | GET | NLC | Full catalog with capabilities + pricing |
| /v1/images/generations | POST | OpenAI | Image generation (when configured) |
Not every model works on every endpoint. Here's the full compatibility matrix — and the difference between each protocol so you know which one to pick.
| Endpoint | PRO | Vision | Fast | ULTRA | Embed | Rerank | Best for |
|---|---|---|---|---|---|---|---|
| /v1/chat/completions | ✓ | ✓ | ✓ | ✓ | — | — | Chat, code, vision, multi-agent (the main endpoint) |
| /v1/completions | ✓ | ✓ | ✓ | ✓ | — | — | Raw text generation, custom prompt templates, base models |
| /v1/responses | ✓ | ✓ | ✓ | ✓ | — | — | Stateful conversations (previous_response_id), MCP tools |
| /v1/messages | ✓ | ✓ | ✓ | ✓ | — | — | Anthropic SDK (Claude Code, Anthropic Python/JS SDK) |
| /v1/embeddings | — | — | — | — | ✓ | — | Vector embeddings for RAG / semantic search |
| /v1/rerank | — | — | — | — | — | ✓ | Rerank documents for second-stage RAG retrieval |
OpenAI Chat Completions — /v1/chat/completions
The standard endpoint for chat. Send messages (system + user + assistant), get a response. Supports streaming, tool calling, vision (image_url), and structured output (JSON schema). Use this with Cursor, VS Code (Continue, Copilot, Cline), any OpenAI SDK. Works with all 5 chat models (ShipLow, Fast, Vision, PRO, ULTRA MAX).
OpenAI Completions — /v1/completions
Raw text completion — you provide a prompt string (not messages), and the model continues it. Use this when you need custom prompt formatting (few-shot, base models, legacy apps). Same models as Chat Completions.
OpenAI Responses — /v1/responses
Stateful conversations — the server stores your conversation, so you can continue with previous_response_id without resending the full history. Supports MCP/SSE server tools and client function tools. Use this for multi-turn apps where you don't want to manage message history yourself. Same models as Chat Completions.
Anthropic Messages — /v1/messages
For Claude Code and the Anthropic Python/JS SDKs. Same models as Chat Completions, same billing, same NLC key; only the protocol shape differs. Base URL is https://oddsforge.org/inference (the SDK appends /v1/messages).
Embeddings — /v1/embeddings
Convert text to vectors for RAG / semantic search. Only works with NLC Embed 1.0. Supports variable-length output via the dimensions parameter.
Rerank — /v1/rerank
Rerank a list of documents against a query — second-stage RAG retrieval. Only works with NLC Rerank 1.0. Send a query + array of documents, get back relevance scores.
Get your API key from the Dashboard. Pass it as a Bearer token on every request:
Authorization: Bearer nlk_xxxxxxxxxxxxxxxx
For the Anthropic SDK you can also use the x-api-key header (Anthropic convention) — both work.
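As a minimal sketch of the two header styles described above, the helper below builds either form. The name `build_auth_headers` and its `style` argument are illustrative only and are not part of any NLC SDK; the `nlk_` prefix check simply mirrors the key format shown in this section.

```python
def build_auth_headers(api_key: str, style: str = "bearer") -> dict[str, str]:
    """Return request headers for an NLC key (hypothetical helper)."""
    # Keys shown in this document start with "nlk_"; reject anything else early.
    if not api_key.startswith("nlk_"):
        raise ValueError("NLC keys are expected to start with 'nlk_'")
    if style == "bearer":
        # OpenAI-style header, usable on every endpoint.
        return {"Authorization": f"Bearer {api_key}"}
    if style == "anthropic":
        # Anthropic-style header, accepted for the Anthropic SDK.
        return {"x-api-key": api_key}
    raise ValueError(f"unknown header style: {style!r}")


print(build_auth_headers("nlk_example"))
print(build_auth_headers("nlk_example", style="anthropic"))
```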
The main endpoint for chat. Supports streaming, tool calling, structured output, and vision (image_url content parts).
POST https://oddsforge.org/v1/chat/completions
{
  "model": "nlc-shiplow",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum computing in simple terms."}
  ],
  "stream": true,
  "max_tokens": 4096
}

# Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(base_url="https://oddsforge.org/v1", api_key="nlk_your_key_here")
response = client.chat.completions.create(
    model="nlc-shiplow",  # budget default · vision + code + multilingual
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

// JavaScript (OpenAI SDK)
import OpenAI from 'openai';

const client = new OpenAI({ baseURL: 'https://oddsforge.org/v1', apiKey: 'nlk_your_key_here' });
const response = await client.chat.completions.create({
  model: 'nlc-shiplow', // budget default · vision + code + multilingual
  messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(response.choices[0].message.content);
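The request body above sets stream: true, while the SDK examples read a complete response. The sketch below shows one way streamed output could be reassembled. It assumes NLC emits OpenAI-style server-sent events (`data: {json}` lines ending with `data: [DONE]`), which this page implies but does not spell out. The function name `collect_stream_text` and the sample lines are made up for illustration.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the text deltas from OpenAI-style SSE lines (assumed format)."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Hypothetical sample stream, shaped like OpenAI chat completion chunks.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))
```

With the OpenAI SDK you would normally not parse these lines yourself: passing `stream=True` to `client.chat.completions.create` returns an iterator of chunks that expose the same `choices[0].delta.content` field.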
Use nlc-vision with image_url content parts (URL or base64 data URI).
response = client.chat.completions.create(
    model="nlc-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
        ],
    }],
)
# Tool calling: declare a function the model may ask the client to run.
response = client.chat.completions.create(
    model="nlc-pro",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
)
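The request above only declares the tool; the client still has to run it when the model asks. A minimal sketch of that step follows, assuming NLC returns tool calls in the standard OpenAI Chat Completions shape (an `id` plus a `function` object whose `arguments` field is a JSON string). The `get_weather` body, the `TOOLS` registry, and the sample call are placeholders for illustration.

```python
import json


def get_weather(city: str) -> dict:
    # Stand-in implementation; a real tool would query a weather service.
    return {"city": city, "forecast": "sunny"}


# Hypothetical registry mapping tool names to local Python functions.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute each requested tool and build the follow-up 'tool' messages."""
    messages = []
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        result = fn(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result back to the request
            "content": json.dumps(result),
        })
    return messages


# Made-up example of what the model might request for the Tokyo question.
example_calls = [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'},
}]
tool_messages = run_tool_calls(example_calls)
print(tool_messages)
```

In a full loop you would append the assistant message that contained the tool calls, then these tool messages, and send the conversation back to /v1/chat/completions for the final answer.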
# Structured output: constrain the reply to a JSON schema.
response = client.chat.completions.create(
    model="nlc-pro",
    messages=[{"role": "user", "content": "Extract: Alice, 30, engineer"}],
    response_format={"type": "json_schema", "json_schema": {
        "name": "person",
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "number"}, "job": {"type": "string"}},
            "required": ["name", "age", "job"],
        },
    }},
)
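Even with a schema attached, it can be worth checking the returned JSON on the client before using it. The sketch below parses a reply string and verifies the three required fields from the `person` schema above. `parse_person` is a hypothetical helper, and the sample string stands in for `response.choices[0].message.content`.

```python
import json

# Field names and types mirror the "person" schema in the request above.
EXPECTED_TYPES = {"name": str, "age": (int, float), "job": str}


def parse_person(content: str) -> dict:
    """Parse the model's JSON reply and check the required fields."""
    data = json.loads(content)
    for field, expected in EXPECTED_TYPES.items():
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"unexpected type for {field}: {type(value).__name__}")
    return data


# Illustrative reply text, not captured from a real call.
person = parse_person('{"name": "Alice", "age": 30, "job": "engineer"}')
print(person["name"], person["age"], person["job"])
```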
For raw text generation with custom prompt templates. Use this when you need full control over prompt formatting (base models, few-shot, legacy).
POST https://oddsforge.org/v1/completions
{
  "model": "nlc-fast",
  "prompt": "Once upon a time",
  "max_tokens": 100,
  "temperature": 0.7
}

response = client.completions.create(
    model="nlc-fast",
    prompt="Task: classify sentiment.\nText: I love it.\nSentiment:",
)
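Since this endpoint takes a single prompt string, few-shot formatting is up to the caller. Below is a small sketch that assembles a prompt in the same "Task / Text / Sentiment" layout as the example above. The helper name `build_few_shot_prompt` and the labelled examples are invented for illustration.

```python
def build_few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Lay out labelled examples followed by the unlabelled query."""
    lines = [f"Task: {task}"]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Sentiment: {label}")
    lines.append(f"Text: {query}")
    lines.append("Sentiment:")  # left open for the model to complete
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "classify sentiment.",
    [("Terrible service.", "negative"), ("Works great.", "positive")],
    "I love it.",
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument to `client.completions.create`.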
Stateful conversations and advanced tool use. Continue chats with previous_response_id without resending the full history. Supports MCP/SSE server tools and client function tools. Stored by default — set store: false to opt out.
POST https://oddsforge.org/v1/responses
{
  "model": "nlc-pro",
  "input": "What's the capital of France?",
  "max_output_tokens": 200
}

first = client.responses.create(model="nlc-pro", input="Tell me a joke")
second = client.responses.create(
    model="nlc-pro",
    input="Tell me another one",
    previous_response_id=first.id,  # continues the conversation
)
Responses are stored upstream so previous_response_id works. Listing and deleting stored responses is not supported on NLC (those operations are account-scoped upstream, so we disable them to keep user data isolated). Set store: false on create to skip storage, or save the response id on your side.
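To make the storage options concrete, here is a sketch of the request bodies involved, using only the fields named in this section (`model`, `input`, `previous_response_id`, `store`). `build_response_request` is a hypothetical helper and `resp_123` is a placeholder id, not a real response id format.

```python
from typing import Optional


def build_response_request(
    model: str,
    user_input: str,
    previous_response_id: Optional[str] = None,
    store: bool = True,
) -> dict:
    """Build a JSON body for POST /v1/responses (illustrative helper)."""
    body = {"model": model, "input": user_input}
    if previous_response_id is not None:
        # Continue an earlier, stored response instead of resending history.
        body["previous_response_id"] = previous_response_id
    if not store:
        # Storage is on by default, so only the opt-out needs to be sent.
        body["store"] = False
    return body


first_turn = build_response_request("nlc-pro", "Tell me a joke")
follow_up = build_response_request("nlc-pro", "Tell me another one", previous_response_id="resp_123")
one_off = build_response_request("nlc-pro", "Summarize this text", store=False)
print(first_turn, follow_up, one_off, sep="\n")
```

Because continuation relies on the stored response, a request sent with store: false presumably cannot be referenced later through previous_response_id.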
Use the Anthropic Python or TypeScript SDK against our brand. Base URL is https://oddsforge.org/inference (the SDK appends /v1/messages).
POST https://oddsforge.org/v1/messages
{
  "model": "nlc-pro",
  "max_tokens": 256,
  "messages": [{"role": "user", "content": "Say hello in Spanish. Reply in one word."}]
}

# Python (Anthropic SDK)
import anthropic

client = anthropic.Anthropic(
    api_key="nlk_your_key_here",
    base_url="https://oddsforge.org/inference",
)
response = client.messages.create(
    model="nlc-pro",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.content[0].text)

// TypeScript (Anthropic SDK)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "nlk_your_key_here",
  baseURL: "https://oddsforge.org/inference",
});
const response = await client.messages.create({
  model: "nlc-pro",
  max_tokens: 256,
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.content[0].text);
Note: max_tokens is optional on NLC (required on Anthropic). Server-side tool families (code execution, memory, web fetch, web search) are not supported. Tool calling, streaming, structured output, reasoning, and vision work as expected.
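The SDK examples read `response.content[0].text`, which assumes the first content block is text. When tool calling is in play, a reply can mix text and tool_use blocks, so a slightly more defensive reader may help. The sketch below works on the raw JSON form of the `content` array and assumes NLC mirrors the Anthropic block shapes; `text_from_blocks` and the sample blocks are illustrative.

```python
def text_from_blocks(content: list[dict]) -> str:
    """Concatenate the text blocks of an Anthropic-style content array."""
    return "".join(
        block.get("text", "")
        for block in content
        if block.get("type") == "text"  # ignore tool_use and other block types
    )


# Hypothetical content array mixing a text block and a tool_use block.
sample_content = [
    {"type": "text", "text": "Hola"},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "Tokyo"}},
]
print(text_from_blocks(sample_content))
```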
Generate vector embeddings for RAG and semantic search. NLC Embed 1.0 supports variable-length output via the dimensions parameter.
POST https://oddsforge.org/v1/embeddings
{
  "model": "nlc-embed",
  "input": "The quick brown fox jumps over the lazy dog"
}

response = client.embeddings.create(
    model="nlc-embed",
    input="The quick brown fox jumps over the lazy dog",
    dimensions=128,  # optional: variable-length embeddings
)
print(response.data[0].embedding[:8])
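Once you have vectors, semantic search usually comes down to comparing them. The sketch below ranks documents by cosine similarity to a query vector. Cosine similarity is a common choice rather than something this page prescribes, and the tiny two-dimensional vectors are toy data standing in for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: list[float], doc_vecs: list[list[float]]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real ones would come from response.data[i].embedding.
ranking = rank_by_similarity([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print(ranking)
```

Query and document vectors need the same length, so use the same `dimensions` value for both.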
Rerank a list of documents against a query. Use NLC Rerank 1.0 after a first-stage vector search to boost retrieval quality.
POST https://oddsforge.org/v1/rerank
{
  "model": "nlc-rerank",
  "query": "What is the capital of France?",
  "documents": [
    "Paris is the capital of France.",
    "France is in Western Europe.",
    "Python is a programming language."
  ],
  "top_n": 2,
  "return_documents": true
}

import requests

r = requests.post(
    "https://oddsforge.org/v1/rerank",
    headers={"Authorization": "Bearer nlk_your_key_here"},
    json={
        "model": "nlc-rerank",
        "query": "capital of France",
        "documents": ["Paris is the capital of France.", "Bananas are yellow."],
        "top_n": 1,
    },
)
print(r.json()["results"])
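This page does not show the shape of each entry in `results`. The sketch below assumes the common rerank convention of an `index` into the submitted documents plus a `relevance_score`; treat both field names as assumptions and check a live response before relying on them. The scores in the sample are invented.

```python
def order_documents(documents: list[str], results: list[dict]) -> list[tuple[str, float]]:
    """Map rerank results back to document text, best match first."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    # "index" is assumed to point into the original documents list.
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]


docs = [
    "Paris is the capital of France.",
    "France is in Western Europe.",
    "Python is a programming language.",
]
# Made-up scores in the assumed result shape.
sample_results = [
    {"index": 1, "relevance_score": 0.41},
    {"index": 0, "relevance_score": 0.97},
]
for text, score in order_documents(docs, sample_results):
    print(f"{score:.2f}  {text}")
```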
All five chat models support streaming and tool calling. The Fusion Router can auto-select the best model, or you can pass a model id explicitly. Context and output limits are set per model; there is no global cap, so long answers are supported. ShipLow (budget) and ULTRA MAX (premium) are multi-agent pipelines that route your request to the right executor for the task.
| Brand | Model ID | Context | Max Output | Best For |
|---|---|---|---|---|
| NLC ShipLow 1.0 BUDGET | nlc-shiplow | 131K | 32K | Budget default — six-lane multi-agent (code/agent/math/vision/multilingual/general) |
| NLC Fast 1.0 | nlc-fast | 1M | 900K | Cheap single-model, high-throughput coding |
| NLC Vision 1.0 | nlc-vision | 256K | 200K | Images, visual understanding |
| NLC PRO 1.0 | nlc-pro | 1M | 1.04M | Code, reasoning, long docs, large codebases (flagship) |
| NLC ULTRA MAX 1.0 | nlc-ultra | 1M | 1.04M | Full-stack features (premium multi-agent router) |
| NLC Embed 1.0 | nlc-embed | 8K | — | Embeddings for RAG / semantic search |
| NLC Rerank 1.0 | nlc-rerank | 32K | — | Document reranking (second-stage RAG) |
1 credit = $0.001. Usage is charged on the actual upstream token counts returned in the response's usage field. Cached input tokens are billed at a discount.
| Model | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Credits / 1K in | Credits / 1K out |
|---|---|---|---|---|
| NLC ShipLow | $0.30 | $0.60 | 0.30 | 0.60 |
| NLC Fast | $0.45 | $0.90 | 0.45 | 0.90 |
| NLC Vision | $1.80 | $5.50 | 1.80 | 5.50 |
| NLC PRO | $3.20 | $9.80 | 3.20 | 9.80 |
| NLC ULTRA MAX | $3.50 | $11.00 | 3.50 | 11.00 |
| NLC Embed | $0.20 | — | 0.20 | — |
| NLC Rerank | $0.20 | — | 0.20 | — |
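To see how the credit rates translate into a bill, the sketch below estimates the cost of one chat request from its token counts. The rates are copied from the "Credits / 1K" columns for the five chat models. `estimate_cost` is a hypothetical helper, and it ignores the cached-token discount because the discount rate is not stated here.

```python
from decimal import Decimal

USD_PER_CREDIT = Decimal("0.001")

# (credits per 1K input tokens, credits per 1K output tokens), from the table above.
CREDITS_PER_1K = {
    "nlc-shiplow": (Decimal("0.30"), Decimal("0.60")),
    "nlc-fast": (Decimal("0.45"), Decimal("0.90")),
    "nlc-vision": (Decimal("1.80"), Decimal("5.50")),
    "nlc-pro": (Decimal("3.20"), Decimal("9.80")),
    "nlc-ultra": (Decimal("3.50"), Decimal("11.00")),
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> tuple[Decimal, Decimal]:
    """Return (credits, US dollars) for one request, without cache discounts."""
    rate_in, rate_out = CREDITS_PER_1K[model]
    credits = (rate_in * prompt_tokens + rate_out * completion_tokens) / 1000
    return credits, credits * USD_PER_CREDIT


credits, dollars = estimate_cost("nlc-pro", prompt_tokens=2000, completion_tokens=500)
print(f"{credits} credits = ${dollars}")
```

In practice the token counts would come from the `usage` object of the response, for example `usage.prompt_tokens` and `usage.completion_tokens` on the OpenAI-style endpoints.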
- stream: true (SSE).
- tool_choice: auto on all chat models.
- response_format: json_schema or json_object.
- image_url content parts (NLC Vision 1.0, NLC ULTRA MAX 1.0).
- Cached tokens (usage.prompt_tokens_details.cached_tokens).
- previous_response_id without resending history.

60 requests per minute per user for chat / completions / responses / messages / embeddings / rerank. 10 per minute for image generation.
{"error": "out of credits - please top up at the dashboard"} // 402
{"error": "unauthorized"} // 401
{"error": "rate limit exceeded — try again in a minute"} // 429
{"error": "please verify your email first"} // 403
Upstream errors are forwarded with provider names redacted — customers never see any third-party provider or model name in error messages, only NLC brand names.
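Of the four errors listed above, only the 429 is worth retrying; the others need action from the user. A minimal sketch of that policy follows. The `send` callable, the hint texts, and the backoff values are assumptions made for illustration. Since the limit is counted per minute, a longer delay than the one-second base used here may be more realistic.

```python
import time
from typing import Callable

# Non-retryable statuses from the list above, with a short hint for the caller.
FATAL_HINTS = {
    401: "check the API key",
    402: "top up credits in the dashboard",
    403: "verify your email",
}


def call_with_retry(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() and retry only on HTTP 429, with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status == 429 and attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
            continue
        hint = FATAL_HINTS.get(status, "see the error message")
        raise RuntimeError(f"HTTP {status}: {body.get('error')} ({hint})")
    raise RuntimeError("max_attempts must be at least 1")


# Simulated transport: one rate-limit response, then success.
replies = iter([
    (429, {"error": "rate limit exceeded"}),
    (200, {"ok": True}),
])
waits = []
result = call_with_retry(lambda: next(replies), sleep=waits.append)
print(result, waits)
```

In real use, `send` would wrap an HTTP call such as `requests.post(...)` and return `(r.status_code, r.json())`.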