Documentation

Everything you need to integrate NLC AI into your workflow.

Available Models

Seven models across two categories. The five chat models all stream and support tool calling. Both multi-agent tiers (ShipLow and ULTRA MAX) route your request to the right executor automatically. Max output is set per model, with no global cap, so long answers are supported. NLC Embed and NLC Rerank are RAG helpers.

Model	ID	Context	Max Output	Capabilities	Best For
NLC ShipLow 1.0 · budget · NEW	`nlc-shiplow`	131,072	32,768	Text, Vision, Reasoning, Tools, Multi-Agent, Structured Output	Budget-friendly default. Six-lane router covers code, agent, math, vision, multilingual, and general chat at the lowest price
NLC Fast 1.0 · cheap	`nlc-fast`	1,048,576	900,000	Text, Reasoning, Tools, Long Context	High-throughput coding and agentic workloads
NLC Vision 1.0	`nlc-vision`	262,144	200,000	Text, Vision, Reasoning, Tools, Long Context	Images, screenshots, visual understanding
NLC PRO 1.0 · flagship	`nlc-pro`	1,048,576	1,044,000	Text, Reasoning, Tools, Structured Output, Long Context	Code, deep reasoning, long documents, large codebases
NLC ULTRA MAX 1.0 · premium	`nlc-ultra`	1,048,576	1,044,000	Text, Vision, Reasoning, Tools, Multi-Agent, Long Context	Full-stack features. Router dispatches FRONTEND to a vision model and BACKEND to a deep-reasoning model

Voice (ASR/TTS) is not shipped; there are no audio models in our catalog. Anthropic Messages, OpenAI Responses, Embeddings, and Rerank are all supported alongside OpenAI Chat, and every chat model works with every chat endpoint.

Quick Selection Guide

Task	Model
Budget-conscious default · tight budget · high volume	`nlc-shiplow`
Quick questions, high-throughput single-model	`nlc-fast`
Image or visual understanding	`nlc-vision`
Code analysis, debugging, long documents, large codebases	`nlc-pro`
Full-stack features (premium multi-agent)	`nlc-ultra`

NLC ShipLow 1.0 — how it works

ShipLow is a six-lane multi-agent. A fast router reads your request and dispatches it to the specialist executor that's strongest at that kind of work:

Lane	Triggers when	What you get
General	chat, Q&A, summaries, explanations	Fast all-rounder with tool-use
Code	single-file code, refactors, bug fixes	Competition-grade code generation
Agent	multi-file or repo-level engineering	Patch-style edits across files
Math	proofs, AIME-style problems, numeric reasoning	Careful step-by-step reasoning
Vision	any image is attached (auto-routed)	Reads screenshots, diagrams, UI mockups
Multilingual	request is primarily Thai / Indic / CJK / Arabic	Natural fluent output in the user's language

The executor streams the final answer back under the nlc-shiplow brand id — you never see which lane ran.
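Since the Vision lane triggers whenever an image is attached, a request only needs to carry the image in the message content. The sketch below builds such a message in the standard OpenAI content-part format; that NLC accepts inline base64 data URLs this way is an assumption based on the OpenAI compatibility described in this document, and `build_image_message` is a hypothetical helper name.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build one OpenAI-style user message carrying text plus an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Assumption: the image is passed as a base64 data URL.
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


# Placeholder bytes stand in for a real screenshot read from disk.
message = build_image_message("What does this screenshot show?", b"\x89PNG placeholder")
payload = {"model": "nlc-shiplow", "messages": [message]}
```

The resulting `payload` can be sent as the JSON body of a `/v1/chat/completions` request, or its `messages` list passed to `client.chat.completions.create`.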

NLC ULTRA MAX 1.0 — how it works

ULTRA MAX is the premium three-lane multi-agent:

Agent 1 (router) classifies your task as FRONTEND, BACKEND, or GENERAL and returns strict JSON.
Router logic picks the executor based on the task type.
Agent 2 (vision-capable model) handles FRONTEND tasks — UI, components, styling.
Agent 3 (deep-reasoning model) handles BACKEND and GENERAL tasks — servers, APIs, databases, reasoning.
The executor streams the final answer back to you under the nlc-ultra brand id — you never see which sub-agent ran.
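The dispatch step described above can be pictured roughly as follows. This is an illustrative sketch only, not NLC's implementation: the JSON field name `task_type` and the executor labels are hypothetical, since the router's actual schema and the sub-agents are not exposed.

```python
import json

# Hypothetical executor labels; the real sub-agents are never shown to clients.
EXECUTORS = {
    "FRONTEND": "vision-executor",
    "BACKEND": "reasoning-executor",
    "GENERAL": "reasoning-executor",
}


def pick_executor(router_output: str) -> str:
    """Parse the router's JSON verdict and return the executor label."""
    verdict = json.loads(router_output)   # strict JSON, so malformed output raises
    task_type = verdict["task_type"]      # assumed field name
    if task_type not in EXECUTORS:
        raise ValueError(f"unexpected task type: {task_type!r}")
    return EXECUTORS[task_type]
```

Requiring strict JSON from the router keeps this step a plain dictionary lookup: anything the parser or the lookup rejects can be treated as a routing failure rather than guessed at.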

Pricing is composite: you pay the ULTRA MAX rate, which covers router and executor cost in one transparent deduction. The router has a 1M-token context window, so you can upload a whole repo and the router will pass only the relevant parts to the executor.

Quick Start

1. Get your API key

2. Make your first call

curl

curl https://oddsforge.org/v1/chat/completions \
  -H "Authorization: Bearer nlk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"nlc-shiplow","messages":[{"role":"user","content":"Hello!"}]}'

Tip: nlc-shiplow is the budget-friendly default. Swap to nlc-pro for the hardest tasks, or nlc-ultra for premium multi-agent.

3. Choose your integration

OpenAI Protocol

Fully OpenAI-compatible. Use any OpenAI SDK or client.

Base URL

https://oddsforge.org/v1

Python

python

from openai import OpenAI

client = OpenAI(base_url="https://oddsforge.org/v1", api_key="nlk_your_key")

response = client.chat.completions.create(
    model="nlc-pro",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

JavaScript

javascript

import OpenAI from 'openai';
const client = new OpenAI({ baseURL: 'https://oddsforge.org/v1', apiKey: 'nlk_your_key' });

const response = await client.chat.completions.create({
    model: 'nlc-pro',
    messages: [{ role: 'user', content: 'Hello!' }],
    stream: true
});
for await (const chunk of response) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Endpoints

Endpoint	Method	Description
`/v1/chat/completions`	POST	Chat completions (streaming)
`/v1/completions`	POST	Raw text completion (custom prompt templates)
`/v1/models`	GET	List models

Voice (ASR/TTS) is not shipped. The upstream provider has no audio models. The Anthropic protocol is shipped; see the Anthropic Messages API section.

Anthropic Messages API

Use the Anthropic Python or TypeScript SDK with NLC. Same models, same billing, same NLC key. The base URL is https://oddsforge.org/inference (the SDK appends /v1/messages automatically).

Python

python

import anthropic

client = anthropic.Anthropic(
    api_key="nlk_your_key_here",
    base_url="https://oddsforge.org/inference",
)

response = client.messages.create(
    model="nlc-pro",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello in Spanish. Reply in one word."}]
)
print(response.content[0].text)

JavaScript / TypeScript

typescript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "nlk_your_key_here",
  baseURL: "https://oddsforge.org/inference",
});

const response = await client.messages.create({
  model: "nlc-pro",
  max_tokens: 256,
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.content[0].text);

curl

bash

curl https://oddsforge.org/v1/messages \
  -H "Authorization: Bearer nlk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nlc-pro",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Compatibility notes: max_tokens is optional on NLC (required on Anthropic). Server-side tool families (code execution, memory, web fetch, web search) are not supported. Tool calling, streaming, structured output, reasoning, and vision all work as expected.

Responses API

The Responses API is the most powerful surface: stateful conversations, advanced tool use (MCP/SSE server tools, client function tools), and continuation via previous_response_id without resending history. Responses are stored by default.

Create a response

python

from openai import OpenAI
client = OpenAI(base_url="https://oddsforge.org/v1", api_key="nlk_your_key")

response = client.responses.create(
    model="nlc-pro",
    input="What's the capital of France?"
)
print(response.output[-1].content[0].text)

Continue with previous_response_id

python

first = client.responses.create(model="nlc-pro", input="Tell me a joke")
second = client.responses.create(
    model="nlc-pro",
    input="Tell me another one",
    previous_response_id=first.id   # continues the conversation
)

Stream

python

stream = client.responses.create(
    model="nlc-pro",
    input="Write a haiku about the ocean.",
    stream=True
)
for chunk in stream:
    print(chunk)

Function tools

python

response = client.responses.create(
    model="nlc-pro",
    input="What's the weather in Tokyo?",
    tools=[{
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }],
    tool_choice="auto"
)

Set store: false to skip storage (then previous_response_id won't work for that response). Listing and deleting stored responses is not supported on NLC — those operations are account-scoped upstream and would expose other users' data, so we disable them. Save response ids on your side if you need to track them.
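When the model answers the function tools request with a tool call, the client runs the function and sends the result back. The sketch below shows that client-side step on plain dictionaries, assuming NLC returns the same `function_call` output items (with `name`, `arguments`, and `call_id`) as the OpenAI Responses API; `get_weather` here is a stand-in implementation.

```python
import json


def get_weather(city: str) -> str:
    """Stand-in implementation; replace with a real lookup."""
    return f"Sunny in {city}"


TOOLS = {"get_weather": get_weather}


def run_function_calls(output_items: list) -> list:
    """Execute each function_call item and build the matching function_call_output items."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue  # skip messages, reasoning items, and so on
        args = json.loads(item["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[item["name"]](**args)
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": output,
        })
    return results
```

With the OpenAI SDK, the items could be obtained with `[item.model_dump() for item in response.output]`, and the returned list passed as `input` in a follow-up `client.responses.create` call together with `previous_response_id=response.id`. That continuation only works when the first response was stored.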

Embeddings & Rerank

Two purpose-built models for RAG and semantic search. NLC Embed 1.0 generates vector embeddings. NLC Rerank 1.0 reranks retrieved documents against a query for better final ordering.

Generate embeddings

python

from openai import OpenAI
client = OpenAI(base_url="https://oddsforge.org/v1", api_key="nlk_your_key")

response = client.embeddings.create(
    model="nlc-embed",
    input="The quick brown fox jumps over the lazy dog",
    dimensions=128   # optional — variable-length embeddings
)
print(response.data[0].embedding[:8])

Batch embeddings

python

response = client.embeddings.create(
    model="nlc-embed",
    input=["first document", "second document", "third document"]
)
for item in response.data:
    print(item.index, len(item.embedding))

Rerank documents

python

import requests
r = requests.post("https://oddsforge.org/v1/rerank",
    headers={"Authorization": "Bearer nlk_your_key"},
    json={
        "model": "nlc-rerank",
        "query": "What is the capital of France?",
        "documents": [
            "Paris is the capital of France.",
            "France is in Western Europe.",
            "Python is a programming language."
        ],
        "top_n": 2,
        "return_documents": True
    })
for result in r.json()["results"]:
    print(result["index"], result["relevance_score"], result["document"]["text"][:60])

Pricing: NLC Embed 1.0 and NLC Rerank 1.0 are both $0.20 / 1M input tokens (no output cost). Charged per actual upstream token usage.
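A common way to combine the two models is retrieve-then-rerank: shortlist documents by cosine similarity of their embeddings, then send only the shortlist to the rerank endpoint. The sketch below covers the shortlist step in plain Python; the function names are illustrative and the vectors are toy values rather than real nlc-embed output.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list, doc_vecs: list, k: int = 2) -> list:
    """Return the indices of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]
```

Shortlisting locally keeps the rerank request small, which matters because rerank is billed per input token.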

VS Code Integration

Continue.dev (Free, Recommended)

Install the Continue extension, then edit ~/.continue/config.yaml:

yaml

models:
  - name: NLC PRO 1.0
    provider: openai
    model: nlc-pro
    apiBase: https://oddsforge.org/v1
    apiKey: nlk_your_key
    roles: [chat, edit]
  - name: NLC Fast 1.0
    provider: openai
    model: nlc-fast
    apiBase: https://oddsforge.org/v1
    apiKey: nlk_your_key
    roles: [chat]

GitHub Copilot Chat

Open Copilot Chat and click the model selector (top-right)
Choose Manage Models, then Add Model, then "OpenAI Compatible"
Base URL: https://oddsforge.org/v1
API Key: your nlk_ key
Model: nlc-pro

Cline / Roo

Provider: OpenAI Compatible · Base URL: https://oddsforge.org/v1 · Key: your nlk_ key · Model: nlc-shiplow (budget) or nlc-pro (flagship).

Cursor (recommended for IDE coding)

Open Cursor → Settings (⌘,) → Models.
Disable every default provider (OpenAI, Anthropic, Google) so Cursor only uses NLC.
Scroll to the OpenAI API Key section → click Override OpenAI Base URL → paste https://oddsforge.org/v1
In the OpenAI API Key field → paste your nlk_ key (from the Dashboard).
Click Verify; it should succeed.
Under Models, click + Add model and add these custom ids one by one:
- nlc-shiplow · cheapest, multi-agent, vision · recommended default
- nlc-pro · flagship reasoning + code
- nlc-fast · cheap single-model, 1M context
- nlc-vision · multimodal
- nlc-ultra · premium multi-agent
Pick the one you want as the chat model in the top-right model dropdown.

Which model on Cursor?

· Right now (during the Fireworks-account top-up) → use nlc-shiplow. It runs on our free upstream, costs your account $0.30 input / $0.60 output per 1M tokens, and handles code, vision, and MCP.
· Maximum code quality (when budget allows) → nlc-pro · 1M context, deep reasoning.
· Multi-file refactors / agentic features → nlc-ultra · premium multi-agent router.

Which endpoint does Cursor actually call?

· When you paste a Base URL under OpenAI, Cursor calls /chat/completions automatically. You don't pick the path — just the Base URL.
· If you switch the provider to Anthropic in Cursor's settings and paste the Base URL there, Cursor calls /messages instead. Both endpoints work with every NLC model — same key, same models, same billing.
· You never need to type a path. Just give Cursor the Base URL + your nlk_ key + a custom model id.

CLI / Terminal

curl

bash

curl -N https://oddsforge.org/v1/chat/completions \
  -H "Authorization: Bearer nlk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"model":"nlc-pro","messages":[{"role":"user","content":"Hello!"}],"stream":true}'

NLC CLI

bash

npm install -g nlc-ai
nlc login https://oddsforge.org
nlc

Account management

Manage your API keys, credits, teams, webhooks, and usage from the dashboard or via the API. All endpoints are scoped to your account — one user can never see another user's data.

API keys (multi-key)

Each account starts with one auto-generated legacy key (nlk_…) shown on the dashboard. You can also create multiple named keys (e.g. one for VS Code, one for CI, one for prod) and rotate or disable them individually.

Endpoint	Method	Use
`/apikeys/list`	POST	List your keys (prefix only — full key shown once at create time)
`/apikeys/create`	POST	Create a new named key — body: `{name, rpmLimit?}`
`/apikeys/rotate`	POST	Rotate a key — old key stops working immediately, returns the new full key
`/apikeys/delete`	POST	Delete a key by id
`/apikeys/toggle`	POST	Enable/disable a key — body: `{id, enabled}`

python

from openai import OpenAI
# Use any of your keys interchangeably
client = OpenAI(base_url="https://oddsforge.org/v1", api_key="nlk_your_key_here")

Credits & top-up

Credits are deducted per actual upstream token usage, per the model's rate. Top up any amount ≥ $5 by card via Stripe — credits never expire. Bonus tiers kick in at $10 (+10%), $25 (+20%), $50 (+30%), $100 (+40%).

Endpoint	Method	Use
`/me`	GET	Your email, balance, api key, referral info
`/billing/packs`	GET	List suggested top-up packs + min/max amounts
`/billing/quote`	POST	Quote a checkout (credits + bonus for a given usd or pack)
`/billing/checkout`	POST	Create a Stripe Checkout session — body: `{pack}` or `{usd}`
`/billing/confirm`	POST	Confirm a checkout session by id (idempotent credit grant)

Sessions (chat history)

Each conversation is stored server-side under your account. You can list, load, save, clear, delete one, or delete all.

Endpoint	Method	Use
`/session/list`	GET	List your saved sessions (id + name + updatedAt)
`/session/load`	POST	Load a session's messages by id
`/session/save`	POST	Save a session (upsert) — body: `{id, messages, name?}`
`/session/clear`	POST	Empty a session's messages but keep the entry
`/session/delete`	POST	Delete one session by id
`/session/delete-all`	POST	Delete every session for your account

Teams (credit sharing)

Create a team, add members (owner/admin/member roles), and share a credit pool. Only team members can list members (no IDOR).

Endpoint	Method	Use
`/teams/list`	POST	Teams you belong to (with your role)
`/teams/create`	POST	Create a team — you become owner
`/teams/members`	POST	List members (only if you're a member)
`/teams/add-member`	POST	Add a member (owner/admin only)
`/teams/remove-member`	POST	Remove a member (role-tiered)
`/teams/delete`	POST	Delete a team (owner only)

Webhooks

Register a webhook URL and we'll POST to it on events: credits_low, payment. Signed with x-nlc-signature (HMAC-SHA256). Webhook URLs must be public http(s) — localhost, private IPs, link-local, and cloud-metadata endpoints are blocked (SSRF guard).

Endpoint	Method	Use
`/webhooks/list`	POST	Your webhooks
`/webhooks/create`	POST	Register — body: `{url, events}`
`/webhooks/delete`	POST	Delete by id
`/webhooks/toggle`	POST	Enable/disable
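On the receiving side, the x-nlc-signature header lets you check that a delivery really came from NLC. The sketch below is a minimal verification routine under stated assumptions: that the signature is the hex-encoded HMAC-SHA256 of the raw request body, and that you hold a shared signing secret. Neither detail is specified in this document, so confirm both before relying on it.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: bytes) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw body."""
    # Assumption: hex digest over the unmodified request bytes.
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

Verify against the raw bytes as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and break the match.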

Usage

See your own usage history (totals + daily breakdown + by-kind) for the last N days.

Endpoint	Method	Use
`/usage/summary?days=30`	GET	Totals, by-kind, daily breakdown
`/usage/recent`	POST	Recent events — body: `{limit}`

Referrals

Every account has a referral code. Share your link https://oddsforge.org/chat?ref=NGCXXXX — when a friend tops up ≥ $5, you earn bonus credits (capped at 20 rewarded friends to protect margin). See your referral info via /me.

Security & privacy

Authentication

Passwords are scrypt-hashed with per-user salt. Verification uses timingSafeEqual (constant-time).
API keys are scrypt-hashed at rest — the full key is shown once at creation, never retrievable.
Session tokens (nlt_…) are 24 random bytes, expire after 30 days, and are stored in the tokens table.
Owner login uses a separate master password (NLC_PASSWORD), rate-limited 10 attempts / 5 min per IP.
Auth endpoints (login, register, google, reset) are all per-IP rate-limited.
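The hashing scheme above (scrypt with a per-user salt, verified in constant time) can be illustrated in Python. This is a sketch of the general technique, not NLC's server code: the cost parameters and salt length are assumptions, and `hmac.compare_digest` plays the role that `timingSafeEqual` plays on the server.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed cost parameters; the values NLC uses are not stated in this document.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a scrypt digest, generating a fresh random salt when none is given."""
    salt = salt or os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Because each user has a separate salt, two users with the same password end up with different stored digests.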

Authorization

Owner-only: every /admin/* endpoint checks email === __owner__ and returns 403 otherwise.
Per-user scoping: API keys, sessions, webhooks, teams, and usage are all queried with WHERE email = ? — your key can only read your data.
Team membership check: /teams/members refuses to list a team's roster unless you're a member (no IDOR).
Responses API: list & delete are disabled because the upstream provider stores responses at the account level — enabling them would expose every other user's stored responses. Create with store: false or save response ids yourself.
Webhook URL validation: must be http(s), can't point at localhost / private IPs / link-local / cloud-metadata endpoints (SSRF guard).

Data privacy

Per-account isolation: every user's sessions, memory, teams, and webhooks live in their own row set.
No training on your data: conversations are never fed back into a model.
Deletable: any session, all sessions, any webhook, any API key — instant, permanent.
Upstream id masking: every chat/completions / messages / responses chunk is scrubbed — customers never see any upstream model id or provider name, only NLC brand ids.
Error sanitization: error messages are scrubbed of provider names, API keys, and upstream URLs before reaching the client.

HTTP security headers

X-Content-Type-Options: nosniff
X-Frame-Options: DENY — no clickjacking
Strict-Transport-Security: max-age=31536000; includeSubDomains — HTTPS only
Referrer-Policy: strict-origin-when-cross-origin
Content-Security-Policy — script-src restricted to self + accounts.google.com
Permissions-Policy — camera, geolocation disabled

Admin audit log

Every privileged owner action (login, credit adjustment, bypass toggle) is recorded in admin_audit with timestamp, actor, action, target, amount, IP, and detail. Visible to the owner in the admin panel.

SQL injection

Every query uses parameterized placeholders (? / $1). No string concatenation of user input into SQL. The DB layer (src/server/db.ts) is the single chokepoint for both PostgreSQL and SQLite.
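To make the difference concrete, the sketch below runs a parameterized, per-user query against an in-memory SQLite database. The server's DB layer is TypeScript, so this Python version only illustrates the pattern; the table layout and the `list_sessions` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [("s1", "alice@example.com", "Draft"), ("s2", "bob@example.com", "Notes")],
)


def list_sessions(email: str) -> list:
    """Return (id, name) rows belonging to one account."""
    # The value is bound to the ? placeholder, never spliced into the SQL text.
    rows = conn.execute("SELECT id, name FROM sessions WHERE email = ?", (email,))
    return rows.fetchall()
```

An input such as `' OR '1'='1` is treated as a literal email value, matches no row, and returns an empty list instead of every user's sessions.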

Rate limits

Chat / completions / responses / messages / embeddings / rerank: 60 req/min per user.
Image generation: 10 req/min per user.
Auth endpoints: 5–10 attempts per 5–10 min per IP.
Per-API-key RPM limits can be set at key creation.
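A client can stay under the 60 req/min limit by throttling itself rather than waiting for a 429. The sketch below is one possible sliding-window limiter; whether the server counts a sliding or a fixed window is not stated here, so treat the window model as an assumption. The clock is injectable so the class can be exercised without real delays.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Tracks request timestamps and reports how long to wait before the next one."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.sent = deque()

    def wait_time(self) -> float:
        """Seconds until another request fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self.sent.append(self.clock())
```

A typical loop would call `time.sleep(limiter.wait_time())`, send the request, then call `limiter.record()`.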

Error codes

All errors are returned as {error: string} JSON (or the OpenAI shape {error: {message, code, type}} for chat completions). Upstream provider names and ids are scrubbed before they reach the client.

Status	Meaning	Example
400	Bad request — missing or invalid field	{"error":"provide either `pack` or `usd`"}
401	Missing or invalid auth token	`{"error":"unauthorized"}`
402	Out of credits — top up at the dashboard	`{"error":"out of credits - please top up at the dashboard"}`
403	Forbidden — not the owner / not a team member	`{"error":"forbidden"}`, `{"error":"you are not a member of this team"}`
404	Unknown endpoint or resource	`{"error":"not found"}`
405	Method not allowed	empty body
415	Unsupported content type	`{"error":"unsupported content type"}`
429	Rate limit exceeded — retry in a minute	`{"error":"rate limit exceeded — try again in a minute"}`
500	Internal server error (scrubbed)	Renders `500.html`
501	Feature not configured (image gen, AI upstream)	`{"error":"AI not configured"}`
503	Service unavailable (Stripe webhook not configured, etc.)	`{"error":"webhook not configured"}`
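Because errors arrive in two JSON shapes, client code usually wants one helper that handles both. A minimal sketch, with `error_message` as an illustrative name:

```python
def error_message(body: dict) -> str:
    """Extract a human-readable message from either error shape."""
    err = body.get("error")
    if isinstance(err, dict):   # OpenAI shape: {"error": {"message", "code", "type"}}
        return str(err.get("message", ""))
    if isinstance(err, str):    # plain shape: {"error": "..."}
        return err
    return ""
```

Note that 405 responses have an empty body and 500 renders an HTML page, so check the status code before trying to parse the body as JSON.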

Reliability tips

Stream responses (stream: true) so a network blip doesn't lose the whole answer.
Retry on 429 with exponential backoff (1s, 2s, 4s).
Handle 402 by redirecting the user to the dashboard to top up.
On 401, clear the local token and prompt for sign-in.
For long-running batch work, poll until done and cap your max_tokens per request.
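The 429 retry tip can be sketched as a small wrapper. `RateLimited` is a hypothetical exception that your own request function would raise on a 429 response; the delays follow the 1s, 2s, 4s schedule above, and `sleep` is injectable so the logic can be tested without waiting.

```python
import time


class RateLimited(Exception):
    """Raised by the caller's request function when the server answers 429."""


def call_with_backoff(request, delays=(1, 2, 4), sleep=time.sleep):
    """Call request(), sleeping through the delay schedule after each 429."""
    for delay in delays:
        try:
            return request()
        except RateLimited:
            sleep(delay)
    return request()  # final attempt; a RateLimited raised here propagates
```

With the OpenAI Python SDK, the wrapper could catch `openai.RateLimitError` instead. The SDK client also accepts a `max_retries` setting, which may be enough on its own for simple cases.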

Pricing

Pay-as-you-go. No subscription. Credits never expire. Per-model rates — you only pay for what each model actually costs.

Per-model rates (per 1K tokens, ordered cheapest → premium)

Model	Input (cr/1K)	Output (cr/1K)	USD / 1M in	USD / 1M out	Notes
NLC ShipLow 1.0 BUDGET	0.30	0.60	$0.30	$0.60	Six-lane fusion · vision · image gen · web vision
NLC Fast 1.0	0.45	0.90	$0.45	$0.90	Cheap single-model, 1M context
NLC Vision 1.0	1.80	5.50	$1.80	$5.50	Multimodal, 256K context
NLC PRO 1.0	3.20	9.80	$3.20	$9.80	Flagship reasoning + code, 1M context
NLC ULTRA MAX 1.0 PREMIUM	3.50	11.00	$3.50	$11.00	Premium multi-agent router
NLC Embed 1.0	0.20	—	$0.20	—	Embeddings for RAG
NLC Rerank 1.0	0.20	—	$0.20	—	Document reranking
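As a worked example of the table, a request's cost can be estimated from its token counts. The sketch below copies the chat-model rates from the table and uses the 1 credit = $0.001 conversion; how the server rounds fractional credits is not specified, so the result is an estimate.

```python
# (input, output) credits per 1K tokens, copied from the table above.
RATES = {
    "nlc-shiplow": (0.30, 0.60),
    "nlc-fast": (0.45, 0.90),
    "nlc-vision": (1.80, 5.50),
    "nlc-pro": (3.20, 9.80),
    "nlc-ultra": (3.50, 11.00),
}


def estimate_credits(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated credits for one request, before any server-side rounding."""
    rate_in, rate_out = RATES[model]
    return input_tokens / 1000 * rate_in + output_tokens / 1000 * rate_out


def credits_to_usd(credits: float) -> float:
    return credits * 0.001  # 1 credit = $0.001
```

For instance, 10,000 input tokens and 2,000 output tokens on nlc-pro come to 10 × 3.20 + 2 × 9.80 = 51.6 credits, or about $0.05.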

Image generation — free when used with nlc-shiplow; for other models a flat per-image rate applies (see dashboard). Anthropic protocol — shipped at /v1/messages with every chat model. Voice (ASR/TTS) — not shipped yet.

Top-up — custom amount

You're not forced to pick a fixed pack. Enter any amount ≥ $5 in the dashboard and pay exactly that. Bonus credits kick in for larger amounts.

Amount	Base credits	Bonus	Total credits
$5	5,000	—	5,000
$10	10,000	+10%	11,000
$25	25,000	+20%	30,000
$50	50,000	+30%	65,000
$100	100,000	+40%	140,000
any amount ≥ $5	Base + tier-based bonus — see the dashboard for a live quote before paying.

1 credit = $0.001. New accounts get 100 free credits. Larger amounts earn more credits per dollar (10% at $10+, 20% at $25+, 30% at $50+, 40% at $100+).
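The bonus schedule can be expressed as a short function. This sketch assumes the bonus applies to the whole amount at the highest tier reached and that fractional bonus credits are rounded down; use the dashboard's live quote (or /billing/quote) for the authoritative figure.

```python
# (minimum USD, bonus percent), highest tier first.
BONUS_TIERS = [(100, 40), (50, 30), (25, 20), (10, 10)]


def topup_credits(usd: float) -> int:
    """Total credits for a top-up, including the tier bonus."""
    if usd < 5:
        raise ValueError("minimum top-up is $5")
    base = round(usd * 1000)  # 1 credit = $0.001
    bonus_pct = next((pct for floor, pct in BONUS_TIERS if usd >= floor), 0)
    return base + base * bonus_pct // 100
```

Under these assumptions the function reproduces every row of the table, for example `topup_credits(25)` gives 30,000 credits.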