§06 / models

Supported models in Hosted mode

The plugin model names you can use with a Hosted short code, the upstream provider behind each one, and the per-token price you'll be charged.

§01

Available plugin models

Each row is a plugin model name your client can send (in your Anthropic-compatible request body). Hosted short codes ship with these mappings pre-populated, so the first request just works. Per-token prices are what reAPI bills your wallet on Hosted requests.

Plugin modelUpstreamContextInput / MCached / MOutput / M
claude-opus-4-5legacy
DeepSeek V4 Prodeepseek · deepseek-v4-pro
1M$1.71$0.01$3.43
claude-opus-4-6legacy
DeepSeek V4 Prodeepseek · deepseek-v4-pro
1M$1.71$0.01$3.43
claude-opus-4-7
DeepSeek V4 Prodeepseek · deepseek-v4-pro
1M$1.71$0.01$3.43
claude-sonnet-4-5legacy
DeepSeek V4 Flashdeepseek · deepseek-v4-flash
1M$0.14$0.00$0.29
claude-sonnet-4-6
DeepSeek V4 Flashdeepseek · deepseek-v4-flash
1M$0.14$0.00$0.29

Prices are per 1 million tokens. Cached input applies when an upstream cache hit is reported on the request — typical Office workloads have naturally high cache hit rates. Empty (—) means the model does not support cache pricing.

§02

How pricing works

reAPI bills your wallet for each Hosted request. The bill = input_tokens × input_rate + cached_input_tokens × cached_rate + output_tokens × output_rate, in credits. Wallet balance ($1 = 10,000 credits, ¥1 ≈ 1,471 credits) is debited atomically after the upstream completes.

Cached input tokens are billed at the cached rate when the upstream returns a cache hit count. Anthropic-compatible upstreams (DeepSeek, Qwen) report cache_read_input_tokens on streaming responses; non-streaming responses produce no usage telemetry and are billed as if uncached.