Supported models in Hosted mode
The plugin model names you can use with a Hosted short code, the upstream provider behind each one, and the per-token price you'll be charged.
Available plugin models
Each row is a plugin model name your client can send (in your Anthropic-compatible request body). Hosted short codes ship with these mappings pre-populated, so the first request just works. Per-token prices are what reAPI bills your wallet on Hosted requests.
| Plugin model | Upstream | Context | Input / M | Cached / M | Output / M |
|---|---|---|---|---|---|
claude-opus-4-5legacy | DeepSeek V4 Prodeepseek · deepseek-v4-pro | 1M | $1.71 | $0.01 | $3.43 |
claude-opus-4-6legacy | DeepSeek V4 Prodeepseek · deepseek-v4-pro | 1M | $1.71 | $0.01 | $3.43 |
claude-opus-4-7 | DeepSeek V4 Prodeepseek · deepseek-v4-pro | 1M | $1.71 | $0.01 | $3.43 |
claude-sonnet-4-5legacy | DeepSeek V4 Flashdeepseek · deepseek-v4-flash | 1M | $0.14 | $0.00 | $0.29 |
claude-sonnet-4-6 | DeepSeek V4 Flashdeepseek · deepseek-v4-flash | 1M | $0.14 | $0.00 | $0.29 |
Prices are per 1 million tokens. Cached input applies when an upstream cache hit is reported on the request — typical Office workloads have naturally high cache hit rates. Empty (—) means the model does not support cache pricing.
How pricing works
reAPI bills your wallet for each Hosted request. The bill = input_tokens × input_rate + cached_input_tokens × cached_rate + output_tokens × output_rate, in credits. Wallet balance ($1 = 10,000 credits, ¥1 ≈ 1,471 credits) is debited atomically after the upstream completes.
Cached input tokens are billed at the cached rate when the upstream returns a cache hit count. Anthropic-compatible upstreams (DeepSeek, Qwen) report cache_read_input_tokens on streaming responses; non-streaming responses produce no usage telemetry and are billed as if uncached.