Supported models in Hosted mode
The plugin model names you can use with a Hosted short code, the upstream provider behind each one, and the per-token price you'll be charged.
Available plugin models
Each row is a plugin model name your client can send (in your Anthropic-compatible request body). Hosted short codes ship with these mappings pre-populated, so the first request just works. Per-token prices are what reAPI bills your wallet on Hosted requests.
The table lists all active provider models. Prices are quoted per 1M tokens, in your wallet's currency.
| Upstream | Context | Input / 1M | Cached / 1M | Output / 1M | Vision |
|---|---|---|---|---|---|
| Qwen3.6 Max (Preview) `aliyun/qwen3.6-max-preview` | 256K | $2.14/M | $2.14/M | $12.86/M | No |
| Qwen3.6 Plus `aliyun/qwen3.6-plus` | 1M | $1.14/M | $1.14/M | $6.86/M | Yes |
| Qwen3.7 Max `aliyun/qwen3.7-max` | 1M | $1.71/M | $1.71/M | $5.14/M | No |
| DeepSeek V4 Flash `deepseek/deepseek-v4-flash` | 1M | $0.14/M | $0.00/M | $0.29/M | No |
| DeepSeek V4 Pro `deepseek/deepseek-v4-pro` | 1M | $1.71/M | $0.01/M | $3.43/M | No |
| GLM 4.7 `glm/glm-4.7` | 200K | $0.57/M | $0.11/M | $2.29/M | No |
| GLM 5 `glm/glm-5` | 200K | $0.86/M | $0.21/M | $3.14/M | No |
| GLM 5 Turbo `glm/glm-5-turbo` | 200K | $1.00/M | $0.26/M | $3.71/M | No |
| GLM 5.1 `glm/glm-5.1` | 200K | $1.14/M | $0.29/M | $4.00/M | No |
| MiMo V2 Flash `mimo/mimo-v2-flash` | 256K | $0.10/M | $0.10/M | $0.30/M | No |
| MiMo V2 Omni `mimo/mimo-v2-omni` | 256K | $0.40/M | $0.40/M | $2.00/M | Yes |
| MiMo V2 Pro `mimo/mimo-v2-pro` | 256K | $1.00/M | $1.00/M | $3.00/M | No |
| MiMo V2.5 `mimo/mimo-v2.5` | 1M | $1.00/M | $1.00/M | $3.00/M | Yes |
| MiMo V2.5 Pro `mimo/mimo-v2.5-pro` | 1M | $1.00/M | $1.00/M | $3.00/M | No |
| GPT-5.2 `openai/gpt-5.2` | 200K | $1.75/M | $0.17/M | $14.00/M | Yes |
| GPT-5.3 Codex `openai/gpt-5.3-codex` | 200K | $1.75/M | $0.17/M | $14.00/M | Yes |
| GPT-5.4 `openai/gpt-5.4` | 400K | $2.50/M | $0.25/M | $15.00/M | Yes |
| GPT-5.4 Mini `openai/gpt-5.4-mini` | 200K | $0.75/M | $0.07/M | $4.50/M | Yes |
| GPT-5.5 `openai/gpt-5.5` | 400K | $5.00/M | $0.50/M | $30.00/M | Yes |
Prices are per 1 million tokens. The cached input price applies when the upstream reports a cache hit on the request; typical Office workloads have naturally high cache hit rates. An empty cell (—) means the model does not support cache pricing.
How pricing works
reAPI bills your wallet for each Hosted request: bill = input_tokens × input_rate + cached_input_tokens × cached_rate + output_tokens × output_rate. Each rate is a per-token price in your wallet's currency, derived from the catalog on the Mappings page; because the catalog quotes prices per 1M tokens, the per-token rate is the listed price divided by 1,000,000. Your balance is debited atomically after the upstream request completes.
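To make the formula concrete, here is a minimal sketch of the arithmetic in Python. The function name `hosted_bill` and its parameters are hypothetical, not part of reAPI. It assumes that prices are passed as the per-1M figures from the table above, that `input_tokens` does not include the cached tokens (the formula lists them as separate terms), and that no rounding is applied; how reAPI actually rounds the debit is not stated here.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def hosted_bill(input_tokens, cached_input_tokens, output_tokens,
                input_per_m, cached_per_m, output_per_m):
    """Return the bill for one request, in the wallet's currency.

    Prices are the per-1M-token figures from the catalog, passed as
    strings (e.g. "0.75") so Decimal keeps them exact. Illustrative only.
    """
    def per_token(price_per_m):
        # Convert a per-1M-token catalog price into a per-token rate.
        return Decimal(price_per_m) / TOKENS_PER_MILLION

    return (input_tokens * per_token(input_per_m)
            + cached_input_tokens * per_token(cached_per_m)
            + output_tokens * per_token(output_per_m))


# GPT-5.4 Mini prices from the table: $0.75 input, $0.07 cached, $4.50 output.
bill = hosted_bill(10_000, 40_000, 2_000, "0.75", "0.07", "4.50")
```

With these numbers the three terms are $0.0075, $0.0028, and $0.0090, so the request would cost $0.0193 under the stated assumptions. `Decimal` is used instead of floats to avoid binary rounding error when summing many small charges.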
Cached input tokens are billed at the cached rate when the upstream returns a cache hit count. Anthropic-compatible upstreams report cache_read_input_tokens on streaming responses; non-streaming responses produce no usage telemetry and are not billed (free ride, matching the Your-key mode policy for requests without usage data).
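As an illustration of where those counts come from, the sketch below collects usage from a stream of already-parsed events. It assumes the upstream follows the Anthropic Messages streaming shape, in which a `message_start` event carries `message.usage` and later `message_delta` events carry an updated `usage` object. The function name `usage_from_stream` and the returned key names are hypothetical; this is not reAPI's implementation.

```python
from typing import Iterable, Optional


def usage_from_stream(events: Iterable[dict]) -> Optional[dict]:
    """Collect token counts from parsed streaming events.

    Returns None when no usage data was seen, which corresponds to the
    "no usage telemetry, not billed" case described above.
    """
    usage: dict = {}
    for event in events:
        kind = event.get("type")
        if kind == "message_start":
            # Initial usage: input and cache-read counts arrive here.
            usage.update(event.get("message", {}).get("usage") or {})
        elif kind == "message_delta":
            # Later events report cumulative output_tokens; newest wins.
            usage.update(event.get("usage") or {})
    if not usage:
        return None
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "cached_input_tokens": usage.get("cache_read_input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }


sample_events = [
    {"type": "message_start",
     "message": {"usage": {"input_tokens": 120,
                           "cache_read_input_tokens": 3000,
                           "output_tokens": 1}}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hi"}},
    {"type": "message_delta", "usage": {"output_tokens": 42}},
]
totals = usage_from_stream(sample_events)
```

The three counts returned here are the token inputs to the billing formula. A response with no usage events yields `None`, so a caller can skip the debit in that case.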