Supported models in Hosted mode

The plugin model names you can use with a Hosted short code, the upstream provider behind each one, and the per-token price you'll be charged.

§01

Available plugin models

Each row is a plugin model name your client can send (in your Anthropic-compatible request body). Hosted short codes ship with these mappings pre-populated, so the first request just works. Per-token prices are what reAPI bills your wallet on Hosted requests.

Pricing catalog

All active provider models. Prices are shown per 1M tokens, in your wallet's currency.

| Upstream | Upstream ID | Context | Input / 1M | Cached / 1M | Output / 1M | Vision |
|---|---|---|---|---|---|---|
| Qwen3.6 Max (Preview) | aliyun/qwen3.6-max-preview | 256K | $2.14 | $2.14 | $12.86 | No |
| Qwen3.6 Plus | aliyun/qwen3.6-plus | 1M | $1.14 | $1.14 | $6.86 | Yes |
| Qwen3.7 Max | aliyun/qwen3.7-max | 1M | $1.71 | $1.71 | $5.14 | No |
| DeepSeek V4 Flash | deepseek/deepseek-v4-flash | 1M | $0.14 | $0.00 | $0.29 | No |
| DeepSeek V4 Pro | deepseek/deepseek-v4-pro | 1M | $1.71 | $0.01 | $3.43 | No |
| GLM 4.7 | glm/glm-4.7 | 200K | $0.57 | $0.11 | $2.29 | No |
| GLM 5 | glm/glm-5 | 200K | $0.86 | $0.21 | $3.14 | No |
| GLM 5 Turbo | glm/glm-5-turbo | 200K | $1.00 | $0.26 | $3.71 | No |
| GLM 5.1 | glm/glm-5.1 | 200K | $1.14 | $0.29 | $4.00 | No |
| MiMo V2 Flash | mimo/mimo-v2-flash | 256K | $0.10 | $0.10 | $0.30 | No |
| MiMo V2 Omni | mimo/mimo-v2-omni | 256K | $0.40 | $0.40 | $2.00 | Yes |
| MiMo V2 Pro | mimo/mimo-v2-pro | 256K | $1.00 | $1.00 | $3.00 | No |
| MiMo V2.5 | mimo/mimo-v2.5 | 1M | $1.00 | $1.00 | $3.00 | Yes |
| MiMo V2.5 Pro | mimo/mimo-v2.5-pro | 1M | $1.00 | $1.00 | $3.00 | No |
| GPT-5.2 | openai/gpt-5.2 | 200K | $1.75 | $0.17 | $14.00 | Yes |
| GPT-5.3 Codex | openai/gpt-5.3-codex | 200K | $1.75 | $0.17 | $14.00 | Yes |
| GPT-5.4 | openai/gpt-5.4 | 400K | $2.50 | $0.25 | $15.00 | Yes |
| GPT-5.4 Mini | openai/gpt-5.4-mini | 200K | $0.75 | $0.07 | $4.50 | Yes |
| GPT-5.5 | openai/gpt-5.5 | 400K | $5.00 | $0.50 | $30.00 | Yes |

Prices are per 1 million tokens. Cached input pricing applies when the upstream reports a cache hit on the request; typical Office workloads have naturally high cache hit rates. An empty cell (—) means the model does not support cache pricing.

§02

How pricing works

reAPI bills your wallet for each Hosted request. The bill is computed as: bill = input_tokens × input_rate + cached_input_tokens × cached_rate + output_tokens × output_rate. Each rate is the catalog price shown on the Mappings page, which is listed per 1M tokens, divided by 1,000,000, in your wallet's currency. Your balance is debited atomically after the upstream request completes.
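To make the formula concrete, here is a minimal sketch of the arithmetic in Python. The function name `hosted_bill` and its parameters are hypothetical and not part of reAPI. The sketch assumes that catalog prices are quoted per 1M tokens and that `input_tokens` does not already include the cached tokens; how reAPI actually rounds the result is not specified here, so no rounding is applied.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def hosted_bill(input_tokens, cached_input_tokens, output_tokens,
                input_per_m, cached_per_m, output_per_m):
    """Estimate the bill for one Hosted request, in wallet currency.

    Hypothetical helper: prices are catalog values per 1M tokens, passed
    as strings so Decimal keeps them exact.
    """
    # Convert each per-1M catalog price into a per-token rate.
    input_rate = Decimal(input_per_m) / TOKENS_PER_MILLION
    cached_rate = Decimal(cached_per_m) / TOKENS_PER_MILLION
    output_rate = Decimal(output_per_m) / TOKENS_PER_MILLION

    # Assumption: input_tokens excludes the tokens served from cache.
    return (input_tokens * input_rate
            + cached_input_tokens * cached_rate
            + output_tokens * output_rate)


# Example with the GPT-5.4 catalog prices ($2.50 / $0.25 / $15.00 per 1M).
example = hosted_bill(10_000, 40_000, 2_000, "2.50", "0.25", "15.00")
```

With these inputs the three terms are 0.025, 0.01 and 0.03, so `example` equals 0.065 in the wallet's currency. Decimal is used rather than float so that small per-token rates do not accumulate binary rounding error.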

Cached input tokens are billed at the cached rate when the upstream returns a cache hit count. Anthropic-compatible upstreams report cache_read_input_tokens on streaming responses. Non-streaming responses produce no usage telemetry and are not billed: a free ride, matching the Your-key mode policy for requests without usage data.
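The rule above can be sketched as a small usage collector. This is an illustration only, not reAPI's implementation: `collect_usage` and `billable_tokens` are hypothetical names, and the sketch assumes the stream has already been parsed into dictionaries shaped like Anthropic streaming events, where `message_start` carries `message.usage` and `message_delta` carries `usage`.

```python
def collect_usage(events):
    """Merge usage counters from parsed streaming events.

    Returns None when no event reported usage (hypothetical helper).
    """
    usage = None
    for event in events:
        if event.get("type") == "message_start":
            reported = event.get("message", {}).get("usage")
        elif event.get("type") == "message_delta":
            reported = event.get("usage")
        else:
            reported = None
        if reported:
            # Later events overwrite earlier counters; null values are skipped.
            fresh = {k: v for k, v in reported.items() if v is not None}
            usage = {**(usage or {}), **fresh}
    return usage


def billable_tokens(usage):
    """Map upstream usage to the three billed counters, or None if unbilled."""
    if usage is None:
        return None  # no usage telemetry: the request is not billed
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "cached_input_tokens": usage.get("cache_read_input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }


events = [
    {"type": "message_start", "message": {"usage": {
        "input_tokens": 1200, "cache_read_input_tokens": 8000, "output_tokens": 1}}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "hi"}},
    {"type": "message_delta", "delta": {"stop_reason": "end_turn"},
     "usage": {"output_tokens": 350}},
]
billed = billable_tokens(collect_usage(events))
```

Returning None rather than zeros keeps "no telemetry" distinct from "zero tokens used", which mirrors the free-ride policy described above.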