Supported models in Hosted mode

The plugin model names you can use with a Hosted short code, the upstream provider behind each one, and the per-token price you'll be charged.

§01

Available plugin models

Each row is a plugin model name your client can send (in your Anthropic-compatible request body). Hosted short codes ship with these mappings pre-populated, so the first request just works. Per-token prices are what reAPI bills your wallet on Hosted requests.

Pricing catalog

All active provider models. Per-token prices are shown per 1M tokens, in your wallet's currency.

Upstream	Context	Input / 1M	Cached / 1M	Output / 1M	Vision
Qwen3.6 Max (Preview)aliyun/qwen3.6-max-preview	256K	$2.14/M	$2.14/M	$12.86/M	No
Qwen3.6 Plusaliyun/qwen3.6-plus	1M	$1.14/M	$1.14/M	$6.86/M	Yes
Qwen3.7 Maxaliyun/qwen3.7-max	1M	$1.71/M	$1.71/M	$5.14/M	No
DeepSeek V4 Flashdeepseek/deepseek-v4-flash	1M	$0.14/M	$0.00/M	$0.29/M	No
DeepSeek V4 Prodeepseek/deepseek-v4-pro	1M	$1.71/M	$0.01/M	$3.43/M	No
GLM 4.7glm/glm-4.7	200K	$0.57/M	$0.11/M	$2.29/M	No
GLM 5glm/glm-5	200K	$0.86/M	$0.21/M	$3.14/M	No
GLM 5 Turboglm/glm-5-turbo	200K	$1.00/M	$0.26/M	$3.71/M	No
GLM 5.1glm/glm-5.1	200K	$1.14/M	$0.29/M	$4.00/M	No
MiMo V2 Flashmimo/mimo-v2-flash	256K	$0.10/M	$0.10/M	$0.30/M	No
MiMo V2 Omnimimo/mimo-v2-omni	256K	$0.40/M	$0.40/M	$2.00/M	Yes
MiMo V2 Promimo/mimo-v2-pro	256K	$1.00/M	$1.00/M	$3.00/M	No
MiMo V2.5mimo/mimo-v2.5	1M	$1.00/M	$1.00/M	$3.00/M	Yes
MiMo V2.5 Promimo/mimo-v2.5-pro	1M	$1.00/M	$1.00/M	$3.00/M	No
GPT-5.2openai/gpt-5.2	200K	$1.75/M	$0.17/M	$14.00/M	Yes
GPT-5.3 Codexopenai/gpt-5.3-codex	200K	$1.75/M	$0.17/M	$14.00/M	Yes
GPT-5.4openai/gpt-5.4	400K	$2.50/M	$0.25/M	$15.00/M	Yes
GPT-5.4 Miniopenai/gpt-5.4-mini	200K	$0.75/M	$0.07/M	$4.50/M	Yes
GPT-5.5openai/gpt-5.5	400K	$5.00/M	$0.50/M	$30.00/M	Yes

Prices are per 1 million tokens. Cached input applies when an upstream cache hit is reported on the request — typical Office workloads have naturally high cache hit rates. Empty (—) means the model does not support cache pricing.

§02

How pricing works

reAPI bills your wallet for each Hosted request. The bill = input_tokens × input_rate + cached_input_tokens × cached_rate + output_tokens × output_rate. Each rate is the per-token price shown in the catalog on the Mappings page, in your wallet's currency. Your balance is debited atomically after the upstream completes.

Cached input tokens are billed at the cached rate when the upstream returns a cache hit count. Anthropic-compatible upstreams report cache_read_input_tokens on streaming responses; non-streaming responses produce no usage telemetry and are not billed (free ride, matching the Your-key mode policy for requests without usage data).

Available plugin models

How pricing works

Related docs