Qwen3.5-Flash

Active

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

Updated 4 days ago. Structured data from Modeldex catalog.

Tags: Vision, Long context, Budget

API release: Feb 25, 2026 (last month)

Not enough benchmark coverage yet for an Intelligence Index — needs at least 3 results across 2 categories.

Overview

History

Qwen3.5-Flash became available via the Alibaba API on 2026-02-25.

Training & availability

Alibaba has not released the underlying model weights — access is via their hosted API only.

Capabilities

Context window: 1.0M tokens.
Input modalities: text, image, video.

Recommended for: vision, long-context, and budget-sensitive workloads.
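
Since image input is listed, a request that includes an image might be assembled as in the sketch below. It assumes the OpenAI-style content-part format (`"type": "image_url"`) used by the OpenAI SDK for chat completions; this catalog entry does not confirm that the endpoint accepts that format for this model. `build_vision_messages` is a hypothetical helper and the image URL is a placeholder.

```python
def build_vision_messages(prompt: str, image_url: str) -> list[dict]:
    """Build a single-turn chat payload with one text part and one image part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Assumed OpenAI-style image part; not confirmed for this model.
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


messages = build_vision_messages(
    "Describe this chart in two sentences.",
    "https://example.com/chart.png",  # placeholder URL
)
```

The resulting `messages` list would be passed as the `messages` argument of `client.chat.completions.create(...)`, in place of the text-only message in the Quick start section.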

Pricing

Input: $0.0650 per 1M tokens
Output: $0.2600 per 1M tokens

Use the cost calculator below to estimate monthly spend for your workload.

Quick start

Minimal example using the OpenRouter API through the OpenAI Python SDK. Copy, paste, and replace the key.

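from openai import OpenAI

# Point the OpenAI SDK at OpenRouter's endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # replace with your OpenRouter API key
)

# Send a single-turn chat request to Qwen3.5-Flash.
resp = client.chat.completions.create(
    model="alibaba/qwen3-5-flash-02-23",
    messages=[{"role": "user", "content": "Explain quantum computing in one sentence."}],
)

# Print the text of the first returned completion.
print(resp.choices[0].message.content)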

Cost calculator

Estimate your monthly bill. Presets are typical workload sizes.

Input tokens / month: 5.0M @ $0.065/1M
Output tokens / month: 2.0M @ $0.26/1M

Input cost: $0.325 (5.0M × $0.065/1M)
Output cost: $0.52 (2.0M × $0.26/1M)

Total / month: $0.845 ($10.14 / year)
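
The same arithmetic can be reproduced offline. The sketch below hard-codes the per-1M-token prices from the Pricing section and the sample volumes shown above; `monthly_cost` is a hypothetical helper name, and the prices should be updated if the listing changes.

```python
# Prices in USD per 1M tokens, copied from the Pricing section.
INPUT_PRICE_PER_M = 0.065
OUTPUT_PRICE_PER_M = 0.26


def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated monthly cost in USD for the given token volumes."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Sample workload from the calculator: 5.0M input and 2.0M output tokens.
cost = monthly_cost(5_000_000, 2_000_000)
print(f"${cost:.3f} / month, ${cost * 12:.2f} / year")
# prints: $0.845 / month, $10.14 / year
```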

Integrations & tooling support

Tool calling: Not supported
Structured outputs: Not supported
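
With structured outputs listed as unsupported, one common workaround is to request JSON in the prompt and parse the reply defensively. The sketch below shows how a caller might do that; it is an assumption about client-side handling, not a documented feature of the model. `extract_json` is a hypothetical helper and the sample reply is invented for illustration.

```python
import json


def extract_json(reply: str):
    """Decode the JSON value that starts at the first '{' in a free-text reply.

    Returns the decoded object, or None if there is no '{' or decoding fails.
    """
    start = reply.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and tolerates trailing text.
        obj, _end = json.JSONDecoder().raw_decode(reply[start:])
    except json.JSONDecodeError:
        return None
    return obj


# Invented example of a reply that wraps JSON in extra prose.
sample_reply = 'Sure, here it is: {"sentiment": "positive", "score": 0.9} Hope that helps.'
print(extract_json(sample_reply))
```

Returning `None` on failure lets the caller decide whether to retry the request or fall back to treating the reply as plain text.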

Price vs quality

Budget pricing

Priced low — good for high-volume tasks. Quality tier pending more benchmark coverage.