Phi-4 Reasoning Vision
15B-parameter multimodal reasoning model with image understanding.
Overview
Phi-4 Reasoning Vision adds image understanding to the Phi-4 Reasoning model via a SigLIP-2 vision encoder, enabling visual reasoning over charts, diagrams, and screenshots. Released March 2026.
History
Phi-4 Reasoning Vision was released on 2026-03-04.
Training & availability
Weights are publicly available under the MIT license, making this an open-weight model suitable for on-prem deployment and fine-tuning.
Capabilities
- Context window: 32K tokens.
- Max output: 16K tokens.
- Input modalities: text, image.
Recommended for: vision, open-source.
Limitations
- The context window (32K tokens) is modest by 2026 standards, making it unsuitable for processing long documents in a single request.
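Given the 32K-token limit above, it can help to pre-check prompt size before sending a request. The sketch below uses a rough 4-characters-per-token heuristic, not the model's actual tokenizer, and the helper names (`estimate_tokens`, `fits_context`) are illustrative, not part of any API.

```python
# Rough pre-flight check against the 32K-token context window.
# The 4-chars-per-token rule is a coarse English-text approximation,
# not the model's real tokenizer.

CONTEXT_WINDOW = 32_000   # tokens, per the model card
MAX_OUTPUT = 16_000       # tokens reserved for the model's reply

def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def fits_context(prompt: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the prompt likely fits alongside the reserved output budget."""
    return estimate_tokens(prompt) + reserved_output <= CONTEXT_WINDOW

print(fits_context("Explain quantum computing in one sentence."))  # True
print(fits_context("x" * 200_000))  # ~50K estimated tokens: False
```

For production use, replace the heuristic with a real tokenizer count and leave some headroom for the chat template.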
Quick start
Minimal example using the OpenRouter API; copy, paste, and replace the API key with your own.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
)
resp = client.chat.completions.create(
    model="microsoft/phi-4-reasoning-vision",
    messages=[{"role": "user", "content": "Explain quantum computing in one sentence."}],
)
print(resp.choices[0].message.content)
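The quick start sends text only. Since this is a vision model, an image request uses OpenAI-style multimodal content parts (a text part plus an `image_url` part) in the same `messages` field. The helper name and image URL below are illustrative assumptions, and the API call only runs when a key is present in the environment.

```python
import os

def build_vision_messages(question: str, image_url: str) -> list:
    """Assemble an OpenAI-style multimodal message: text plus an image part."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]

messages = build_vision_messages(
    "What trend does this chart show?",
    "https://example.com/chart.png",  # placeholder image URL
)

# Only call the API when a key is configured.
if os.environ.get("OPENROUTER_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
    resp = client.chat.completions.create(
        model="microsoft/phi-4-reasoning-vision",
        messages=messages,
    )
    print(resp.choices[0].message.content)
```

Providers typically also accept base64-encoded `data:` URLs in the `image_url` field for local files.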
Integrations & tooling support
- Tool calling: not supported
- Structured outputs: not supported
Price vs quality
Priced low, making it a good fit for high-volume tasks. Quality tier is pending broader benchmark coverage.
- Quality percentile: —
- Effective price: $0.244/1M tokens
- Pricing breakdown: $0.075/1M input, $0.30/1M output
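The listed per-token prices translate to per-request costs with simple arithmetic. A sketch, assuming the rates above; the function name is illustrative.

```python
# Cost estimate from the listed rates: $0.075 per 1M input tokens,
# $0.30 per 1M output tokens.

INPUT_PRICE_PER_M = 0.075   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.30   # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A maximal request: 32K tokens in, 16K tokens out.
print(round(request_cost(32_000, 16_000), 4))  # 0.0072
```

Even a full-context request costs well under a cent, which is what makes the model attractive for high-volume workloads.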