Qwen3-Max Thinking Outperforms Gemini 3 Pro and GPT-5.2 in Reasoning Exams

venturebeat.com

Qwen3-Max Thinking Outperforms Gemini 3 Pro and GPT-5.2 in Reasoning Exams

TL;DR

The new reasoning model Qwen3-Max Thinking, developed by Alibaba Cloud, promises to match and even surpass the capabilities of competing artificial intelligence models Gemini 3 Pro and GPT-5.2.

venturebeat.com•January 26, 2026•

3 min read

•0 views

Qwen3-Max Thinking Stands Out in the AI Market

The new reasoning model Qwen3-Max Thinking, developed by Alibaba Cloud, promises to match and even surpass the capabilities of competing artificial intelligence models Gemini 3 Pro and GPT-5.2. The presentation occurred at a strategic moment, as the company seeks to innovate in the field of language models with an accessible and efficient proposal.

This model was introduced by the Qwen Team, known for delivering robust open-source models. Alibaba Cloud received praise, even from Airbnb CEO Brian Chesky, who commended their solutions as cost-effective alternatives to American models.

The innovation of Qwen3-Max Thinking lies in its architecture, which combines efficiency with autonomy, rewriting the rules of traditional logical reasoning.

Architecture: Redefining the Test Scale

The main innovation of Qwen3-Max Thinking is a technique called Test-time scaling. Unlike models that generate responses linearly, this approach allows the model to trade computational power for intelligence, adopting a strategy of multiple iterations.

Through a unique "take-experience" mechanism, the model refines its knowledge based on previous experiences, allowing:

Identify Dead Ends: Recognize flaws in reasoning without fully traversing the path.
Focus Compute: Direct processing power toward unresolved uncertainties.

These improvements have resulted in significant performance leaps, as demonstrated in PhD-level science benchmarks.

Integration with Adaptive Tools

Qwen3-Max Thinking distinguishes itself through the integration of adaptive tools that allow the model to autonomously choose the correct tool for each task, combining logical thinking and practical functions.

The capabilities include:

Web Search and Extraction: For real-time factual inquiries.
Memory: Store and recall specific user contexts.
Code Interpreter: Write and execute snippets of Python.

Benchmark Analysis: Facts and Results

The performance of Qwen3-Max Thinking in rigorous benchmarks, such as HMMT, showed a score of 98.0, surpassing Gemini 3 Pro and other competitors.

Additionally, in the assessment "Humanity's Last Exam", which encompasses complex issues from different disciplines, the model achieved 49.8 points, outperforming Gemini 3 Pro and GPT-5.2.

The Cost of Reasoning: Price Analysis

Alibaba Cloud positioned qwen3-max-2026-01-23 as a premium yet affordable option, with a price of $1.20 per one million input tokens.

Compared to traditional models, this cost is competitive, offering top-tier performance at a reduced price.

Developer Ecosystem

Qwen3-Max Thinking is designed for easy integration, compatible with formats from OpenAI and Anthropic, allowing developers to seamlessly incorporate this new model into their applications.

Final Considerations

The launch of Qwen3-Max Thinking marks an evolution in the AI market, focusing more on reasoning skills and autonomous tool use than merely on intelligent chatbots. With a competitive pricing model, Alibaba Cloud establishes itself as a serious contender.

The provision of free tools for a limited time encourages developers to explore the new capabilities, further intensifying the competition in the AI space.

Content selected and edited with AI assistance. Original sources referenced above.

Qwen3-Max Thinking Outperforms Gemini 3 Pro and GPT-5.2 in Reasoning Exams

TL;DR

Qwen3-Max Thinking Stands Out in the AI Market

Architecture: Redefining the Test Scale

Integration with Adaptive Tools

Benchmark Analysis: Facts and Results

The Cost of Reasoning: Price Analysis

Developer Ecosystem

Final Considerations

Share

venturebeat.com

Enjoyed this article?

Comments

Write a comment

More in Artificial Intelligence

Introduces 'Observational Memory' and Reduces AI Costs by Up to 10x

Nvidia launches DreamDojo, AI model for training robots

Google Integrates Agentive Vision into Gemini 3 Flash