Alibaba Launches an Open-Access Challenger to OpenAI’s Reasoning Model

By Tanu Chahal

28/11/2024


A new AI model, QwQ-32B-Preview, developed by Alibaba’s Qwen team, has emerged as a significant contender to OpenAI’s o1 reasoning model. Unlike many models in the AI space, QwQ-32B-Preview is available for download under a permissive license, making it one of the few openly accessible models capable of rivaling OpenAI’s offering.

The QwQ-32B-Preview model contains 32.5 billion parameters and can handle prompts of up to approximately 32,000 words. In terms of performance, it outshines OpenAI’s o1-preview and o1-mini models on certain benchmarks, particularly AIME and MATH. AIME draws on the American Invitational Mathematics Examination, a challenging competition math exam, while MATH is a collection of word problems designed to assess mathematical reasoning skills. QwQ-32B-Preview excels at logic puzzles and complex math problems, thanks to its advanced reasoning capabilities. However, it is not without limitations. Alibaba notes that the model can sometimes switch languages unexpectedly, get stuck in repetitive loops, or struggle with tasks requiring common-sense reasoning.
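
For readers who want to experiment with the model themselves, here is a minimal sketch of how one might download and prompt it using the Hugging Face transformers library. The repo id `Qwen/QwQ-32B-Preview`, the chat-template usage, and the generation settings are assumptions based on the Qwen team’s usual conventions; consult the model card on Hugging Face for the exact instructions.

```python
# Minimal sketch: downloading and prompting QwQ-32B-Preview with transformers.
# The repo id and settings below are assumptions; check the official model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B-Preview"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # let transformers pick bf16/fp16 where supported
    device_map="auto",   # spread the 32.5B parameters across available GPUs
)

messages = [
    {"role": "user", "content": "How many positive integers below 100 are divisible by 3 or 5?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
# Print only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```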

One of the distinguishing features of reasoning models like QwQ-32B-Preview is their ability to effectively fact-check their own responses, which helps them avoid some of the errors that commonly trip up AI models. The trade-off is speed: the added layer of reasoning makes these models slower to deliver answers than conventional models. Much like OpenAI’s o1, QwQ-32B-Preview tackles reasoning tasks by planning ahead and executing a series of actions to derive answers.
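
To make that plan-then-verify pattern concrete, here is a generic two-pass prompting loop. This is only an illustrative sketch, not QwQ-32B-Preview’s actual internal procedure; the `generate` callable is a placeholder for any chat model (for example, a thin wrapper around the model loaded above).

```python
# Illustrative sketch of the "reason, then self-check" pattern described above.
# This is a generic prompting loop, not QwQ's internal mechanism.
from typing import Callable

def answer_with_self_check(question: str, generate: Callable[[str], str]) -> str:
    # Pass 1: ask the model to plan and work through the problem step by step.
    draft = generate(
        "Solve the following problem. Think step by step and show your work.\n\n"
        f"{question}"
    )
    # Pass 2: ask the model to re-examine its own draft before committing to an answer.
    verdict = generate(
        "Review the reasoning below for arithmetic or logical errors. "
        "If you find any, give a corrected final answer; otherwise restate the answer.\n\n"
        f"Problem: {question}\n\nDraft reasoning:\n{draft}"
    )
    return verdict
```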

QwQ-32B-Preview is hosted on the AI development platform Hugging Face and is part of a growing trend toward models that focus on reasoning. Like the recently released DeepSeek reasoning model, it tends to avoid certain political topics. This is partly due to regulatory pressure, as Alibaba and other Chinese companies must adhere to guidelines set by China’s internet regulator. For instance, when asked about Taiwan’s status, QwQ-32B-Preview answered in line with the Chinese government’s stance, describing the island as “inalienable” from China, a characterization at odds with how much of the world views it. Questions about sensitive topics such as the Tiananmen Square protests were likewise met with non-responses.

The QwQ-32B-Preview model is available under an Apache 2.0 license, meaning it can be used for commercial purposes. However, Alibaba has not released every component of the model, which prevents full replication and a deep understanding of its inner workings. That leaves QwQ-32B-Preview in a middle ground of openness: the model itself can be downloaded and used, but the full details behind it are not disclosed.

The rise of reasoning models like QwQ-32B-Preview comes at a time when traditional scaling laws in AI development—where increasing data and computing power result in better performance—are being questioned. Recent reports suggest that models from major AI labs, including OpenAI, Google, and Anthropic, are no longer showing the same dramatic improvements as they once did. As a result, there is increasing interest in alternative approaches to AI development, including test-time compute, which provides additional processing power during model inference to help reasoning models perform better. Google, for example, has expanded its team working on reasoning models and invested significantly in the necessary computing power.
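
One simple way to picture test-time compute is self-consistency sampling: draw several reasoning traces for the same question and keep the most common final answer, spending more computation at inference time in exchange for more reliable results. The sketch below is a toy illustration of that idea, with the `sample` and `extract_answer` callables standing in for a real model and answer parser; production systems use far more sophisticated inference-time search and verification.

```python
# Toy illustration of spending extra compute at inference time:
# sample several reasoning traces and take a majority vote over the answers.
from collections import Counter
from typing import Callable

def majority_vote_answer(
    question: str,
    sample: Callable[[str], str],          # returns one sampled completion per call
    extract_answer: Callable[[str], str],  # pulls the final answer out of a completion
    n_samples: int = 8,
) -> str:
    answers = [extract_answer(sample(question)) for _ in range(n_samples)]
    # More samples (more inference-time compute) generally make the vote more reliable.
    return Counter(answers).most_common(1)[0][0]
```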

This shift indicates that reasoning-based models and test-time compute could play a critical role in the future of AI development.