After months of speculation and anticipation, OpenAI has released the production version of its advanced reasoning model, Project Strawberry, which has been renamed “o1.” It is joined by a “mini” version (just as GPT-4o was) that offers faster, more responsive interactions at the cost of drawing on a smaller knowledge base.
It appears that o1 offers a mixed bag of technical advancements. It’s the first in OpenAI’s line of reasoning models designed to use humanlike deduction to answer complex questions in subjects such as science, coding, and math faster than humans can.
For example, during testing, o1 was fed a qualifying exam for the International Mathematical Olympiad. While its predecessor, GPT-4o, correctly solved only 13% of the problems presented, o1 got 83% of them right. In an online Codeforces competition, o1 scored in the 89th percentile. What’s more, o1 can respond to queries that stumped previous models (like, “which is bigger, 9.11 or 9.9?”). However, the company makes clear that this release is only a preview of the new model’s full capabilities.
The new o1 “has been trained using a completely new optimization algorithm and a new training dataset specifically tailored for it,” OpenAI’s research lead, Jerry Tworek, told The Verge. Using a combination of reinforcement learning and “chain of thought” reasoning, o1 reportedly returns more accurate inferences than its predecessor. “We have noticed that this model hallucinates less,” Tworek said. However, he added, “we can’t say we solved hallucinations.”
Both ChatGPT Plus and Team subscribers will be able to test out o1 and o1-mini beginning today. Enterprise and Edu subscribers should have access by next week.
The company says that o1-mini will eventually become available to free-tier users, though it did not specify a timeline. Developers will notice a steep increase in API pricing for o1 compared to GPT-4o. Access to o1 will cost $15 per million input tokens (compared to $5 per million for GPT-4o) and $60 per million output tokens, four times 4o’s $15 per million fee. The real question is whether the new model thinks the word “strawberry” contains two R’s or three.
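For developers weighing the price difference, the per-token rates quoted above can be turned into a rough per-request estimate. The sketch below is illustrative only: the `PRICES` table and `estimate_cost` helper are hypothetical names, not part of any OpenAI library, the token counts are made up, and the GPT-4o output rate of $15 per million is assumed from the four-to-one ratio with o1's $60.

```python
# Prices in US dollars per one million tokens, as quoted in the article.
# The GPT-4o output rate is an assumption derived from the stated 4x ratio.
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical request: 2,000 input tokens and 1,000 output tokens.
o1_cost = estimate_cost("o1", 2_000, 1_000)
gpt4o_cost = estimate_cost("gpt-4o", 2_000, 1_000)

print(f"o1:     ${o1_cost:.3f}")
print(f"GPT-4o: ${gpt4o_cost:.3f}")
print(f"o1 costs about {o1_cost / gpt4o_cost:.1f}x as much for this request")
```

Under these assumptions, the same request costs roughly nine cents on o1 versus two and a half cents on GPT-4o. Actual bills will depend on how many tokens a given prompt and response consume.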