
Gottagopestcontrol

Trusted News & Timely Insights

Sam Altman’s OpenAI takes another step towards AGI with the new “O1” model

The line between human and artificial intelligence has become narrower.

OpenAI on Thursday unveiled o1, the first in a new series of AI models “designed to spend more time thinking before they respond,” the company said in a blog post.

The new model can handle complex tasks and solve harder problems in science, coding, and math than previous models. Essentially, it reasons a little more like a human than existing AI chatbots do.

While previous versions of OpenAI models achieved excellent results on standardized tests such as the SAT or the Uniform Bar Examination, o1 goes a step further, according to the company. It performs “similarly to doctoral students on challenging benchmark tasks in physics, chemistry and biology.”

For example, it far outperformed GPT-4o – a multimodal model that OpenAI unveiled in May – on the qualifying exam for the International Mathematical Olympiad. GPT-4o solved only 13% of the exam questions correctly, while o1 achieved 83%, the company said.

The sharp increase in o1’s reasoning ability is due in part to a technique known as “chain of thought,” in which the model works through a problem step by step before answering. OpenAI said o1 “learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach if the current one doesn’t work.”
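To make the idea concrete: in its simplest public form, chain-of-thought prompting just means instructing a model to reason step by step before giving an answer. The sketch below builds such a prompt in the message format common to chat-style model APIs; it is an illustration only, and o1's internal, trained reasoning process is not public.

```python
# Minimal sketch of chain-of-thought-style prompting (illustrative only;
# o1's internal reasoning process is not public and is trained, not prompted).
def build_cot_prompt(question: str) -> list[dict]:
    """Wrap a question in a step-by-step reasoning instruction."""
    return [
        {"role": "system",
         "content": "Think step by step. If a step looks wrong, "
                    "revise it and try a different approach before answering."},
        {"role": "user", "content": question},
    ]

messages = build_cot_prompt("What is 17 * 24?")
print(messages[0]["content"])
```

The key difference with o1, per OpenAI, is that this kind of stepwise self-correction is learned during training rather than supplied in the prompt.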

That’s not to say there aren’t any trade-offs compared to previous models, however. OpenAI found that while human testers preferred o1’s answers in logically demanding categories like data analysis, coding, and math, GPT-4o still came out on top in natural language tasks like writing personal texts.

OpenAI’s primary mission has long been to develop artificial general intelligence (AGI), a still-hypothetical form of AI that matches human capabilities. Over the summer, while o1 was still in development, the company unveiled a five-tier classification system to track its progress toward that goal. Company leadership reportedly told employees that o1 was approaching the second tier, defined as a “reasoner” with human-level problem-solving skills.

Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania who has had access to o1 for over a month, said the model’s benefits are perhaps best illustrated by the way it solves crossword puzzles. Crossword puzzles are typically difficult for large language models to solve because “they require iterative solving: trying and discarding many answers that all influence each other,” Mollick wrote in a post on his Substack. Most large language models “can only add one token/word to their answer at a time.”

But when Mollick asked o1 to solve a crossword puzzle, it thought about it for “a full 108 seconds” before answering. He said its thoughts were both “insightful” and “quite impressive,” even if they weren’t entirely accurate.

Other AI experts, however, are less convinced.

Gary Marcus, a professor of cognitive science at New York University, told Business Insider the model was an “impressive feat of engineering” but not a giant leap. “I’m sure it will be praised to the skies as usual, but it definitely doesn’t come close to AGI,” he said.

Since OpenAI unveiled GPT-4 last year, the company has released a steady stream of iterations in its pursuit of AGI. In April, GPT-4 Turbo was made available to paying subscribers; one of the updates included the ability to generate responses that are more “conversational.”

The company announced in July that it was testing an AI search product called SearchGPT with a limited group of users.