Preview of model o1
New ChatGPT variant is said to solve tricky problems
While the big hype around AI seems to be dying down, OpenAI is continuing to expand: the company is introducing an improved version of its AI chatbot ChatGPT. It is said to be able to solve difficult math problems and correct its own mistakes. However, one weakness remains.
Think first, then answer: the developers at OpenAI have applied this piece of advice, which parents sometimes give their children, to the latest version of their AI chatbot. The software, called o1, spends more time “thinking” before giving an answer, “just like a person would do it,” according to a statement from the company.
In this way, the new model is also said to be able to solve more complex tasks than previous chatbots. The AI tries out different approaches and recognizes and corrects its own errors, OpenAI explains in a blog post.
This is particularly evident in mathematics and programming. According to OpenAI, the o1 model solved 83 percent of the problems in a qualifying exam for the International Mathematics Olympiad, while GPT-4o managed only 13 percent. However, the new model still lacks many of the functions that ChatGPT already offers: it cannot search for information on the web or process uploaded files and images, and it is slower. From OpenAI's perspective, the new model can help researchers with data analysis or physicists with complex mathematical formulas.
0.38 percent knowingly incorrect
However, data published by OpenAI also shows that the new model knowingly gave incorrect answers in 0.38 percent of 100,000 test requests, that is, in about 380 cases. This mainly happened when o1 was asked to name articles, websites or books, which is not possible without an internet search. The software then invented plausible examples. This happened because it always wanted to fulfill the wishes of the users. Such “hallucinations,” in which AI software invents information, remain an unsolved problem.
ChatGPT, the chatbot that sparked the artificial intelligence hype over a year ago, was trained on huge amounts of data. Such programs can write text at a human level, produce software code and summarize information. They estimate word by word how a sentence should continue.
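The word-by-word principle can be illustrated with a deliberately tiny sketch. It is not how ChatGPT or o1 actually work: real models use neural networks trained on enormous datasets, while this toy example merely counts which word most often follows another in a single made-up sentence. The function names and the sample text are assumptions chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def continue_sentence(following, start, max_words=5):
    """Extend a sentence one word at a time, always picking the most frequent follower."""
    sentence = [start]
    for _ in range(max_words):
        candidates = following.get(sentence[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        sentence.append(candidates.most_common(1)[0][0])
    return " ".join(sentence)


# Hypothetical training text, invented for this illustration only.
corpus = "the model reads the question and the model reads the text"
model = train_bigrams(corpus)
print(continue_sentence(model, "the", max_words=3))  # prints "the model reads the"
```

Even this toy version shows why invented answers are possible: the program continues with whatever is statistically most likely, without checking whether the result is true.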