
Can AI Score As High As A Human On A Test For “General Intelligence”? – IFLScience

Rosie McCall, Freelance Writer
Rosie is a freelance writer living in London. She has covered everything from ancient Egyptian temples to exciting medical breakthroughs, but she particularly enjoys writing about wildlife, anthropology and the wonders of the human mind.

Katy Evans, Managing Editor
Katy is Managing Editor at IFLScience where she oversees editorial content from News articles to Features, and even occasionally writes some.
OpenAI’s latest software scored 82.8 percent on a test for artificial general intelligence. That puts it in league with the average human, say researchers. 
Image credit: Krot_Studio/Shutterstock.com
AI has smashed records on a benchmark designed to test “general intelligence”, achieving a score on a par with that of the average person. 
Historically, researchers have looked to the Turing Test to measure machine intelligence. To pass, a machine must convince a human that it, too, is a person. By some accounts, technology has already accomplished this feat; indeed, ChatGPT may have cracked the test earlier this year. However, scientists question whether passing it really demonstrates intelligence. 
As an alternative, software engineer and AI researcher Francois Chollet created the ARC-AGI benchmark, a test designed to measure “artificial general intelligence” (AGI). According to Chollet, “AGI is a system that can efficiently acquire new skills outside of its training data.” 
On this measure, ChatGPT would fail. The technology relies on probability and vast amounts of data to predict the most likely sequence of words in response to any given prompt. It is extraordinarily talented at creating content. However, Chollet would argue that true general intelligence is not so much about a skill (in this case, generating content) as about the ability to acquire that skill in the first place without a huge amount of input. This is an ability ChatGPT lacks.
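As a rough illustration of what that word-by-word prediction looks like, here is a deliberately simplified sketch in Python. The toy vocabulary and scores are invented for this example and bear no relation to how OpenAI’s models are actually implemented; real systems work over tens of thousands of tokens, with scores produced by a neural network rather than typed in by hand.

```python
# Simplified, made-up sketch of "next-word prediction": the model assigns a
# score to every word in its vocabulary, converts the scores to probabilities,
# and emits the most likely word.
import math

vocabulary = ["the", "cat", "sat", "on", "mat"]   # toy vocabulary
scores = [2.1, 0.3, 1.7, 0.9, 0.2]                # hypothetical model scores (logits)

# Softmax turns raw scores into probabilities that sum to 1
exps = [math.exp(s) for s in scores]
probabilities = [e / sum(exps) for e in exps]

# Greedy decoding: pick the single most probable next word
next_word = vocabulary[probabilities.index(max(probabilities))]
print(next_word)  # -> "the"
```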

To pass the ARC-AGI benchmark, an AI must complete a series of reasoning problems based on colored squares in a grid. Its task is to identify the pattern that turns one grid into another, given just three examples to learn from. The previous record, held by Jeremy Berman, was 58.5 percent. That record was smashed by OpenAI’s new o3 system, which scored an impressive 82.8 percent – a result that, Chollet says, arguably puts it in league with humans.
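For a sense of what these puzzles involve, here is a minimal, made-up sketch of an ARC-style task in Python. The grids, colors (encoded as integers), and rule below are invented and far simpler than real ARC-AGI problems; they are only meant to show the shape of the challenge: infer a transformation from a handful of examples, then apply it to a new grid.

```python
# Hypothetical ARC-style task: three example pairs show how an input grid of
# colored squares (integers) maps to an output grid; the solver must infer the
# rule and apply it to an unseen test grid.
train_pairs = [
    ([[1, 0], [0, 1]], [[2, 0], [0, 2]]),
    ([[1, 1], [0, 0]], [[2, 2], [0, 0]]),
    ([[0, 1], [1, 1]], [[0, 2], [2, 2]]),
]
test_input = [[1, 0], [1, 0]]

def candidate_rule(grid):
    # A rule guessed from the examples: recolor every 1 to 2, leave 0 alone.
    return [[2 if cell == 1 else cell for cell in row] for row in grid]

# The guess only counts if it reproduces all three example outputs...
assert all(candidate_rule(inp) == out for inp, out in train_pairs)

# ...and is then applied to the unseen grid.
print(candidate_rule(test_input))  # -> [[2, 0], [2, 0]]
```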
In a blog post, Chollet described the result as “a significant leap forward”, representing a “genuine breakthrough in adaptability and generalization”. He said, “This is not just incremental progress; it is new territory, and it demands serious scientific attention.”
To put that in context, GPT-3 scored a less-than-impressive 0 percent four years ago, and in 2024 GPT-4o did not do much better, managing just 5 percent. Needless to say, the rate of improvement has been dramatic. Still, there is no need to get ahead of ourselves: as Chollet himself points out, the o3 system still performs badly on some simple tasks.
While there have been some impressive developments in AI, there is little consensus amongst AI researchers on when we should expect to see true AGI. Some believe it is something we could see by the end of the decade. In a recent talk, Ben Goertzel, founder of SingularityNET, argued that individual computers would have processing power equivalent to that of a human brain by 2023. “Then you add another 10/15 years on that, an individual computer would have roughly the compute power of all of human society.”