#llm#research#psychology

How Smart Is AI Compared to Humans? A New Study Puts It to the Test

A recent study compares generative AI models to human cognitive benchmarks, revealing both strengths and significant weaknesses in AI's intellectual abilities.


Oct 15, 2024
By leeron

As artificial intelligence (AI) continues to grow in capability, it raises the question: How close are these models to replicating true human intelligence?

A recent study, conducted by researchers from Google Research and Google DeepMind, benchmarks several leading generative AI models against human cognitive abilities using the Wechsler Adult Intelligence Scale (WAIS-IV).

This test, typically used to measure human intelligence, offers a unique perspective on the cognitive abilities of AI compared to human performance.

The researchers evaluated large language models (LLMs) and vision language models (VLMs) on three WAIS-IV indices: the Verbal Comprehension Index (VCI), the Working Memory Index (WMI), and the Perceptual Reasoning Index (PRI).

Benchmarking Models Against Human Performance on the Wechsler Adult Intelligence Scale

The results showed that most models performed exceptionally well on Verbal Comprehension and Working Memory tasks, with some reaching or even surpassing the 99.5th percentile of human norms. These strengths indicate a strong capacity to store, retrieve, and manipulate linguistic and numerical information.

However, the study also highlighted significant weaknesses. In Perceptual Reasoning, the ability to understand and reason about visual information, multimodal AI models fell far short, with scores generally below the 10th percentile of human performance. This disparity points to a substantial gap in how current models, including advanced ones, interpret and reason about visual patterns.
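To put the two percentile figures on a common footing, here is a minimal sketch that converts between percentiles and WAIS-style index scores. It assumes index scores are normed to a mean of 100 and a standard deviation of 15 and follow a normal distribution; the function names are hypothetical, and this is not the scoring procedure used in the study.

```python
from statistics import NormalDist

# Assumption: index scores follow a normal distribution with
# mean 100 and standard deviation 15 (the usual WAIS-IV index scaling).
INDEX_SCALE = NormalDist(mu=100, sigma=15)


def percentile_to_index_score(percentile: float) -> float:
    """Return the index score at a given percentile (0 < percentile < 100)."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be strictly between 0 and 100")
    return INDEX_SCALE.inv_cdf(percentile / 100)


def index_score_to_percentile(score: float) -> float:
    """Return the percentile rank of a given index score."""
    return INDEX_SCALE.cdf(score) * 100


if __name__ == "__main__":
    for pct in (99.5, 10.0):
        score = percentile_to_index_score(pct)
        print(f"{pct}th percentile is an index score of about {score:.1f}")
```

Under these assumptions, the 99.5th percentile corresponds to an index score of about 139 and the 10th percentile to about 81, which gives a rough sense of how far apart the verbal and visual results sit.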

The study further underscores the importance of understanding the unique strengths and limitations of AI. While AI can excel in verbal and memory-based tasks, it still faces significant challenges in areas requiring abstract visual reasoning.

This knowledge not only helps us track the progress of AI towards artificial general intelligence (AGI) but also points to specific areas for improvement, particularly in how AI models process visual information.

This kind of comparative analysis between AI and human cognition is crucial for assessing AI's progress and identifying the areas that need further development.

Galatzer-Levy, I. R., Munday, D., McGiffin, J., Liu, X., Karmon, D., Labzovsky, I., ... McDuff, D. (2024). The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks. arXiv:2410.07391. Retrieved from https://arxiv.org/abs/2410.07391v1
