Anthropic’s Claude 3 causes stir by seeming to realize when it was being tested

by Benj Edwards, from Ars Technica

On Monday, Anthropic prompt engineer Alex Albert caused a small stir in the AI community when he tweeted about a scenario related to Claude 3 Opus, the largest version of a new large language model launched that same day. Albert shared a story from internal testing of Opus in which the model seemingly demonstrated a type of "metacognition" or self-awareness during a "needle-in-the-haystack" evaluation, leading to both curiosity and skepticism online.

Metacognition in AI refers to the ability of an AI model to monitor or regulate its own internal processes. It's similar to a form of self-awareness, but calling it that is usually seen as too anthropomorphic, since there is no "self" in this case. Machine-learning experts do not think that current AI models possess a form of self-awareness like that of humans. Instead, the models produce humanlike output, and that sometimes triggers a perception of self-awareness that seems to imply a deeper form of intelligence behind the curtain.

In the now-viral tweet, Albert described a test to measure Claude's recall ability. It's a relatively standard test in large language model (LLM) testing that involves inserting a target sentence (the "needle") into a large block of text or documents (the "haystack") and asking if the AI model can find the needle. Researchers do this test to see if the large language model can accurately pull information from a very large processing memory (called a context window), which in this case is about 200,000 tokens (fragments of words).
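To make the mechanics of such a test concrete, here is a minimal sketch in Python. It is not Anthropic's evaluation harness: the function names, the prompt wording, and the `ask_model` callable (a stand-in for whatever API would query a real model) are all assumptions made for illustration, and the haystack is measured in sentences rather than tokens to keep the example short.

```python
from typing import Callable, List


def build_haystack(filler: List[str], needle: str, depth: float) -> str:
    """Insert the needle among filler sentences at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    position = int(depth * len(filler))  # 0.0 = start of text, 1.0 = end
    sentences = filler[:position] + [needle] + filler[position:]
    return " ".join(sentences)


def build_prompt(haystack: str, question: str) -> str:
    """Place the long text first, then the retrieval question (hypothetical wording)."""
    return f"{haystack}\n\nQuestion: {question}\nAnswer using only the text above."


def found_needle(response: str, expected: str) -> bool:
    """Score a response with a simple case-insensitive substring check."""
    return expected.lower() in response.lower()


def run_trial(
    ask_model: Callable[[str], str],  # hypothetical: takes a prompt, returns the reply
    filler: List[str],
    needle: str,
    question: str,
    expected: str,
    depth: float,
) -> bool:
    """Run one needle-in-the-haystack trial and report whether recall succeeded."""
    prompt = build_prompt(build_haystack(filler, needle, depth), question)
    return found_needle(ask_model(prompt), expected)


if __name__ == "__main__":
    # Made-up filler and needle, purely for demonstration.
    filler = [f"Filler sentence number {i}." for i in range(1000)]
    needle = "The secret code is 7421."

    # Stand-in "model" that simply searches the prompt; a real test would call an LLM.
    def fake_model(prompt: str) -> str:
        return "The code is 7421." if "7421" in prompt else "I could not find it."

    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
        ok = run_trial(fake_model, filler, needle,
                       "What is the secret code?", "7421", depth)
        print(f"depth={depth:.2f} recalled={ok}")
```

In practice, testers typically repeat such trials across many insertion depths and haystack lengths, since a model's recall can vary with where in the context window the needle sits.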
