Anthropic Maps AI Model 'Thought' Processes

msmash

from Slashdot on 2025-03-28 06:14 (#6W7VP)

Anthropic researchers have developed a "cross-layer transcoder" (CLT) that functions like an fMRI for large language models, mapping how they process information internally. Testing it on Claude 3.5 Haiku, the researchers found that the model plans ahead on certain tasks -- such as selecting rhyming words before constructing the lines of a poem -- and processes multilingual concepts in a shared neural space before converting its output to a specific language. The team also confirmed that LLMs can fabricate reasoning chains, either to agree with an incorrect hint supplied by the user or to justify an answer they derived instantly. Rather than examining individual neurons, the CLT identifies sets of interpretable features, allowing researchers to trace an entire reasoning process through the network's layers.
