Sixth 'Hutter Prize' Awarded for Achieving New Data Compression Milestone
Since 2006, Slashdot has been covering a contest CmdrTaco once summarized as "Compress Wikipedia and Win." It rewards progress on compressing a 1-billion-character excerpt of Wikipedia - approximately the amount that a human can read in a lifetime. And today a new record was announced. The 1 billion characters have now been compressed to just 114,156,155 bytes - about 114 megabytes, or roughly 11.4% of the original size - by Saurabh Kumar, a New York-based quantitative developer for a high-frequency/algorithmic trading and financial services fund.

The amount of each "Hutter Prize for Lossless Compression of Human Knowledge" increases based on how much compression is achieved (so if you compress the file x% better, you receive x% of the prize). Kumar's compressed file was 1.04% smaller than the previous record, so they'll receive 5,187 euros. But "The intention of this prize is to encourage development of intelligent compressors/programs as a path to AGI," said Marcus Hutter (now a senior researcher at Google DeepMind) in a 2020 interview with Lex Fridman.

17 years after their original post announcing the competition, Baldrson (Slashdot reader #78,598) returns to explain the contest's significance to AI research, starting with a quote from mathematician Gregory Chaitin - that "Compression is comprehension." But they emphasize that the contest also has one specific hardware constraint rooted in theories of AI optimization:

The Hutter Prize is geared toward research in that it restricts computation resources to the most general purpose hardware that is widely available. Why? As described by the seminal paper "The Hardware Lottery" by Sara Hooker, AI research is biased toward algorithms optimized for existing hardware infrastructure. While this hardware bias is justified for engineering (applying existing scientific understanding to the "utility function" of making money), to quote Sara Hooker, it "can delay research progress by casting successful ideas as failures."
The complaint that this is "mere" optimization ignores the fact that this was done on general purpose computation hardware, and is therefore in line with the spirit of Sara Hooker's admonition to researchers in "The Hardware Lottery". By showing how to optimize within the constraint of general purpose computation, Saurabh's contribution may help point the way toward future directions in hardware architecture.
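For readers who want to check the numbers in the summary, here is a minimal Python sketch of the arithmetic. It covers the compressed size as a fraction of the original, and the "x% better earns x% of the prize" rule. The function names are made up for illustration, and the size of the prize pool is not stated in the summary, so `ASSUMED_POOL` is an assumption. Because 1.04% is a rounded figure, the result will be close to, but not exactly, the payout quoted above.

```python
def compression_ratio(compressed_bytes: int, original_bytes: int) -> float:
    """Return the compressed size as a fraction of the original size."""
    return compressed_bytes / original_bytes


def prize_share(improvement: float, prize_pool: float) -> float:
    """Apply the 'x% better earns x% of the prize' rule.

    `improvement` is a fraction (0.0104 means 1.04% smaller than the
    previous record); `prize_pool` is the total amount available.
    """
    return improvement * prize_pool


if __name__ == "__main__":
    # Figures taken from the summary: 114,156,155 bytes out of 1 billion.
    ratio = compression_ratio(114_156_155, 1_000_000_000)
    print(f"compressed size: {ratio:.2%} of the original")

    # ASSUMPTION: the pool size is not given in the summary; 500,000 is
    # used here only to illustrate the proportional rule.
    ASSUMED_POOL = 500_000
    payout = prize_share(0.0104, ASSUMED_POOL)
    print(f"payout for a 1.04% improvement: about {payout:,.0f}")
```

The exact payout depends on the unrounded improvement over the previous record, which the summary does not give.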