Open-Source AI Definition Finally Gets Its First Release Candidate
An anonymous reader quotes a report from ZDNet: Getting open-source and artificial intelligence (AI) on the same page isn't easy. Just ask the Open Source Initiative (OSI). The OSI, the open-source definition steward organization, has been working on creating an open-source artificial intelligence definition for two years now. The group has been making progress, though. The OSI has now published the first release candidate, RC1, of its Open Source AI Definition. The latest definition aims to clarify the often contentious discussions surrounding open-source AI. It specifies four fundamental freedoms that an AI system must grant to be considered open source: the ability to use the system for any purpose without permission, to study how it works, to modify it for any purpose, and to share it with or without modifications. So far, so good. However, the OSI has opted for a compromise regarding training data. Recognizing that it's not easy to share full datasets, the current definition requires "sufficiently detailed information about the data used to train the system" rather than the full dataset itself. This approach aims to balance transparency with practical and legal considerations. That last phrase is proving difficult for some people to swallow. From their perspective, if all the data isn't open, then AI large language models (LLMs) based on such data can't be open-source. The OSI summarized these arguments as follows: "Some people believe that full, unfettered access to all training data (with no distinction of its kind) is paramount, arguing that anything less would compromise full reproducibility of AI systems, transparency, and security. This approach would relegate Open-Source AI to a niche of AI trainable only on open data." The OSI acknowledges that the definition of open-source AI isn't final and may need significant rewrites, but the focus is now on fixing bugs and improving documentation.
The final version of the Open Source AI Definition is scheduled for release at the All Things Open conference on October 28, 2024.