MIT Pulls Offline Huge Dataset That Taught AI Systems to Use Racist, Misogynistic Slurs
upstart writes in with an IRC submission:
MIT has taken offline its highly cited dataset, which trained AI systems that may describe people using racist, misogynistic, and other problematic terms.
The database was removed this week after The Register alerted the American super-college. MIT also urged researchers and developers to stop using the training library, and to delete any copies. "We sincerely apologize," a professor told us.
The training set, built by the university, has been used to teach machine-learning models to automatically identify and list the people and objects depicted in still images. For example, if you show one of these systems a photo of a park, it might tell you about the children, adults, pets, picnic spreads, grass, and trees present in the snap. Thanks to MIT's cavalier approach in assembling its training set, though, these systems may also label women as whores or bitches, and describe Black and Asian people with derogatory language. The database also contained close-up pictures of female genitalia labeled with the C-word.