Participation-washing could be the next dangerous fad in machine learning
The AI community is finally waking up to the fact that machine learning can cause disproportionate harm to already oppressed and disadvantaged groups. We have activists and organizers to thank for that. Now, machine-learning researchers and scholars are looking for ways to make AI more fair, accountable, and transparent, but also, more recently, more participatory.
One of the most exciting and well-attended events at the International Conference on Machine Learning in July was called "Participatory Approaches to Machine Learning." This workshop tapped into the community's aspiration to build more democratic, cooperative, and equitable algorithmic systems by incorporating participatory methods into their design. Such methods bring those who interact with and are affected by an algorithmic system into the design process, for example by asking nurses and doctors to help develop a sepsis detection tool.
This is a much-needed intervention in the field of machine learning, which can be excessively hierarchical and homogeneous. But it is no silver bullet: in fact, "participation-washing" could become the field's next dangerous fad. That's what I, along with my coauthors Emanuel Moss, Olaitan Awomolo, and Laura Forlano, argue in our recent paper "Participation is not a design fix for machine learning."
Ignoring patterns of systemic oppression and privilege leads to unaccountable machine-learning systems that are deeply opaque and unfair. These patterns have permeated the field for the last 30 years. Meanwhile, the world has watched the exponential growth of wealth inequality and fossil-fuel-driven climate change. These problems are rooted in a key dynamic of capitalism: extraction. Participation, too, is often based on the same extractive logic, especially when it comes to machine learning.
Participation isn't free
Let's start with this observation: participation is already a big part of machine learning, but in problematic ways. One way is participation as work.
Whether or not their work is acknowledged, many participants play an important role in producing data that's used to train and evaluate machine-learning models. Photos that someone took and posted are scraped from the web, and low-wage workers on platforms such as Amazon Mechanical Turk annotate those photos to make them into training data. Ordinary website users do this annotation too, when they complete a reCAPTCHA. And there are many examples of what's known as ghost work, anthropologist Mary Gray's term for all the behind-the-scenes labor that goes into making seemingly automated systems function. Much of this participation isn't properly compensated, and in many cases it's hardly even recognized.
Participation as consultation, meanwhile, is a trend seen in fields like urban design, and increasingly in machine learning too. But the effectiveness of this approach is limited. It's generally short-lived, with no plan to establish meaningful long-term partnerships. Intellectual-property concerns make it hard to truly examine these tools. As a result, this form of participation is too often merely performative.
More promising is the idea of participation as justice. Here, all members of the design process work together in tightly coupled relationships with frequent communication. Participation as justice is a long-term commitment that focuses on designing products guided by people from diverse backgrounds and communities, including the disability community, which has long played a leading role here. This concept has social and political importance, but capitalist market structures make it almost impossible to implement well.
Machine learning extends the tech industry's broader priorities, which center on scale and extraction. That means participatory machine learning is, for now, an oxymoron. By default, most machine-learning systems have the ability to surveil, oppress, and coerce (including in the workplace). These systems also have ways to manufacture consent, for example by requiring users to opt in to surveillance systems in order to use certain technologies, or by implementing default settings that discourage them from exercising their right to privacy.
Given that, it's no surprise that machine learning fails to account for existing power dynamics and takes an extractive approach to collaboration. If we're not careful, participatory machine learning could follow the path of AI ethics and become just another fad that's used to legitimize injustice.
A better way
How can we avoid these dangers? There is no simple answer. But here are four suggestions:
Recognize participation as work. Many people already use machine-learning systems as they go about their day. Much of this labor maintains and improves these systems and is therefore valuable to the systems' owners. To acknowledge that, all users should be asked for consent and provided with ways to opt out of any system. If they choose to participate, they should be offered compensation. Doing this could mean clarifying when and how data generated by a user's behavior will be used for training purposes (for example, via a banner in Google Maps or an opt-in notification). It would also mean providing appropriate support for content moderators, fairly compensating ghost workers, and developing monetary or nonmonetary reward systems to compensate users for their data and labor.
Make participation context specific. Rather than trying to use a one-size-fits-all approach, technologists must be aware of the specific contexts in which they operate. For example, when designing a system to predict youth and gang violence, technologists should continuously reevaluate the ways in which they build on lived experience and domain expertise, and collaborate with the people they design for. This is particularly important as the context of a project changes over time. Documenting even small shifts in process and context can form a knowledge base for long-term, effective participation. For example, should only doctors be consulted in the design of a machine-learning system for clinical care, or should nurses and patients be included too? Making it clear why and how certain communities were involved makes such decisions and relationships transparent, accountable, and actionable.
Plan for long-term participation from the start. People are more likely to stay engaged in processes over time if they're able to share and gain knowledge, as opposed to having it extracted from them. This can be difficult to achieve in machine learning, particularly for proprietary design cases. Here, it's worth acknowledging the tensions that complicate long-term participation in machine learning, and recognizing that cooperation and justice do not scale in frictionless ways. These values require constant maintenance and must be articulated over and over again in new contexts.
Learn from past mistakes. More harm can be done by replicating the ways of thinking that originally produced harmful technology. We as researchers need to enhance our capacity for lateral thinking across applications and professions. To facilitate that, the machine-learning and design community could develop a searchable database to highlight failures of design participation (such as Sidewalk Labs' waterfront project in Toronto). These failures could be cross-referenced with socio-structural concepts (such as issues pertaining to racial inequality). This database should cover design projects in all sectors and domains, not just those in machine learning, and explicitly acknowledge absences and outliers. These edge cases are often the ones we can learn the most from.
It's exciting to see the machine-learning community embrace questions of justice and equity. But the answers shouldn't bank on participation alone. The desire for a silver bullet has plagued the tech community for too long. It's time to embrace the complexity that comes with challenging the extractive capitalist logic of machine learning.
Mona Sloane is a sociologist based at New York University. She works on design inequality in the context of AI design and policy.