AIs show distinct bias against Black and female résumés in new study
Anyone familiar with HR practices probably knows of the decades of studies showing that resumes with Black- and/or female-presenting names at the top get fewer callbacks and interviews than those with white- and/or male-presenting names, even if the rest of the resume is identical. A new study shows those same kinds of biases also show up when large language models are used to evaluate resumes instead of humans.
In a new paper published during last month's AAAI/ACM Conference on AI, Ethics and Society, two University of Washington researchers ran hundreds of publicly available resumes and job descriptions through three different Massive Text Embedding (MTE) models. These models, all based on the Mistral-7B LLM, had each been fine-tuned with slightly different sets of data to improve on the base LLM's abilities in "representational tasks including document retrieval, classification, and clustering," according to the researchers, and had achieved "state-of-the-art performance" in the MTEB benchmark.
Rather than asking for precise term matches from the job description or evaluating via a prompt (e.g., "does this resume fit the job description?"), the researchers used the MTEs to generate embedded relevance scores for each resume and job description pairing. To measure potential bias, the resumes were first run through the MTEs without any names (to check for reliability) and were then run again with various names that achieved high racial and gender "distinctiveness scores" based on their actual use across groups in the general population. The top 10 percent of resumes that the MTEs judged as most similar for each job description were then analyzed to see if the names for any race or gender groups were chosen at higher or lower rates than expected.
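For readers who want a feel for the mechanics, here is a minimal Python sketch of that kind of top-10-percent analysis. It is not the researchers' code. The two-dimensional vectors are made-up stand-ins for real MTE embeddings, cosine similarity is assumed as the relevance score, and the function names and the group labels "A" and "B" are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed relevance score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def selection_shares(job_vec, resumes, top_fraction=0.10):
    """Compare each group's share of the top-scoring resumes with its share of the pool.

    resumes: list of (group_label, embedding_vector) pairs.
    Returns (selected_share, pool_share), both dicts keyed by group label.
    """
    # Rank every resume by its similarity to the job description.
    ranked = sorted(
        resumes,
        key=lambda r: cosine_similarity(job_vec, r[1]),
        reverse=True,
    )
    # Keep the top fraction (at least one resume).
    k = max(1, round(len(ranked) * top_fraction))
    selected = Counter(group for group, _ in ranked[:k])
    pool = Counter(group for group, _ in resumes)
    selected_share = {g: selected[g] / k for g in pool}
    pool_share = {g: pool[g] / len(resumes) for g in pool}
    return selected_share, pool_share


# Toy data (hypothetical): 10 resumes per group. Group B's vectors are shifted
# away from the job vector to imitate a name nudging the embedding.
job_vec = [1.0, 0.0]
resumes = [("A", [1.0, 0.1 * i]) for i in range(10)]
resumes += [("B", [1.0, 0.1 * i + 1.0]) for i in range(10)]

selected_share, pool_share = selection_shares(job_vec, resumes)
for group in sorted(pool_share):
    print(group, "selected:", selected_share[group], "pool:", pool_share[group])
```

In this toy setup, a group whose share of the top 10 percent falls well below its share of the overall pool is the kind of gap the study was looking for. A real analysis would also need a statistical test to judge whether the gap is larger than chance.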