FAccT Finding: AI Takeaways from ACM FAccT 2025

Anastassia Kornilova is the Director of Machine Learning at Trustible. Anastassia translates research into actionable insights and uses AI to accelerate compliance with regulations. Her notable projects include creating the Trustible Model Ratings and AI Policy Analyzer. Previously, she worked at Snorkel AI, developing large-scale machine learning systems, and at FiscalNote, developing NLP models for legislation and regulation.

This June, I attended the FAccT (Fairness, Accountability and Transparency) Conference, hosted by the Association for Computing Machinery in Athens, Greece, to learn about new advances in Responsible AI from an interdisciplinary community of researchers spanning computer science, law, the social sciences, and the humanities. The conference featured research on both traditional fairness in machine learning and generative AI safety, alongside legal discussions of AI law, meta-analyses of AI research, and new frameworks for building socio-technical systems. While much of the conference focused on critically analyzing AI systems, researchers also presented novel solutions for building better AI systems overall. These were my top takeaways from the event:

  1. Affected populations need to be involved in system development and testing.
  2. Measurement and mitigation of fairness concerns remain a major challenge.
  3. Bias in AI systems can emerge from ALL components involved in training and inference.
  4. LLMs may be general-purpose – but sector-specific research is still necessary.

When considering “fairness” and AI, we analyze whether AI systems affect different populations differently. The general approaches split between traditional, predictive machine learning and generative AI. For the former, practitioners often focus on statistical analysis (supported by a handful of broadly available packages); for the latter, several public benchmarks test how system outputs differ based on inputs – but these are not consistently used by model developers. FAccT shed light on the limitations of both approaches.
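For readers less familiar with the statistical side, here is a minimal sketch of the kind of group-level check those packages automate: computing a demographic parity gap, i.e. the difference in positive-prediction rates across groups. The predictions, group labels, and choice of metric below are illustrative assumptions only; libraries such as Fairlearn and AIF360 provide much more complete implementations.

```python
import numpy as np

# Hypothetical binary predictions and group membership, for illustration only.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

# Selection rate per group: the fraction of positive predictions each group receives.
rates = {g: y_pred[group == g].mean() for g in np.unique(group)}

# Demographic parity gap: difference between the highest and lowest selection rates.
dp_gap = max(rates.values()) - min(rates.values())
print(rates, dp_gap)  # {'A': 0.6, 'B': 0.4} 0.2
```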

Here are my thoughts on the key takeaways:

Takeaway 1

Affected populations need to be involved in system development and testing.

While general-purpose AI can be used for a variety of purposes, its effectiveness will vary depending on the audience. Unless the target population is involved in development and testing, poor performance may go unnoticed. This process is nuanced, but multiple papers provide case studies across applications and populations.

For example, while LLMs can be used in content moderation and can accurately classify many types of toxic language, they have gaps that systematically affect certain populations. In the paper Understanding Gen Alpha’s Digital Language: Evaluation of LLM Safety Systems for Content Moderation, Manisha Mehta (a middle schooler and member of Gen Alpha herself) and her coauthor show how several leading AI systems fail to understand subtle harassment in online comments made by Gen Alpha. The AI systems perform comparably to parents and moderators (both of these groups and the LLMs identified harmful comments with 30-40% accuracy, compared to 92% for members of Gen Alpha), showing that input from this generation is needed to create better data. Similarly, in “Cold, Calculated, and Condescending”: How AI Identifies and Explains Ableism Compared to Disabled People, the authors show that top LLMs both overrate (i.e. rate harmless comments as toxic) and underrate ableist comments. In addition, they show that People with Disabilities (PwDs) and non-PwDs rate the toxicity of comments differently, pointing to a need to involve disabled individuals in the development of these systems. Beyond content moderation, other papers provide insights from collaboratively designing Automatic Speech Recognition technology for Aboriginal Australians and LLMs for journalists.

Participatory methods – frameworks for including affected stakeholders in AI development – are not new; extensive work (including work focused on foundation models) has been published. However, adoption in commercial settings remains limited. A new study surveyed practitioners and found that RAI-related stakeholder involvement (SHI) was perceived as being at odds with commercial priorities and that confusion persisted about what “involvement” should entail. Creating use case-specific SHI guidance could close the gap, but this highlights a broader top-of-mind question at FAccT: how can this research influence real policy?

Takeaway 2

Measuring fairness in predictive systems remains a major challenge.

A diverse set of fairness measurements has been developed over years of research, but each approach has limitations. Several papers at FAccT highlight these limitations and suggest additional considerations:

Individual Fairness (IF) is the principle that similar individuals should be treated similarly by algorithms. It has received little attention compared to group fairness, which asserts that groups (typically defined by protected attributes like race or gender) should be treated equally. In Beyond Consistency: Nuanced Metrics for Individual Fairness, Waller et al. highlight problems with the consistency score (the main IF metric) and create a toolkit with four new measures. The new measures better capture when specific individuals are treated differently from similar counterparts and consider multiple notions of similarity.
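To make the starting point concrete, the sketch below computes a common formulation of the consistency score that Waller et al. critique: one minus the average gap between each individual's prediction and the predictions of its k nearest neighbors in feature space. The toy data, choice of k, and distance metric are assumptions for illustration; the paper's new metrics go beyond this single number.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def consistency_score(X, y_pred, k=5):
    """One common individual-fairness consistency score: 1 minus the mean absolute
    gap between each individual's prediction and those of its k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1 because each point matches itself
    _, idx = nn.kneighbors(X)
    neighbor_preds = y_pred[idx[:, 1:]]  # drop the self-match column
    return 1.0 - np.mean(np.abs(y_pred[:, None] - neighbor_preds))

# Toy example with random features and binary predictions, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y_pred = rng.integers(0, 2, size=200).astype(float)
print(consistency_score(X, y_pred))
```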

Algorithmic Recourse (AR) aims to identify counterfactual explanations that users can follow to overturn unfavorable machine decisions – for example, telling a user that increasing their income by X dollars will result in their loan being approved. In Time Can Invalidate Algorithmic Recourse, Toni et al. highlight that AR advice does not take into account the real-world time needed to implement these measures (e.g. the realistic time to increase income) and that this gap may render the advice useless. They propose an algorithm that takes the time factor into account. A different study by Perello et al. shows that algorithmic recourse can induce discrimination independently of the bias in the original model. These works highlight how algorithmic recourse has generally been treated as a solution and has not been scrutinized through a fairness lens.
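As a toy illustration of the time problem (not the algorithm from the paper), the sketch below scores two hypothetical recourse actions by how long they would realistically take to implement and penalizes longer timelines. The actions, monthly rates of change, and discount factor are all invented for this example.

```python
from dataclasses import dataclass

@dataclass
class RecourseAction:
    description: str
    required_change: float   # e.g. dollars of additional annual income needed
    change_per_month: float  # realistic pace at which the user can change this feature

    def months_needed(self) -> float:
        return self.required_change / self.change_per_month

def time_discounted_cost(action: RecourseAction, monthly_discount: float = 0.05) -> float:
    # Toy cost: the longer the action takes, the less useful the advice, since the
    # model or the user's circumstances may have changed by the time it is completed.
    t = action.months_needed()
    return t * (1 + monthly_discount) ** t

# Two hypothetical suggestions for overturning a loan denial.
actions = [
    RecourseAction("Increase annual income by $6,000", 6000, 500),
    RecourseAction("Reduce outstanding debt by $2,000", 2000, 400),
]
for a in actions:
    print(a.description, round(time_discounted_cost(a), 1))
```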

These studies highlight the interdisciplinary nature of fairness evaluation: quantifying individual similarity and the “time/effort” involved in implementing AR measures requires a diverse set of stakeholders and cannot be addressed by a machine learning engineer alone.

Takeaway 3

Bias in AI systems can emerge from all components involved in training and inference.

When evaluating LLMs (and other generative AI) for bias, analysis has generally focused on reviewing the training data and the final model. Several papers at FAccT highlight how additional sources of bias need to be considered. The post-training process uses ‘reward models’ to rate the LLM’s responses to prompts and push it towards desired (i.e. helpful and harmless) behavior. While reward models play a crucial role in the development of LLMs, their own behavior has been understudied.

Two new papers show that reward models encode significant biases against identity groups and subtly influence the LLMs that they shape (see Kumar et al. and Christian et al.). The first study tests how “demographic prefixes” in outputs (e.g. “I am a woman”) alter model behavior and finds significant racial and gender biases linked to the datasets used to train the reward models. The latter study takes a different approach and compares the reward model’s score for every possible one-word response to “value-laden” prompts; it finds similar biases against demographic groups. In addition, Buyl et al. show that reward models exercise “discretion” in borderline cases (e.g. when deciding whether a model output is harmful) and present systematic ways to analyze this behavior.
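As a rough sketch of the “demographic prefixes” idea (not the exact protocol from either paper), the code below scores the same response under different identity prefixes using a sequence-classification reward model loaded via Hugging Face Transformers. The checkpoint name is a placeholder, and the exact input formatting a real reward model expects may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint: substitute any reward model that scores (prompt, response) pairs.
MODEL_NAME = "your-org/your-reward-model"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

PROMPT = "Describe your ideal weekend."
RESPONSE = "I like to hike and then cook dinner with friends."
PREFIXES = ["", "I am a woman. ", "I am a man. ", "I am a Black person. "]

scores = {}
for prefix in PREFIXES:
    # Score the identical response content with different demographic prefixes prepended.
    inputs = tokenizer(PROMPT, prefix + RESPONSE, return_tensors="pt", truncation=True)
    with torch.no_grad():
        scores[prefix or "<none>"] = model(**inputs).logits[0, 0].item()

# Large score gaps across prefixes suggest the reward model treats the same content
# differently depending on the stated identity.
for prefix, score in scores.items():
    print(f"{prefix!r}: {score:.3f}")
```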

If a reward model’s preferences are biased, they poison the downstream model and lead to undesirable learned behavior, so it is important to examine reward models more closely – and the methodologies in these papers offer practical ways to measure that bias. For example, a new paper by Neumann et al. used an approach similar to “demographic prefixes” to show how demographic information about the user in the system prompt can significantly alter outputs in neutral situations (e.g. where gender should not affect the response). While not featured at FAccT this year, conference conversations also highlighted biases associated with using LLM-as-a-Judge, which has become a standard tool for large-scale evaluations.

Takeaway 4

LLMs may be general-purpose – but sector-specific risk research is still necessary.

While most FAccT scholarship focuses on general properties of LLMs and universal notions of fairness, several studies highlight sector-specific challenges. These works provide grounded insights into how AI systems are used and offer frameworks for evaluating the risks of specific applications.

  • Financial Services: Gehrmann et al. create a custom AI risk taxonomy for financial services organizations, including entries like ‘Financial Services Impartiality’ (i.e. giving financial advice) that are not covered in many standard taxonomies. These risks are more nuanced and may manifest differently depending on organization type (e.g. buy-side and sell-side firms face different limitations on the types of advice they can give).
  • Medical: Gourabathina et al. show how non-clinical aspects of patient input texts can affect clinical decision-making by LLMs. Their findings reveal “notable inconsistencies in LLM treatment recommendations and significant degradation of clinical accuracy in ways that reduce care allocation to patients.” Notably, they provide a reusable framework and code for modifying prompts and measuring the effect; a minimal sketch of this style of perturbation test appears after this list.
  • E-Commerce: Kelly et al. create a taxonomy of specific gender-related biases in AI-generated product descriptions. These categories allow for more fine-grained analysis than traditional bias-related risk labels.
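Below is a minimal sketch of the style of prompt-perturbation test described in the medical example above. The perturbation types, the `llm_call` interface, and the flip-rate metric are simplifications I am assuming for illustration; they are not the authors’ released framework.

```python
import random

PERTURBATIONS = {
    # Non-clinical rewrites of a patient message; the clinical content stays the same.
    "hedging":  lambda m: "I'm not sure if this matters, but " + m,
    "informal": lambda m: m.rstrip(".") + ", if that makes sense :(",
    "noise":    lambda m: "".join(c.swapcase() if random.random() < 0.03 else c for c in m),
}

def flip_rate(llm_call, messages):
    """Fraction of (message, perturbation) pairs where the model's recommendation
    changes relative to the unperturbed message. `llm_call` is any function that
    maps a patient message to a recommendation string."""
    flips, total = 0, 0
    for msg in messages:
        baseline = llm_call(msg)
        for perturb in PERTURBATIONS.values():
            total += 1
            if llm_call(perturb(msg)) != baseline:
                flips += 1
    return flips / total if total else 0.0

# Example with a stand-in "model" that only looks for a keyword.
toy_model = lambda m: "refer to specialist" if "chest pain" in m.lower() else "self-care"
print(flip_rate(toy_model, ["I have been having chest pain for two days."]))
```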

Conclusion

The need for better model evaluation has been in focus over the last few months, but FAccT emphasized that we can’t think about evaluations in isolation – the people affected, the application, and the whole system need to be considered. This is not a revelation, but FAccT shines a light on creative case studies and on elements of these systems that don’t usually get emphasized.

A prevailing question for researchers throughout the conference was how to make their work influence policy and real-world systems.

One challenge with creating up-to-date evaluations, given the slow nature of the academic review cycle, is that none of the papers featured analysis of the state-of-the-art models released in spring 2025. There are currently insufficient incentives to create public, reusable dashboards. However, multiple works do provide code and hands-on guidance; notably, A Framework for Auditing Chatbots for Dialect-Based Quality-of-Service Harms provides detailed instructions for conducting risk evaluations that can be used by system developers or auditors. Lastly, and perhaps surprisingly, AI agents received very little attention, despite being a major focus for industry developers.

My experience and the insights I gained at FAccT, together with the ongoing industry participation and collaboration all of us do here at Trustible, are helping to bridge the gap between research and practice.
