📋 Description
Disparate outcomes occur when an AI system treats or affects different groups of people unequally, often due to biased data, poor generalization, or overlooked systemic inequities. This can result in reduced access to services, unfair treatment, or unequal risk exposure for certain populations. The risk is especially critical in systems used for hiring, healthcare, finance, law enforcement, or public benefits. Measuring and mitigating these disparities is essential to ensure ethical and equitable AI deployment.
The following key considerations apply to most system types; additional considerations may arise depending on the specific industry or use case.
**Demographic Dimensions**: Fairness can be quantified across several dimensions, including race, sex, disability, and others. Which dimensions to evaluate depends on legal requirements, data availability, and ethical considerations. Granularity (e.g., a broad category such as Asian/Pacific Islander vs. specific origins) and edge cases (e.g., multiracial or non-binary individuals) should be considered carefully with input from technical, legal, and impacted community stakeholders.
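To make the granularity and edge-case point concrete, the sketch below shows one way to roll fine-grained, self-described categories up to broader reporting buckets while keeping explicit buckets for multiracial and undisclosed responses. The specific category labels and bucket names are illustrative assumptions, not a recommended taxonomy.

```python
# Illustrative roll-up from fine-grained self-described categories to broader
# reporting buckets; the labels here are examples, not a recommended taxonomy.
BROAD_CATEGORY = {
    "Vietnamese": "Asian / Pacific Islander",
    "Samoan": "Asian / Pacific Islander",
    "Irish": "White",
}

def to_reporting_bucket(responses):
    """Map self-described categories to one reporting bucket, keeping explicit
    buckets for multiracial and undisclosed responses (the edge cases above)."""
    if not responses:
        return "Not disclosed"
    broad = {BROAD_CATEGORY.get(r, r) for r in responses}
    return broad.pop() if len(broad) == 1 else "Two or more categories"

print(to_reporting_bucket(["Vietnamese"]))       # -> "Asian / Pacific Islander"
print(to_reporting_bucket(["Samoan", "Irish"]))  # -> "Two or more categories"
print(to_reporting_bucket([]))                   # -> "Not disclosed"
```

Whichever granularity is chosen, it should be documented and revisited with impacted stakeholders rather than fixed by engineering convenience alone.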
**Choice of Metrics**: Common group fairness measures include demographic parity (equal approval rates across groups) and equalized odds (equal true and false positive rates across groups). Organizations should prioritize the metrics whose violations carry the highest real-world impact; for example, false positive rates may be most critical in criminal justice applications. Ratio-based measures (e.g., the ratio of selection rates between groups) often give more stable comparisons than raw differences.
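As a rough illustration of how these group metrics can be computed, the sketch below uses plain pandas to derive per-group selection rates, true positive rates, and false positive rates, then compares them with both differences and ratios. The `y_true`, `y_pred`, and `group` column names and the toy data are assumptions for the example, not a standard schema.

```python
import pandas as pd

def group_fairness_metrics(df):
    """Per-group selection rate, TPR, and FPR for a binary classifier.

    Assumes 0/1 columns `y_true` and `y_pred` and a `group` column holding the
    sensitive attribute; these names are illustrative, not a standard schema.
    """
    rows = []
    for g, sub in df.groupby("group"):
        rows.append({
            "group": g,
            "selection_rate": sub["y_pred"].mean(),               # demographic parity
            "tpr": sub.loc[sub["y_true"] == 1, "y_pred"].mean(),  # equalized odds (TPR)
            "fpr": sub.loc[sub["y_true"] == 0, "y_pred"].mean(),  # equalized odds (FPR)
        })
    return pd.DataFrame(rows).set_index("group")

# Toy data for illustration only.
df = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 0, 1, 0],
    "y_pred": [1, 0, 1, 0, 0, 1, 1, 0],
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
})
metrics = group_fairness_metrics(df)

# Difference- and ratio-based comparisons across groups.
dp_diff = metrics["selection_rate"].max() - metrics["selection_rate"].min()
dp_ratio = metrics["selection_rate"].min() / metrics["selection_rate"].max()  # disparate impact ratio
eo_diff = max(metrics["tpr"].max() - metrics["tpr"].min(),
              metrics["fpr"].max() - metrics["fpr"].min())                    # equalized odds gap
print(metrics)
print(f"demographic parity diff={dp_diff:.2f} ratio={dp_ratio:.2f}, equalized odds gap={eo_diff:.2f}")
```

In practice the same computation would run over held-out evaluation data, and the thresholds for acceptable differences or ratios would be set by the legal and policy context of the use case.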
**Data Availability**: Measuring fairness requires demographic data, which can be collected directly from individuals or inferred using proxies such as ZIP code or name. Each approach has tradeoffs: voluntary disclosure may be incomplete due to privacy concerns, while proxies can be less accurate. Reported metrics should disclose which method was used, and privacy-preserving approaches should be prioritized.
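One way to support that disclosure requirement is to record, per record, whether the demographic value came from self-report or from a proxy, so coverage by collection method can be reported alongside the fairness metrics. The sketch below assumes a hypothetical ZIP-code lookup table and field names; a real proxy (for example, BISG-style surname/geography inference) would rely on published reference data rather than this toy mapping.

```python
from collections import Counter

# Hypothetical ZIP-code-to-group lookup used as a proxy; a real system would use
# published reference data (e.g., BISG-style surname/geography inference) instead.
ZIP_PROXY = {"10001": "group_x", "60601": "group_y"}

def resolve_group(record):
    """Return (group, source), recording how the demographic value was obtained
    so that reported fairness metrics can disclose the collection method."""
    if record.get("self_reported_group"):  # voluntary disclosure (may be missing)
        return record["self_reported_group"], "self_reported"
    if record.get("zip") in ZIP_PROXY:     # proxy inference (less accurate)
        return ZIP_PROXY[record["zip"]], "proxy"
    return None, "unknown"

# Toy records: one self-reported, one inferred via proxy, one unresolvable.
records = [
    {"self_reported_group": "group_x", "zip": "10001"},
    {"self_reported_group": None, "zip": "60601"},
    {"self_reported_group": None, "zip": "99999"},
]
coverage = Counter(resolve_group(r)[1] for r in records)
print(coverage)  # e.g., Counter({'self_reported': 1, 'proxy': 1, 'unknown': 1})
```

Keeping the source label with each record also makes it possible to check whether metric values differ materially between self-reported and proxy-inferred populations.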