AI systems may not only reproduce data bias but even amplify it. Unfortunately, even defining data bias is difficult, let alone detecting and mitigating it. For example, consider bias by omission: transgender people, refugees or stateless people, or formerly incarcerated individuals may simply be overlooked in data. Bias can create harmful systems that commit “data violence,” negatively affecting people’s everyday lives as they interact with institutions, from social service systems and security checkpoints to employment background-check systems. Data and algorithmic bias can hurt people downstream in ways that are difficult to anticipate.

Three key conclusions have emerged from recent research. First, it is very difficult to remove bias by cleaning the data alone, because bias can creep in through many subtle routes. Second, determining the best algorithmic criterion (loss function or evaluation score) is very challenging. Finally, improvement on one criterion may increase bias on another, and neither may properly capture human evaluations of fairness.
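The third conclusion can be made concrete with a toy example. The sketch below is illustrative only and is not the project's method: the labels and predictions are invented, and the two criteria (the demographic parity gap and the equal opportunity gap) are assumed here as common choices from the algorithmic fairness literature. It shows two hypothetical models, each of which looks perfectly fair on one criterion and unfair on the other.

```python
def selection_rate(preds):
    """Fraction of people who receive a positive prediction."""
    return sum(preds) / len(preds)


def true_positive_rate(labels, preds):
    """Fraction of truly positive people who are predicted positive."""
    hits = [p for y, p in zip(labels, preds) if y == 1]
    return sum(hits) / len(hits)


def fairness_gaps(labels_a, preds_a, labels_b, preds_b):
    """Return (demographic parity gap, equal opportunity gap) between two groups."""
    dp_gap = abs(selection_rate(preds_a) - selection_rate(preds_b))
    eo_gap = abs(true_positive_rate(labels_a, preds_a)
                 - true_positive_rate(labels_b, preds_b))
    return dp_gap, eo_gap


# Invented ground-truth labels for two groups with different base rates.
labels_a = [1, 1, 0, 0]
labels_b = [1, 0, 0, 0]

# Hypothetical model 1: selects half of each group, but misses a
# qualified person in group A.
model1_a = [1, 0, 1, 0]
model1_b = [1, 1, 0, 0]

# Hypothetical model 2: predicts every label correctly, so it selects
# the two groups at different rates.
model2_a = [1, 1, 0, 0]
model2_b = [1, 0, 0, 0]

print(fairness_gaps(labels_a, model1_a, labels_b, model1_b))  # (0.0, 0.5)
print(fairness_gaps(labels_a, model2_a, labels_b, model2_b))  # (0.25, 0.0)
```

Moving from model 1 to model 2 closes the equal opportunity gap but opens a demographic parity gap. Neither number says which outcome people would judge to be fairer.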

We will engage people to help identify bias in datasets and to assess the fairness of alternative algorithms and evaluation metrics. Human-centered approaches to assessing bias and fairness can fill a critical gap and inform research on algorithmic fairness.


Project team: Amelia Acker (Information), Anubrata Das (Information), Joydeep Ghosh (Electrical and Computer Engineering), Soumyajit Gupta (Computer Science), Aditya Jain (Computational Engineering), Matthew Lease (Information)