Ethical Data Design for Good Systems


Data and the systems that manage it are not neutral but are part of the process that affects how AI-based technologies work. Not all data and computer scientists, however, design technology to operate in the best interest of end users, from individuals to institutions, and ethical values may mean different things to data producers, consumers, and organizations. 

Designing and building good systems is a continuous process fundamentally intertwined with ethical data management. However, the ethical frameworks that should guide data gathering and the systems that manage it are spotty: there is little oversight, few guidelines, and uneven monitoring and enforcement. Moreover, the complexities involved in large data aggregations, transformations, distribution, and reuse, together with the limited capacity to validate ethical implications embedded in routine data practices, make it difficult to track and prevent ethical breaches. This Good Systems project investigated how data ethics can be a point of departure in designing and evaluating good systems, examining the contradictions and pressure points among various data practices.


The study involved quantitative analysis of social science and engineering datasets as well as interviews with a spectrum of organizations and agents that produce, manage, analyze, and consume this data. Natural Hazards data was used as a case study. Emergent ethical themes include tensions between best practices and financial constraints, between professional values and academic incentives, and between the protection of privacy and the availability of security solutions, among others. Mapped to academic research lifecycle stages, the themes represent the values, the risks, the rewards, and the collaboration contexts of data practices.


Results suggest that data producers, consumers, and organizations have some differing notions about ethical data management, and that more coordination would benefit the data ecosystem.



Team Members

Soyoung Park
Arthur Peters
Computer Science
John Thywissen
Computer Science


Select Publications

Maria Esteva and Sharon Strover. "Towards ethical data management, distribution, and use for AI applications." IFLA (International Federation of Library Associations and Institutions) IT Section, Big Data SIG Meeting. August 29, 2020.