Building and Testing Machine Learning Methods for Metadata Generation in Audiovisual Collections

Audiovisual (AV) materials play a fundamental role as historical and scientific records. They provide evidence of activity of every kind on earth, from endangered languages to rare bird calls to the sonification of polar ice caps melting underwater. The number of these documentary records is increasing exponentially in every field in the humanities and the sciences, and yet the professionals tasked with preserving these materials and making them useful to scholars and the general public often lack the knowledge and resources to do so.

What is a good system for those responsible for managing and preserving these assets? Generating metadata, which is essential for indexing and searchability, takes too much time when done manually. Using machine learning to generate metadata is promising, but information professionals must still overcome a host of technological and social challenges. This project addresses these challenges for a specific use case by developing a methodology and workflows that allow libraries, archives, and museums (LAMs) to use machine learning and supercomputing resources to generate metadata for AV materials in the humanities.

The project researchers developed and tested this methodology through a pilot project involving UT Austin’s special AV collections, the professionals who process them, and a tool, being built on the Texas Advanced Computing Center’s (TACC) computing resources, that leverages open-source speech-to-text and other machine learning (ML) applications. In the process, the project addressed research questions about how to define and evaluate a “good” system for introducing AI for AV to information professionals.

Select Publications

Maria Esteva, Weijia Xu, Tanya Clement, Aaron Choate, R. Huang, and H. Robbins-Hopkins. “AI 4 AV (Artificial Intelligence for Audiovisual): Design and Evaluation of a Shared System for Libraries, Archives and Museums (LAMs).” Digital Humanities 2020.