Jonathan Reus

DADAsets is a response to the cultural and economic ecosystem of voice AI and voice data that is rapidly terraforming the meaning and function of voice. The project researches and develops open digital music tools and bespoke voice datasets that challenge the popular narratives and agendas around voice AI, which focus on the spectacle of, and fears around, technological results such as digital clones that perfectly reproduce the voice of a famous narrator or pop singer. Where the dominant narrative of Generative AI dwells on the shock and awe of these visible results, DADAsets aims to create work that playfully and artistically foregrounds the less visible labor and relationships around voice AI: to decenter the spectacle and instead foreground AI’s total dependence on a complex ecosystem of training data, one that is often obscured and built on huge amounts of hidden labor.

The project will involve a series of public workshops to map out communities of voice AI and voice data, and the creation of a proof-of-concept public dataset (a DADAset) as an archetype for future diverse and ethically sourced voice datasets, especially those that fall outside of the mainstream economies of voice data and music AI. These DADAsets will be carefully crafted in collaboration with vocal artists who sit outside of these cultural and economic value systems, such as experimental vocalists who have developed a highly distinctive craft and vocal communities across different traditions and cultures, and will be released under a speculative fair-use license. For the first DADAset I am fortunate to be collaborating with Jaap Blonk, a renowned Dutch sound poet and performer of dadaist sound poetry.

To experience DADAsets, we will create an AI voice synthesis instrument, “Tungnaa” (named after the Icelandic “River of Tongues”), an open software instrument meant to be trained on DADAsets. We will strive to make Tungnaa a hackable, fun and playful tool that allows artists to explore the unique aesthetics of neural-network-generated audio. Tungnaa should be able to run on a modest laptop, without the need for high-end GPU computing resources and without its underlying technology being hidden behind a web-based gatekeeping portal or paid service, as is the case with most voice AI tools today.

Inspired by live coding and the typographical experiments of the postwar Dada art movement, Tungnaa also invites artists to invent their own text-based vocal notation systems, and to explore notations for everything a human voice can do – including those vocalisations that exist beyond the narrow focus of conventional language or singing. An illustrative sketch of what such a notation could look like follows below.
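As a concrete illustration only (not Tungnaa’s actual interface, which is still in development), the following Python sketch shows how an invented text-based notation might be parsed into vocal gesture events before being handed to a synthesis model. All symbols, gesture names and parameters here are hypothetical and made up for the example.

```python
# Hypothetical sketch: a user-defined vocal notation mapped to gesture events.
# None of these symbols or parameters come from Tungnaa itself; they only
# illustrate the idea of an artist-invented, text-based vocal notation.

# An invented notation: each symbol stands for a vocal gesture.
NOTATION = {
    "o": {"gesture": "open_vowel",   "pitch": 220.0},
    "k": {"gesture": "click",        "pitch": None},
    "~": {"gesture": "glissando",    "pitch": None},
    "R": {"gesture": "uvular_trill", "pitch": 110.0},
}

def parse_score(text: str) -> list:
    """Turn a notation string like 'ook~R' into a list of gesture events."""
    events = []
    for symbol in text:
        if symbol in NOTATION:
            events.append({"symbol": symbol, **NOTATION[symbol]})
        # Unknown symbols are skipped here; an artist could instead treat
        # them as silence, noise, or extend the notation table.
    return events

if __name__ == "__main__":
    for event in parse_score("ook~R"):
        print(event)
```

The point of the sketch is only that the notation table itself is the artist’s material: extending or rewriting it is how a new vocal notation system comes into being.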

The project is developed in collaboration with the core AIR hub PINA in Koper, and with composer Mauricio Valdes, who runs PINA’s spatial sound lab HEKA. Once Tungnaa is up and running, our aim is to start experimenting with immersive compositional approaches for the artificial voice. Combining my background in new digital musical instruments with Mauricio’s expertise in immersive audio, we have identified many challenges in making sophisticated spatial audio technology accessible and engaging from the embodied perspective of a musician. We have discussed new gestural approaches to immersive audio composition aimed at making spatial audio more artist-friendly, and together we aim to write a Manifesto for Immersive Sound that sketches out a way to bridge the accessibility gap, so that more musicians of varying technical skill can explore this medium.

Over the course of AIR, DADAsets will involve public presentations and workshops that reflect on the research, fostering a dialogue with diverse communities on the social/digital/economic ecologies around voice data/AI.

Overall, DADAsets is poised to make significant contributions to the field of voice data/AI by challenging popular narratives and promoting a values-first approach to technology creation.

Mauricio Valdes’ point of view

In early February, Jonathan visited PINA to explore the spatial audio studio and discuss the accessibility issues in immersive audio with Mauricio. They identified several challenges, including the absence of intuitive software tools and the need for a distinct compositional vocabulary for spatial sound. This initial brainstorming session has since evolved into ongoing discussions and collaborations.

Subsequently, Jonathan engaged with Uwe and Leyla from HLRS to secure supercomputing data for sonification experiments; this data may later be used in the spatial audio aspect of the final piece, either through sonification or through spatialisation of the resulting sounds. Leyla provided sample datasets from the HAWK supercomputer, focusing on temperature, power consumption, and communication metrics. Uwe proposed an in-person session to experiment with real-time sonification of these datasets. Additionally, I consulted Nico Formanek, an HLRS philosopher, about organizing a discussion on the philosophy of voice and computing.
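To give a rough sense of what such an experiment can look like, here is a minimal sonification sketch in Python, assuming a plain series of numeric readings. The placeholder temperature values below are invented; the real HAWK datasets and their formats are not reproduced here. Each data point is mapped to the pitch of a short sine tone and the result is written to a WAV file.

```python
# Minimal sonification sketch: map a numeric series to sine-tone pitches.
# The data values and file name are placeholders, not HAWK data.
import numpy as np
import wave

SAMPLE_RATE = 44100

def sonify(values, seconds_per_value=0.25, f_low=110.0, f_high=880.0):
    """Render each data point as a short sine tone, pitch scaled to the data range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    chunks = []
    for v in values:
        freq = f_low + (v - lo) / span * (f_high - f_low)
        t = np.linspace(0, seconds_per_value,
                        int(SAMPLE_RATE * seconds_per_value), endpoint=False)
        chunks.append(0.3 * np.sin(2 * np.pi * freq * t))
    return np.concatenate(chunks)

if __name__ == "__main__":
    temperatures = [41.2, 43.5, 47.8, 52.1, 49.0, 44.3]  # placeholder values
    signal = (sonify(temperatures) * 32767).astype(np.int16)
    with wave.open("sonification.wav", "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(signal.tobytes())
```

A real-time version would replace the offline WAV rendering with a streaming audio callback, but the underlying mapping from data range to pitch range stays the same.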

Jonathan’s interactions at the BSC were facilitated by Fernando, who introduced Sergio and Jofre, two researchers with strong technical/programming skills. Together, we conceptualized tools for real-time complex data sonification. Although initial attempts to find data at BSC were unproductive, and Sergio and Jofre eventually left the project, I remained involved in discussions with them about AI voice synthesis, linked to Maria’s project.  

The conversations have revolved around sonifying life science simulations from their research, and acquiring runtime data from the decommissioned MareNostrum4 for sonification purposes, with Thalia contributing significant insights due to her familiarity with the infrastructure.

Parallel to these collaborative efforts, Jonathan has been developing a real-time voice synthesis tool for “extended” vocal capabilities with software developer Victor Shepardson. Additionally, Jonathan made a research trip to IRCAM, a leading institution in music technology, to consult with researchers and better inform the development of a real-time extended voice AI instrument, and to combine that work with our research on the spatial audio aspect of the project.

DADAsets: reflection and thoughts on project by Jonathan Reus

“We have an ongoing collaboration/consultancy with In4Art on creating work contracts and consent forms suitable for the way we want to treat voice data in a value-aligned way within this project. This has taken a bit of a back seat over the last months, but will be wrapped up before the end of October, specifically with regard to contracting work from the Intelligent Instruments Lab and with my vocal artist collaborators Jaap Blonk and Irena Tomazin, so that I may use their voice data for a public release of Tungnaa models.”

“We are working with In4Art to develop a contract with the IIL (also including our payment obligation to Victor Shepardson) – so that the project can continue to be supported and co-maintained by the IIL, and potentially be released to the public via the IIL’s existing presence on Huggingface, while Jonathan retains authorial control over how the software is presented and over how models trained on the artistic voice datasets he has created are distributed, according to the wishes of the artists involved.”

“Dataset making as live art comes with many difficulties and problems to be solved. The first, and probably most obvious, issue is noise isolation. Traditionally you would want an audio dataset used for music tools to be recorded at the highest possible quality, in a controlled studio environment where the sound can be isolated. In a live situation this is not possible, so we had to experiment quite a bit with different microphones and recording setups. The best setup we found was to use a Sontronics Solo dynamic microphone, which, out of the many microphones we tried, had the best rejection of ambient noise (even better than the celebrated Shure SM7B). We ran this microphone directly into my audio interface for dataset recording (bypassing unpredictable and potentially noisy venue-specific audio equipment), then passed the signal directly on to the live system. We also placed Jaap outside of the sound field of the venue’s PA system (behind the speakers) and used in-ear headphones for monitoring. In the future I could imagine designing an intelligent noise rejection system around the microphone, such as is used in noise-cancelling headphones – to my knowledge such a system does not exist yet.”
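The noise problem described above was solved primarily in hardware, through microphone choice and placement. Purely as an illustrative aside, the following Python sketch shows one generic software-side complement: a basic amplitude gate that mutes an already-recorded take wherever its level falls below a threshold. This is not the “intelligent noise rejection system” imagined above, and the file name and threshold are placeholders.

```python
# A minimal amplitude-gate sketch; a generic post-recording technique,
# not the project's actual cleanup workflow. File name and threshold
# are placeholders. Assumes the 'soundfile' package is installed.
import numpy as np
import soundfile as sf

def gate(audio: np.ndarray, sample_rate: int,
         threshold_db: float = -45.0, window_s: float = 0.02) -> np.ndarray:
    """Mute short windows whose RMS level falls below threshold_db."""
    threshold = 10 ** (threshold_db / 20.0)
    window = max(1, int(sample_rate * window_s))
    out = audio.copy()
    for start in range(0, len(audio), window):
        chunk = audio[start:start + window]
        rms = np.sqrt(np.mean(chunk ** 2))
        if rms < threshold:
            out[start:start + window] = 0.0
    return out

if __name__ == "__main__":
    audio, sr = sf.read("take_01.wav")  # placeholder recording
    if audio.ndim > 1:                  # mix down to mono if needed
        audio = audio.mean(axis=1)
    sf.write("take_01_gated.wav", gate(audio, sr), sr)
```

A hard gate like this can clip the tails of quiet vocal gestures, which is exactly why careful microphone rejection at the source remains the preferred approach for dataset recording.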

S+T+ARTS - Funded by the European Union

This project is funded by the European Union from call CNECT/2022/3482066 – Art and the digital: Unleashing creativity for European industry, regions, and society under grant agreement LC-01984767