Creative Text-to-Audio Generation via Synthesizer Programming

Singh*, N., Cherep*, M., & Shand, J. (2023, December). Creative Text-to-Audio Generation via Synthesizer Programming. In NeurIPS Machine Learning for Audio Workshop.


Sound designers have long harnessed the power of abstraction to distill and highlight the semantic essence of real-world auditory phenomena, akin to how simple sketches can vividly convey visual concepts. However, current neural audio synthesis methods lean heavily toward capturing acoustic realism. We introduce a novel, open-source method centered on meaningful abstraction. Our approach takes a text prompt and iteratively refines the parameters of a virtual modular synthesizer to produce sounds with high semantic alignment, as predicted by a pretrained audio-language model. Our results underscore the distinctiveness of our method compared with both real recordings and state-of-the-art generative models.
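The core loop the abstract describes — iteratively refining synthesizer parameters to raise a text-audio alignment score — can be sketched as a simple hill-climbing search. This is an illustrative toy, not the paper's implementation: `render` and `alignment_score` below are hypothetical stand-ins for a real modular synthesizer and a pretrained audio-language model (e.g., a CLAP-style similarity), and the actual method may use a different optimizer entirely.

```python
import random

def render(params):
    # Toy "synthesizer": stands in for rendering audio from patch
    # parameters; here the "audio" is summarized by the parameter mean.
    return sum(params) / len(params)

def alignment_score(audio, target=0.7):
    # Toy "audio-language model": stands in for text-audio similarity;
    # higher when the rendered audio is closer to a fixed target.
    return -abs(audio - target)

def optimize(n_params=8, iters=200, sigma=0.05, seed=0):
    # Hill-climbing over synthesizer parameters in [0, 1]:
    # perturb, re-render, and keep the candidate if alignment improves.
    rng = random.Random(seed)
    params = [rng.random() for _ in range(n_params)]
    best = alignment_score(render(params))
    for _ in range(iters):
        cand = [min(1.0, max(0.0, p + rng.gauss(0, sigma))) for p in params]
        score = alignment_score(render(cand))
        if score > best:
            params, best = cand, score
    return params, best
```

In the real system, the gradient-free search above would be replaced by whatever optimizer the paper uses, and `alignment_score` would query the pretrained audio-language model with the user's text prompt.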
