Introduction

Neural audio codecs use neural networks to compress audio to very low bitrates while maintaining high perceptual quality. By encoding signals into a compressed latent space, they allow high-quality audio to be stored and transmitted efficiently. Audio codecs have played a crucial role in the music industry, and recent neural codecs push compression further still without compromising sound integrity.

While the primary goal of an audio codec is to preserve audio content in a compressed format, artists and musicians have long used the artifacts of lossy codecs such as MP3 as a medium for artistic expression, through techniques such as MP3 glitching and databending. Building on this spirit, our project investigates how neural audio codecs can be used for artistic applications beyond their original intent.

We have found that manipulating the compressed representation inside a neural audio codec introduces sonic features that are difficult to achieve with traditional signal processing in sample space. In particular, mixing multiple audio tracks in the latent space, rather than in the sample space, yields distinct sonic characteristics, and injecting noise into the latent space produces unique audio textures. Our ongoing research explores additional effects in this latent space, such as delay, saturation, and matrix multiplication.
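The two latent-space operations described above can be sketched in a few lines. This is a minimal illustration rather than the project's implementation: it assumes the codec's encoder has already produced continuous latents of shape (dim, frames), uses random arrays as stand-ins for those latents, and the function names `mix_latents` and `add_latent_noise` are hypothetical. Decoding the result back to audio is left to the codec.

```python
import numpy as np


def mix_latents(z_a, z_b, alpha=0.5):
    """Linearly blend two latent sequences of identical shape (dim, frames)."""
    if z_a.shape != z_b.shape:
        raise ValueError("latents must have the same shape")
    return (1.0 - alpha) * z_a + alpha * z_b


def add_latent_noise(z, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to a latent."""
    return z + sigma * rng.standard_normal(z.shape)


# Stand-in latents (assumption: a real codec encoder would produce these).
rng = np.random.default_rng(0)
z_a = rng.standard_normal((8, 16))
z_b = rng.standard_normal((8, 16))

z_mix = mix_latents(z_a, z_b, alpha=0.5)                # mix in latent space
z_noisy = add_latent_noise(z_mix, sigma=0.1, rng=rng)   # perturb the mix
```

Because a neural decoder is nonlinear, decoding `z_mix` is in general not the same as summing the two decoded waveforms, which is where the difference from sample-space mixing comes from.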

This project explores the creative applications of neural audio codecs, focusing on how manipulating audio in the latent space can lead to distinct auditory effects. By experimenting with various types of operations in the compressed domain, we seek to uncover new possibilities for sound design and music production, offering musicians and sound designers new tools to expand their creative palette.

Overview


Preliminary Results

We used a pretrained DAC (Descript Audio Codec) model as the neural audio codec to perform noise generation and mixing in the latent space.

Latent Noise Generator

Each knob is mapped to the codebook index of a different codebook, so turning a knob selects a different entry from that codebook.
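One way to picture the knob mapping is the sketch below. It is a simplified illustration, not the actual interface code: it assumes knob positions are normalized to [0, 1], treats the codec's quantizer as a plain residual-style stack in which the selected entries from all codebooks are summed, and uses small random codebooks in place of the pretrained ones. The names `knob_to_index` and `knobs_to_latent` are hypothetical.

```python
import numpy as np


def knob_to_index(knob, codebook_size):
    """Map a knob position in [0, 1] to an integer codebook index."""
    knob = min(max(knob, 0.0), 1.0)  # clamp out-of-range positions
    return min(int(knob * codebook_size), codebook_size - 1)


def knobs_to_latent(knobs, codebooks, n_frames):
    """Pick one entry per codebook, sum them, and hold the result for n_frames.

    codebooks has shape (n_codebooks, codebook_size, dim).
    Returns the chosen indices and a latent of shape (dim, n_frames).
    """
    n_codebooks, codebook_size, _ = codebooks.shape
    if len(knobs) != n_codebooks:
        raise ValueError("need exactly one knob per codebook")
    indices = [knob_to_index(k, codebook_size) for k in knobs]
    # Sum the selected entry from every codebook (residual-VQ style).
    frame = codebooks[np.arange(n_codebooks), indices].sum(axis=0)
    return indices, np.tile(frame[:, None], (1, n_frames))


# Toy codebooks (assumption: a real codec would supply pretrained ones).
rng = np.random.default_rng(0)
codebooks = rng.standard_normal((3, 4, 2))
indices, latent = knobs_to_latent([0.0, 0.5, 1.0], codebooks, n_frames=5)
```

Holding one set of indices across all frames gives a static latent; decoding it with the codec would then produce the generated texture, and moving a knob changes only the contribution of its own codebook.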

IMG_3642.MOV

Latent mixing

Original

A

A