listening to reinforcement learning

This experimental project investigated how people perceive the behaviours of reinforcement learning agents through sound.

We ran a controlled experiment in which participants used feedback to guide three types of agents (one always exploring, one always exploiting, and one balancing exploration with exploitation) in two sound synthesis environments (one with linear timbral attributes, and one with a timbral obstruction).
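
To make the three agent types concrete, here is a minimal sketch that assumes an epsilon-greedy rule, with the exploration rate set to 1, 0, or a value in between. This description does not say how the agents in the experiment were implemented, so the rule itself, the function name, the exploration rates and the value estimates below are all illustrative assumptions.

```python
import random


def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over a list of estimated action values."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the first action with the highest estimated value.
    return q_values.index(max(q_values))


# Hypothetical exploration rates for the three agent types (assumed values).
AGENTS = {"explore": 1.0, "exploit": 0.0, "balanced": 0.3}

rng = random.Random(0)
q_values = [0.1, 0.7, 0.2]  # made-up value estimates for three actions

# Sample 200 choices per agent to compare their behaviour.
choices = {
    name: [choose_action(q_values, eps, rng) for _ in range(200)]
    for name, eps in AGENTS.items()
}
```

Under these assumptions, the exploiting agent always returns the highest-valued action, the exploring agent ignores the estimates entirely, and the balanced agent mostly follows the estimates while still trying other actions.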

Participants successfully interacted with these agents to reach a sonic goal. Subjective evaluations showed that agents’ exploration behaviour, rather than their learning to reach a sonic goal, was critical to how participants heard them as collaborative.

This project was a preliminary step toward designing an expressive workflow that lets humans teach agents specific behaviours based on subjective preferences. This workflow was eventually implemented and evaluated within the Co-Explorer.


The project was developed with Frédéric Bevilacqua and Baptiste Caramiaux in collaboration with the ISMM group of IRCAM and the ex)situ group of LRI (INRIA), in the context of the Sorbonne Université Doctorate in Computer Science.

Paper at SMC (2018)
