Interactive Musical Improvisation with Magenta


We combine LSTM-based recurrent neural networks with Deep Q-learning to generate musical sequences in real time. The LSTM learns the general structure of musical scores (encoded as MIDI, not audio). Deep Q-learning then improves and focuses the generated sequences using rewards such as desired genre, compositional correctness, and the ability to predict aspects of what the human collaborator is playing. This combination of RNN model-based generation with reinforcement learning is, to our knowledge, novel in the domain of music generation, and it yields more stable, musically relevant sequences than an LSTM alone. The networks are trained for two tasks: generating responses to short melodic inputs, and generating an accompaniment to melodic input in real time, which requires continuous prediction of future output. A novel MIDI interface built on top of TensorFlow enables improvisational experiences by letting users interact with the neural networks in real time. Our main goal is to let attendees experience what it is like to collaborate creatively with a machine learning model. We will have professional music equipment configured so that multiple attendees can play with Magenta on MIDI keyboards through a fast, responsive interface. The interface supports call-and-response interaction, automatic generation of an accompaniment to the user's melody, and melody morphing, in which the model responds with both variations on the user's melody and a bass accompaniment.
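As a rough illustration of how a reward signal can steer a pretrained sequence model, the sketch below computes a one-step Q-learning target for a single note. It is not the demo's implementation: the function names, the weighting `rule_weight`, the discount `gamma`, the toy "in key" rule, and the choice to add the pretrained model's log-probability to the reward are all assumptions made for illustration.

```python
import math

# Pitch classes of C major; a stand-in for a "compositional correctness" rule (assumed).
C_MAJOR = frozenset({0, 2, 4, 5, 7, 9, 11})


def in_key_reward(pitch, key_pitch_classes=C_MAJOR):
    """Toy rule-based reward: +1 if the MIDI pitch is in the key, otherwise -1."""
    return 1.0 if pitch % 12 in key_pitch_classes else -1.0


def combined_reward(note_log_prob, rule_reward, rule_weight=0.5):
    """Blend the pretrained model's log-probability of a note with a rule reward.

    Adding the log-probability is an assumed way to keep the tuned model
    close to what the LSTM learned from data.
    """
    return note_log_prob + rule_weight * rule_reward


def q_target(reward, next_q_values, gamma=0.9):
    """Standard one-step Q-learning target: r + gamma * max over next actions."""
    return reward + gamma * max(next_q_values)


# Example: the model assigns probability 0.25 to middle C (MIDI pitch 60).
reward = combined_reward(math.log(0.25), in_key_reward(60))
target = q_target(reward, next_q_values=[0.1, 0.4])
```

In a full system the Q-values would come from a neural network and the target would drive a gradient update; here they are plain numbers so that the arithmetic is easy to follow.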

In Neural Information Processing Systems (NeurIPS) Best Demo
Natasha Jaques

My research focuses on Social Reinforcement Learning: developing algorithms that use insights from social learning to improve AI agents' learning, generalization, coordination, and human-AI interaction.