This project is ongoing as part of my current Postdoctoral research work at the Department of Electrical and Electronic Engineering in Trinity College Dublin. The project aims to integrate machine learning techniques into human-computer interaction and music technology contexts.
The initial portions of phase 1 involved building machine learning models to generate musical materials offline. This was carried out in Python with TensorFlow and the Keras library. My approach used LSTM networks to learn features from sets of MIDI data so that I could generate new musical sequences from the trained models. This approach built on previous work carried out by Sigurður Skúli Sigurgeirsson. The process involved acquiring a large number of MIDI files from a range of online sources across multiple musical styles. The MIDI files were cleaned to remove unwanted instrumentation and then prepared by mapping the MIDI data (extracted with MIT's Music21 toolkit) to integers and one-hot encoding the results. Results were generally mixed, and much fine-tuning was required to create interesting musical passages. Overall, I found the offline generation of musical patterns to be of limited interest to me: it removes much of what I find fun about the processes of making and playing music. Instead, I began to explore some of the ML tools created by Google's Magenta Project and became fascinated with the creative potential of ML technologies beyond the generation of simple musical patterns. As a result, I became increasingly focused on the application of ML techniques in online and real-time human-computer interaction (HCI) contexts, an area where I believe ML will have a major impact in the near future.
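The data-preparation step described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the note names stand in for pitches that would be extracted from MIDI files with Music21, and the window length is arbitrary. The resulting arrays are the kind of input/target pairs a Keras LSTM would be trained on.

```python
import numpy as np

# Toy stand-in for pitches extracted from MIDI with Music21
# (names and window size are illustrative only).
notes = ["C4", "E4", "G4", "E4", "C4", "G4", "E4", "C4"]
seq_len = 3  # length of each input window

# Map each distinct pitch to an integer
pitch_to_int = {p: i for i, p in enumerate(sorted(set(notes)))}
n_vocab = len(pitch_to_int)

# Slide a window over the note list to build input/target pairs:
# each window of seq_len notes predicts the note that follows it
inputs, targets = [], []
for i in range(len(notes) - seq_len):
    window = notes[i:i + seq_len]
    inputs.append([pitch_to_int[p] for p in window])
    targets.append(pitch_to_int[notes[i + seq_len]])

# One-hot encode the integer sequences
X = np.eye(n_vocab)[np.array(inputs)]   # shape: (samples, seq_len, n_vocab)
y = np.eye(n_vocab)[np.array(targets)]  # shape: (samples, n_vocab)
```

An LSTM layer in Keras would then consume `X` directly, since its expected input shape is (samples, timesteps, features).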
Phase 3 of this project is still ongoing. It aims to integrate the work carried out in the first two phases. I have adapted the gestural control system I designed during phase 2 to control the parameters of different sound synthesis routines. While phase 1 of the project was exploratory in nature and phase 2 was centred on developing a workable application, in phase 3 I have adopted a standard iterative development style. This return to a structured, HCI-style research and development model is producing a series of refined prototypes for the gestural control of sound synthesis parameters.
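The core of such a gesture-to-synthesis mapping is a scaling stage that translates tracked hand positions into synthesis parameter ranges. The sketch below is a hypothetical illustration of that idea only: the frame size, parameter names, and ranges are assumptions, not details of the actual prototypes.

```python
def scale(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly map value from [in_lo, in_hi] to [out_lo, out_hi], clamped."""
    t = (value - in_lo) / (in_hi - in_lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)

# Hypothetical hand position from a webcam tracker, in pixels,
# assuming a 640x480 frame (illustrative values).
hand_x, hand_y = 480, 120

# Map horizontal position to oscillator frequency (Hz)
# and vertical position to amplitude (top of frame = loudest).
freq = scale(hand_x, 0, 640, 100.0, 1000.0)
amp = scale(hand_y, 0, 480, 1.0, 0.0)
```

In a real-time system these values would be recomputed every video frame and sent on to the synthesis engine; clamping keeps the parameters in range when the hand briefly leaves the tracked area.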
You can experience a stable prototype of these here. To use it, stand in front of your webcam and move your hands to control the sonic and visual parameters:
Human-computer Interaction. Gestural Interfaces. Machine Learning. Embodied Cognition. Stephen Roddy.