Stephen Roddy PhD

Portfolio documenting and showcasing some recent projects.

Machine Learning Applications for Human-computer Interaction

Overview

This project is ongoing as part of my current postdoctoral research at the Department of Electrical and Electronic Engineering at Trinity College Dublin. It aims to integrate machine learning techniques into human-computer interaction and music technology contexts.

Phase 1: Machine Learning for Music Generation

The initial portion of phase 1 involved building machine learning models to generate musical material offline, using TensorFlow and the Keras library in Python. My approach used LSTM networks to learn features from sets of MIDI data so that new musical sequences could be generated from the trained models, building on previous work by Sigurður Skúli Sigurgeirsson. The process involved acquiring a large number of MIDI files from a range of online sources across multiple musical styles. The files were cleaned to remove unwanted instrumentation and prepared by mapping the MIDI data (extracted with MIT's Music21 toolkit) to integers and one-hot encoding the results, as sketched below.

Results were mixed, and much fine-tuning was required to create interesting musical passages. Overall, I found the generation of musical patterns to be of limited interest to me: it removes much of what I find fun about making and playing music. Instead, I began to explore some of the ML tools created by Google's Magenta Project and became fascinated with the creative potential of ML technologies beyond the generation of simple musical patterns. As a result, I became increasingly focused on the application of ML techniques in online and real-time human-computer interaction (HCI) contexts, an area where I believe they will have a major impact in the near future.
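
For illustration, here is a minimal sketch of the data preparation and model described above, assuming a toy list of note names in place of a real Music21-extracted corpus. The layer sizes and training settings are placeholders, not the project's actual configuration:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.utils import to_categorical

# Toy stand-in for the note/chord names extracted from MIDI with Music21
notes = ['C4', 'E4', 'G4', 'C5', 'A4', 'F4', 'E4', 'D4'] * 16

# Map each distinct pitch name to an integer
pitch_names = sorted(set(notes))
note_to_int = {p: i for i, p in enumerate(pitch_names)}
n_vocab = len(pitch_names)

# Slice the note stream into fixed-length input windows,
# each paired with the single note that follows it
seq_len = 8
X, y = [], []
for i in range(len(notes) - seq_len):
    X.append([note_to_int[n] for n in notes[i:i + seq_len]])
    y.append(note_to_int[notes[i + seq_len]])

X = np.reshape(X, (len(X), seq_len, 1)) / float(n_vocab)  # normalised integer input
y = to_categorical(y, num_classes=n_vocab)                # one-hot encoded targets

# Single-layer LSTM with a softmax over the note vocabulary
model = Sequential([
    LSTM(256, input_shape=(seq_len, 1)),
    Dense(n_vocab, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```

Generation then proceeds by seeding the trained model with a window of notes and repeatedly sampling from the softmax output to extend the sequence.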


Phase 2: Computer Vision and Machine Learning for Gestural Control of Unmanned Aerial Vehicles

The second phase of the project explored the application of machine learning techniques in real-time human-computer interaction contexts. My colleagues and I at the Department of Electrical and Electronic Engineering, TCD, designed and built a gestural interface for controlling the flight of an unmanned aerial vehicle (drone). The hardware required to communicate with the drone was designed and built by a colleague. I built the gestural interface system with HTML, JavaScript, and Node.js, using the p5.js and ml5.js libraries, and worked with another colleague to integrate the interface with the hardware. The system allows users to control the flight path of the drone through their hand movements: the movements are captured via webcam, analysed using the ml5.js PoseNet implementation, and mapped to the drone's flight controls. The system was opened to members of the public during the 2019 Trinity College Dublin Open Day.
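
The interface itself runs in JavaScript; purely to illustrate the mapping idea, here is a minimal Python sketch, assuming normalised wrist coordinates (0 to 1, origin at the top-left of the frame) arriving from a pose estimator. The function name, dead-zone value, and command format are all illustrative assumptions, not the project's actual mapping:

```python
def wrist_to_flight_command(x, y, dead_zone=0.15):
    """Map a normalised wrist position (0..1) to flight commands.
    Illustrative only: the real interface runs in JavaScript (ml5.js
    PoseNet) and its exact mapping differs."""
    dx = x - 0.5   # left/right offset from the frame centre
    dy = 0.5 - y   # invert y so raising the hand is positive

    def scale(v):
        # Ignore small jitters near the centre, then clamp to [-1, 1]
        if abs(v) < dead_zone:
            return 0.0
        return max(-1.0, min(1.0, 2.0 * v))

    return {'roll': scale(dx), 'throttle': scale(dy)}

# Example: hand raised and to the right of the frame centre
print(wrist_to_flight_command(0.75, 0.25))  # {'roll': 0.5, 'throttle': 0.5}
```

The dead zone keeps the drone steady when the hand hovers near the centre of the frame, so that small pose-estimation jitter does not translate into drift.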


Phase 3: Gestural Control of Sound Synthesis

Phase 3 of this project is ongoing. It aims to integrate the work carried out in the first two phases: I have adapted the gestural control system designed during phase 2 to control the parameters of different sound synthesis routines. While phase 1 was exploratory in nature and phase 2 centred on developing a workable application, in phase 3 I have adopted a standard iterative development style. This return to a structured HCI research and development model is resulting in the production and refinement of a series of prototypes for the gestural control of sound synthesis parameters.
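
As with phase 2, the prototypes themselves run in the browser; the following is a minimal Python sketch of the kind of gesture-to-parameter mapping involved, assuming normalised hand coordinates. The frequency range and the exponential pitch mapping are illustrative assumptions rather than the prototypes' actual parameter set:

```python
def hand_to_synth_params(x, y, f_min=110.0, f_max=880.0):
    """Map a normalised hand position (0..1) to synthesis parameters.
    Illustrative ranges; the real prototypes run in the browser."""
    # Exponential mapping from horizontal position to frequency (Hz),
    # so equal hand movements produce equal pitch intervals
    freq = f_min * (f_max / f_min) ** x
    # Vertical position controls amplitude; top of the frame is loudest
    amp = 1.0 - y
    return {'frequency_hz': freq, 'amplitude': amp}

print(hand_to_synth_params(0.5, 0.25))
# ≈ {'frequency_hz': 311.1, 'amplitude': 0.75}
```

An exponential rather than linear frequency mapping is a common choice here, because pitch perception is roughly logarithmic in frequency.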

You can experience a stable prototype here. To use the prototypes, stand in front of your webcam and move your hands to control the sonic and visual parameters.

Creative Skills

HCI Design. Interaction Design. UX Design. Visual Design. Interface Design. Sound Design.

Technical & Research Skills

HTML/CSS/JavaScript. Python. Computer Vision. Machine Learning. Keras. Music21. MIDI. Sound & Music Computing. Serial. Data Analysis. Audio Engineering. Audio DSP. Sound Synthesis. User Evaluation.

Tags

Human-computer Interaction. Gestural Interfaces. Machine Learning. Embodied Cognition. Stephen Roddy.