Wednesday, February 6, 2008

Cyber Composer: Hand Gesture-Driven Intelligent Music Composition and Generation

This paper introduces a novel way of generating music from the user's hand motions and gestures when real instruments are not available. To stay in sync with existing musical theory, the system embeds those theoretical rules directly and exposes them through motion-sensing glove movements. The paper provides a nice overview of the relevant musical theory and explains how it has been incorporated into the system.

The system consists of a CyberGlove on each hand, with a 3D Polhemus sensor attached to each glove to track 3D position, a background music generation module, a melody generation module, and the main program. Keeping in mind the industry standard for music, the authors used the MIDI format, which, according to them, allows multiple instrument sounds and expressions. The availability of APIs to control MIDI input and output was a further encouragement to use the format.

Sensory information from the gloves is sent to the Cyber Composer main program, which transforms the signals into motion triggers that serve as inputs to the various modules of the system. These triggers fire the corresponding events in the modules, and the resulting musical events are routed through the music interface to a standard MIDI output device (a Roland sound module in this case), which generates the sound.
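As a rough illustration of this signal-to-sound pipeline, here is a minimal sketch in Python. All names here (`MotionTrigger`, `MelodyModule`, the stubbed MIDI output) are hypothetical stand-ins for components the paper only describes at a high level.

```python
from dataclasses import dataclass

@dataclass
class MotionTrigger:
    """A glove signal reduced to a discrete event (hypothetical schema)."""
    kind: str     # e.g. "wrist_flick", "hand_raise", "finger_flex"
    value: float  # normalized magnitude of the motion, 0.0-1.0

class MelodyModule:
    """Stand-in for the paper's melody generation module."""
    def handle(self, trig: MotionTrigger):
        if trig.kind == "hand_raise":
            # Map hand height to a MIDI note number (assumed mapping).
            note = 60 + int(trig.value * 24)
            return ("note_on", note, 90)
        return None

def midi_out(event):
    """Stub for the music interface; the real system sends the event to a
    MIDI device (a Roland sound module in the paper)."""
    print("MIDI event:", event)

def main_program(raw_signals):
    melody = MelodyModule()
    for kind, value in raw_signals:
        trig = MotionTrigger(kind, value)  # glove signal -> motion trigger
        event = melody.handle(trig)        # trigger -> module event
        if event:
            midi_out(event)                # event -> MIDI output

main_program([("hand_raise", 0.25), ("wrist_flick", 0.8), ("hand_raise", 0.5)])
```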

The authors have tried to keep the hand movements intuitive enough to suit a lay person as well as an expert musician. To make the system smooth to use, each musical component was mapped to a particular motion with frequency and flexibility of use in mind (a sketch of such a mapping follows the list). For example:

  • Rhythm is mapped to left-or-right wrist motion.
  • Pitch is mapped to the right hand's height above the ground relative to its height at the last note.
  • Dynamics of the melody note are controlled by the flexion of the right-hand fingers.
  • Volume is controlled by the extent of extension of the right fingers.
  • Cadence (the ending of the music) is triggered by bending the left-hand fingers completely.
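To make these mappings concrete, the following is a minimal sketch of how raw glove readings could be turned into the listed parameters. The field names and all thresholds are assumptions for illustration; the paper does not specify them.

```python
def map_gestures(reading, last_note_height):
    """Hypothetical mapping of one glove reading to musical parameters."""
    params = {}

    # Rhythm: a left/right wrist motion beyond a small dead zone counts as a beat.
    if abs(reading["wrist_dx"]) > 0.1:
        params["rhythm_trigger"] = True

    # Pitch: right-hand height relative to its height at the last note.
    delta = reading["right_hand_height"] - last_note_height
    params["pitch_step"] = round(delta * 12)  # semitones per unit height (assumed)

    # Dynamics and volume: flexion vs. extension of the right-hand fingers.
    params["dynamics"] = reading["right_finger_flexion"]             # 0.0-1.0
    params["volume"] = int(reading["right_finger_extension"] * 127)  # MIDI range

    # Cadence: fully bent left-hand fingers end the piece.
    params["cadence"] = reading["left_finger_flexion"] > 0.95
    return params

reading = {"wrist_dx": 0.3, "right_hand_height": 1.4,
           "right_finger_flexion": 0.6, "right_finger_extension": 0.4,
           "left_finger_flexion": 0.1}
print(map_gestures(reading, last_note_height=1.2))
```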

To add another instrument, the left hand is lifted higher than the right hand; this enables the dual-instrument mode, and the second instrument starts playing in harmony with the first.
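A sketch of how this mode switch might be checked on each reading, again with assumed field names and an assumed harmony interval:

```python
def update_mode(reading, melody_note):
    """Enable dual-instrument mode when the left hand rises above the right
    (the condition described in the paper; everything else is assumed)."""
    if reading["left_hand_height"] > reading["right_hand_height"]:
        # The second instrument harmonizes with the melody; a major third
        # (4 semitones) is an illustrative choice, not the paper's rule.
        return ("dual", melody_note + 4)
    return ("solo", None)

print(update_mode({"left_hand_height": 1.6, "right_hand_height": 1.2}, melody_note=64))
```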

Discussion:

The paper started with many expectations but later turned out to be a bit of a disappointment. There is no mention of implementation details. As described, signals from the motions are used to trigger events, and any motion of the part of the hand corresponding to a musical component will fire an event; even a slight, unintentional wrist movement will trigger one. There is no dedicated gesture associated with each component, just raw motion of the hand parts, which is not a good approach. Since no training is involved, I don't understand how the motions of different users will correspond to the same rhythm and harmony.

The gesture for pitch is also confusing: I cannot see how a musician would keep track of the hand's position relative to the last note, and even if they could, how they would control it in dual mode when both hands are occupied. Besides, I don't know whether this can be useful to musicians, who often use many instruments to compose, and I see no way to add more than two instruments. Some compositions also add side music that is not in exact harmony with the main notes, and I see no way to incorporate that effect either. There are enough flaws in the approach to fill another paper, so I will stop here. I believe the approach could be refined by using machine learning models and by choosing better, simpler gestures that are not conflicting or confusing.
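As one illustration of the kind of refinement I have in mind, a simple hysteresis filter could suppress unintentional micro-movements before they reach the trigger stage. This is my own sketch, not anything from the paper, and the thresholds are assumptions.

```python
class HysteresisTrigger:
    """Fire a trigger only when motion exceeds a high threshold, and re-arm
    only after it drops below a low threshold. This filters out the small,
    unintentional wrist movements the raw-motion design would misfire on."""
    def __init__(self, fire_at=0.5, rearm_at=0.2):
        self.fire_at = fire_at
        self.rearm_at = rearm_at
        self.armed = True

    def update(self, magnitude):
        if self.armed and magnitude > self.fire_at:
            self.armed = False
            return True          # deliberate gesture: fire the event
        if not self.armed and magnitude < self.rearm_at:
            self.armed = True    # motion has settled: ready for the next gesture
        return False             # ignore jitter and follow-through

trig = HysteresisTrigger()
for m in [0.05, 0.6, 0.55, 0.1, 0.7]:
    print(m, trig.update(m))     # only 0.6 and 0.7 fire
```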
