MaxMBROLA Project
An MBROLA-based real-time voice synthesizer for Max/MSP

The MaxMBROLA Project

Speech and singing both result from the same production system: the voice organ. However, the signal processing techniques developed for their synthesis have evolved quite differently. One of the main reasons for this divergence is that the voice serves a different purpose in each case. The aim of speech production is to exchange messages, whereas in singing the main aim is to use the voice organ as a musical instrument. A singing synthesis system therefore needs tools to control (analyze, synthesize or modify) different dynamics of the acoustic sound produced: the duration of the phonemes, vibrato, wide-range modifications of voice quality, pitch, intensity, etc., some of which are not needed in most speech synthesis systems. A more pragmatic reason for the separation is that singing voice synthesizers target almost exclusively musical performances. In this case, "playability" (flexibility and real-time abilities) is much more important than intelligibility and naturalness. Discussions of various issues in singing synthesis can be found in [1] [2].

Our aim is to develop a flexible real-time application based on the MBROLA speech synthesizer [3], allowing performers to produce complex and versatile singing, as well as speech, in many languages. We therefore start from a speech synthesizer and adapt it to the constraints of real-time singing. We chose this approach because of the high-quality synthesis that MBROLA provides.
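To make these control requirements concrete, the following minimal Python sketch shows how a single sung note with vibrato could be written as one line of MBROLA's phoneme-file (.pho) input, where each line gives a phoneme name, a duration in milliseconds, and optional pairs of position (percent of the duration) and pitch (Hz). The function names, the vibrato parameters and the phoneme symbol "a" are illustrative assumptions, not part of MaxMBROLA; valid phoneme symbols depend on the voice database in use.

```python
import math


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to a frequency in Hz (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def sung_phoneme(phoneme: str, duration_ms: int, midi_note: int,
                 vibrato_rate_hz: float = 5.5,
                 vibrato_depth_semitones: float = 0.5,
                 points: int = 11) -> str:
    """Build one .pho-style line: name, duration, then (position %, pitch Hz) pairs.

    The pitch contour is a sinusoidal vibrato around the target note.
    All default values are illustrative assumptions.
    """
    if points < 2:
        raise ValueError("at least two pitch points are needed")
    base_hz = midi_to_hz(midi_note)
    fields = [phoneme, str(int(duration_ms))]
    for i in range(points):
        position = i / (points - 1)              # 0.0 .. 1.0 within the phoneme
        t = position * duration_ms / 1000.0      # time in seconds
        offset = vibrato_depth_semitones * math.sin(2.0 * math.pi * vibrato_rate_hz * t)
        pitch_hz = base_hz * 2.0 ** (offset / 12.0)
        fields += [str(round(position * 100)), str(round(pitch_hz))]
    return " ".join(fields)


if __name__ == "__main__":
    # One second of the (assumed) phoneme "a" sung on A4 with the default vibrato.
    print(sung_phoneme("a", 1000, 69))
```

Lengthening the note, changing its pitch or deepening the vibrato only changes the numbers on that line, which is the kind of per-phoneme control a singing-oriented front end has to supply in real time.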


Fig 1. Messages and values supported by the MaxMBROLA external object.

The main topics of this research project are:

  • The development of a flexible external object for Max/MSP (4.5) that encapsulates the main features of the MBROLA speech synthesizer, and the adaptation of the MBROLA functions to the asynchronous, request-based architecture of the Max/MSP environment.
  • Discussion and Max/MSP development work on real-time control issues in the generation of phonetic and prosodic content. This topic is a good first trial for the broader issues of real-time manipulation of concatenation-based signals.
  • Proposals for various real-time concatenation-based applications (standalone applications, virtual instruments or Max/MSP patches) allowing performers to produce versatile voice with standard musical devices.
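The adaptation to a request-based architecture named in the first item can be pictured with a small buffering sketch. Assume the synthesizer delivers audio in chunks of varying length, while the host asks for a fixed number of samples at each call. The Python class below only illustrates that idea under those assumptions; `PullAdapter` and `request` are hypothetical names, and this is not the code of the MaxMBROLA external object.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


class PullAdapter:
    """Serve fixed-size sample blocks from a source that yields variable-size chunks."""

    def __init__(self, chunks: Iterable[List[float]]) -> None:
        self._source: Iterator[List[float]] = iter(chunks)
        self._pending: Deque[float] = deque()

    def request(self, n: int) -> List[float]:
        """Return exactly n samples, padding with silence once the source runs dry."""
        # Pull chunks from the source until enough samples are buffered.
        while len(self._pending) < n:
            chunk = next(self._source, None)
            if chunk is None:          # source exhausted
                break
            self._pending.extend(chunk)
        available = min(n, len(self._pending))
        block = [self._pending.popleft() for _ in range(available)]
        # Fill the remainder with zeros so the host always gets a full block.
        block.extend([0.0] * (n - len(block)))
        return block


if __name__ == "__main__":
    adapter = PullAdapter([[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]])
    print(adapter.request(4))
    print(adapter.request(4))
```

Padding with silence keeps every request answerable immediately, which matters more in a live setting than waiting for the next chunk; a real external would also have to deal with threading and latency, which this sketch ignores.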

References

[1] X. Rodet and G. Bennett, "Synthesis of the Singing Voice," Current Directions in Computer Music Research, ed. M. V. Mathews and J. R. Pierce, MIT Press, 1989.

[2] X. Rodet, "Synthesis and Processing of the Singing Voice," Proceedings of the First IEEE Benelux Workshop on Model-Based Processing and Coding of Audio (MPCA-2002), Leuven, Belgium, 2002.

[3] T. Dutoit and H. Leich, "MBR-PSOLA: Text-to-Speech Synthesis Based on an MBE Resynthesis of the Segments Database," Speech Communication, vol. 13, pp. 435-440, 1993.

MaxMBROLA Project - Nicolas D'Alessandro, Raphaël Sebbe, Baris Bozkurt & Thierry Dutoit
Laboratoire de Théorie des Circuits et Traitement du Signal - FPMs
Last update: 27/06/2005