Magyar beszéd

ProfiVox HMM

The ProfiVox-HMM TTS converter is based on statistical machine learning: it uses hidden Markov models (HMMs) to generate the parameters that represent the speech signal to be synthesized. Advances in computer technology made it possible to realise this idea. No deep phonetic or linguistic knowledge is required. Speech melody and rhythm are also learned, so no signal post-processing is required. The synthesized waveform is produced at the output of a vocoder (speech coder).

The basis of learning is a large speech database (many hours of speech) recorded from several speakers. The algorithm determines, step by step, the parameters for the middle speech sound of a quinphone (five-sound) sequence. It takes into account the position of the examined element at word and sentence level, and also uses word boundaries and word length during learning. The result of learning is an optimal parameter database that is much smaller than the original speech database. Training needs to be done only once, but HMM-based training is a time-consuming and knowledge-intensive process.

During synthesis, ProfiVox-HMM selects data from the parameter database based on the input text. The system pronounces both declarative sentences and questions correctly. Synthesis is fast and does not require many resources. The speech can be slowed down or sped up.

An advantage of this method is easy adaptation: a parameter database can be created from the voice of another person, and only 10-20 minutes of newly recorded speech are enough for an adaptation. More details can be found in the summary of the PhD dissertation of Bálint Pál Tóth, who developed the system.
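The context description above (a five-sound window around the middle sound, plus position and word-length information) can be illustrated with a small Python sketch. This is not the ProfiVox-HMM implementation: the names `PhoneContext`, `quinphone_contexts` and the padding symbol `PAD`, as well as the phone symbols in the usage example, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

PAD = "sil"  # hypothetical padding symbol used at sentence edges


@dataclass(frozen=True)
class PhoneContext:
    """One training/synthesis unit: the middle sound plus its context."""
    left2: str
    left1: str
    center: str
    right1: str
    right2: str
    pos_in_word: int           # 0-based index of the sound inside its word
    word_len: int              # number of sounds in the word
    word_pos_in_sentence: int  # 0-based index of the word in the sentence
    sentence_len_words: int    # number of words in the sentence


def quinphone_contexts(words: List[List[str]]) -> List[PhoneContext]:
    """Build a quinphone context for every sound of a sentence.

    `words` is a sentence given as a list of words, each word being a
    list of sound symbols. Word boundaries are therefore known.
    """
    # Flatten the sentence while remembering where each sound came from.
    flat = [
        (phone, word_idx, phone_idx, len(word))
        for word_idx, word in enumerate(words)
        for phone_idx, phone in enumerate(word)
    ]
    # Pad both ends so that every sound has two neighbours on each side.
    phones = [PAD, PAD] + [item[0] for item in flat] + [PAD, PAD]

    contexts = []
    for i, (_, word_idx, phone_idx, word_len) in enumerate(flat):
        j = i + 2  # index of the middle sound in the padded sequence
        contexts.append(
            PhoneContext(
                left2=phones[j - 2],
                left1=phones[j - 1],
                center=phones[j],
                right1=phones[j + 1],
                right2=phones[j + 2],
                pos_in_word=phone_idx,
                word_len=word_len,
                word_pos_in_sentence=word_idx,
                sentence_len_words=len(words),
            )
        )
    return contexts


# Usage: a two-word sentence with illustrative (hypothetical) sound symbols.
sentence = [["j", "o:"], ["n", "A", "p", "o", "t"]]
contexts = quinphone_contexts(sentence)
for ctx in contexts:
    print(ctx)
```

In a statistical system of this kind, records like these would serve as the keys under which acoustic parameters are learned and later looked up. The actual feature set used by ProfiVox-HMM is not given in this text.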

ProfiVox HMM voices

Listen to some synthesized voices in Hungarian

  • Mátyás
  • Tamás
  • Géza
  • Gábor
  • Kati
  • Eszter

Copyright 2022. Olaszy Gábor and Abari Kálmán
Last update: 01. 09. 2022