ISCA Eastern Europe Sub-Committee Report

Sub-Committee on Eastern Europe

ISCA International Affairs Committee

Report 2008

I. Introduction

The ISCA (International Speech Communication Association) International Affairs Committee has established the Eastern Europe regional Sub-Committee to promote ISCA and, more generally, research and development in speech communication in the region. The Sub-Committee was established in 2006 and includes the following members:

▪ Prof. Rodmonga Potapova (Russia, Moscow)

▪ Dr. Andrey Ronzhin (Russia, St.-Petersburg)

▪ Prof. Taras Vintsiuk (Ukraine)

▪ Prof. Boris Lobanov (Belarus)

▪ Dr. Catalin Grigoras (Romania)

▪ Prof. Edward Shpilewski (Poland)

▪ Prof. Jozef Juhar (Slovakia)

▪ Prof. Slobodan Jovicic (Serbia)

▪ Prof. Dimitar Popov (Bulgaria)

II. Research Activities in the Region

To date, local reports have been received from Moscow (Prof. R. Potapova), St.-Petersburg (Dr. A. Ronzhin), Slovakia (Prof. J. Juhar), and Belarus (Prof. B. Lobanov). The other members of the Sub-Committee have not yet submitted their reports due to lack of time; information on scientific institutions and activities in their regions is still being collected.

(A) Moscow Area

Here is some preliminary information regarding the current state of speech science in Moscow.

National research grant resources/agencies

• RFFI (Rossijsky fond fundamental’nyh issledovanij) – In English: RFBR, Russian Fund for Basic Research. RFFI supports basic research.

• Ministry of Education and Science supports research and development through specialized research programs.

Moscow State Linguistic University

Moscow State Linguistic University (Ostozhenka 38, Moscow, 119992) cooperates with scientific institutions in other countries, including the University of Halle (Germany) and the University of Patras (Greece). Its areas of interest comprise linguistics, phonetics, speech recognition, speech synthesis, natural language processing, speech corpora, spoken language dialogue systems, and forensic phonetics.

The Speech research group of Moscow State Linguistic University (Department of Applied Linguistics, Chair of Applied and Experimental Linguistics, Centre of Fundamental and Applied Speechology ) includes 10 members (Potapova, Mikhailov, Zhenilo, Sobakin, Khitina, Bobrov, Isakova, Loseva, Nikolaeva, Statsenko, Dorofeev, Arkhipov). The leader of this group is Prof. Rodmonga Potapova.

International Conferences/Workshops organized by Moscow State Linguistic University

|Event |Section titles |

|SPECOM’2003 |Linguistic data corpora of human-computer dialogues |

|Moscow, October 27-29, 2003 |Speech signal processing and text-to-speech systems |

| |Speech recognition and understanding systems |

| |Forensic phonetics and speaker identification technologies |

| |Speech production and speech perception |

| |Computer-mediated language learning |

|SPECOM’2005 (in cooperation with the University of Patras, Greece) |Speech Recognition |

|Patras, October 17-19, 2005 |Speech Production and Perception |

| |Speech Analysis and Processing – Coding |

| |Speech Synthesis |

| |Speaker Recognition |

| |Spoken Dialog Systems |

| |Multimodal (e.g. Audio-Visual) Systems |

| |Natural Language Processing |

| |Dialog Systems |

| |Speech and Language Resources |

| |Applications (includes application aspects of all of the above plus machine translation) |

The international SPECOM (Speech and Computer) conferences are held annually and are organized alternately by the St.-Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (even years) and Moscow State Linguistic University (odd years).

Speech studies (mainly connected with speech acoustics, speech production and computer analysis of speech) are also presented at the annual sessions of the Russian Acoustical Society. The XVIII Session of the Russian Acoustical Society (N.N. Andreyev Acoustics Institute of the Russian Academy of Sciences) will be held in September 2006 in Taganrog. The previous session of the Society (2005) took place in Nizhny Novgorod. At these sessions, speech studies are presented in a special section titled “Speech Acoustics”.

Research projects conducted by Moscow State Linguistic University

Projects conducted by Moscow State Linguistic University are supported by the Ministry of Education and Science of Russia and the Russian Fund for Basic Research.

Here is an example abstract of a research project supported by the Russian Fund for Basic Research in 2005.

Project title: “Fundamental research of the phenomenon of the segmentation of spoken language on the basis of cognitive reflection and communication working tools”

Abstract: In the framework of the research performed, the following aspects were analyzed: modern concepts of interpersonal communication; approaches to its research; basic communicative items; strategies, tactics and methods used in the processes of speech coding and decoding concerning dialogues (polylogues). Special attention was paid to the problems of semantic and quasisemantic macrosegmentation of spoken language (according to Potapova R.K.).

Special features of communication were examined with regard to processes of speech segmentation, including colloquial speech, as well as different segmentation methods based on approaches to analyzing the organization of dialogue and its parts and items. The approach underlying this research takes into account communicative, cognitive, sociolinguistic and culturological components.

Interpersonal communication seems to be a complex communicative phenomenon including verbal, para- and extraverbal channels that interact differently in different communication situations.

The principal hypothesis is that the addressee's final interpretation of the whole complex of verbal, para- and extraverbal information is a process of decoding that depends on the linguistic and communicative behavior models, while speech understanding is an active process in which the receiver uses different sources of information [Potapova R.K.].

The analysis of different concepts of verbal dialogue segmentation required examining the following aspects: structural organization of the dialogue (extraction of its basic structural and semantic phases); segmentation into conversational remarks (analysis of the communicants’ communicative behavior, including the handing over of initiative in the dialogue); topical organization of the dialogue (considering its characteristic features in texts of colloquial spoken language, namely multitopicality and, in this connection, the special means that provide dialogue integrity); definition of the macrosegmentation items of spoken dialogues compared with those of spoken monologues.

A “segment” (a fragment of speech between two perceptually marked boundaries) is taken as the basic operational item of the macrosegmentation analysis. In addition, the linguistic markers used for different segmentation types were examined (system of pauses, discourse words, particles, etc.).

In the course of the research, an experimental corpus of authentic sound recordings was compiled, with Russian as the basic language and with additional material in English and German. The corpus data was subsequently digitized. Methods of perceptual-acoustic and perceptual-visual analysis were elaborated (regarding verbal, para- and extraverbal channels of communication).

(B) St. Petersburg Area

About St. Petersburg

St. Petersburg is located in the north-west of Russia and is easily reached from Europe. The city straddles the mouth of the River Neva at the easternmost spur of the Baltic. Every summer, a unique and wonderful phenomenon known as the White Nights occurs in St. Petersburg. It is the world's only metropolis where such a phenomenon takes place. On these days downtown St. Petersburg is full of people, even at night. When the bridges over the Neva are drawn, St. Petersburg sheds its everyday appearance and its authentic image emerges. During a night excursion, the history of each architectural ensemble or monument seems especially significant.

A few words about research

Despite the underfunding of Russian science at the end of the last century, scientific organisations managed to preserve their research potential and practically did not slow their pace of development. The most significant results were achieved by scientific groups that were able to acquire or develop large Russian vocabularies and speech corpora. Among the St. Petersburg scientific groups that deal with speech processing, the following should be noted: Saint-Petersburg State University, Speech Technology Center, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Saint Petersburg State University of Aerospace Instrumentation, Saint Petersburg Electrotechnical University, AudiTech, IstraSoft, etc. Moreover, a Special Interest Group on Formal Methods in the Analysis of Russian Speech is being created under ISCA. Success in developing speech technologies depends largely on the study of language specifics, so research should be carried out simultaneously in several neighboring scientific areas: linguistics, phonetics, digital signal processing, information theory, computer science, etc.

St. Petersburg research grant resources/agencies

Committee on Science and Higher Education of the Administration of St. Petersburg

Saint-Petersburg Scientific Center of Russian Academy of Sciences

Foundation for support of education and science „Alferovskiy Fond“

The main research groups/institutions

|Institution |Domain of Interest |

|St. Petersburg Institute for Informatics and Automation of the |speech recognition; speaker verification; biomonitoring by voice; |

|Russian Academy of Sciences, Russia, 199178, St. Petersburg, 14 Line, 39 |ontology-based natural language description; multimodal interfaces. |

| | |

|Institute of Philological Research of Saint-Petersburg State |phonetics, speech synthesis; speech production and perception; |

|University, 199034 11 Universitetskaya emb., St. Petersburg, Russia |speech recognition. |

| | |

|Saint Petersburg State University of Aerospace Instrumentation, |speech compression; digital speech processing. |

|190000, Saint-Petersburg, Bolshaya Morskaya street, 67 | |

| | |

|Saint Petersburg Electrotechnical University, Professor Popov str. |human-computer interaction; speech processing. |

|5, St. Petersburg, 197376, Russia | |

| | |

|Speech Technology Center, 4 Krasutskogo St. St. Petersburg, 196084, |noise cancellation and speech enhancement; voice recording; speech |

|Russia |analysis and speaker identification; speech documentation; speech |

| |technologies for developers. |

|AudiTech, Ltd |automatic speech recognition; speaker identification/verification; |

| |speech compression; speech & lexical databases. |

|IstraSoft Ltd. |speech recognition; speech compression. |

Research projects granted by National Program for R&D


Russian Foundation for Basic Research Project # 07-07-00073-a “Investigation of multimodal interaction by an information kiosk”, 2007-2009.

A test bed model of a multimodal system for computer control will be developed, and an applied model of an information inquiry service will be implemented on it. Cognitive aspects of user interaction across different modes of communication will be investigated during trial operation of the information inquiry service in Wizard of Oz mode. The accumulated material will be used to study the cognitive and behavioral characteristics of users and to further optimize the multimodal interface.

Russian Foundation for Basic Research Project # 08-08-00128-a “Modeling non-phonemic speech elements and creation of alternative transcriptions for spontaneous Russian speech recognition”, 2008-2010.

The project has two main tasks: (1) modeling non-phonemic speech elements for improvement of quality of ASR; (2) creation of alternative transcriptions for spontaneous Russian speech recognition.

Russian Foundation for Basic Research Project # 08-07-90002-Bel_a „Model of Audio-Visual Speech Synthesis and Recognition for Intellectual Queuing Devices“, 2008-2009.

This bilateral research project with the United Institute of Informatics Problems (Belarus) is aimed at the research and development of models for audio-visual speech synthesis and recognition of the Russian language.

OITVS RAS Project “Investigations of principles of speech dialogue management in infotelecommunication applications”, 2007 - 2011

Development of a methodology and models for an intellectual speech interface based on the integrated use of diverse linguistic and extra-linguistic knowledge. Analysis of the state of the art in speech human-computer interfaces for infotelecommunications and selection of methods for achieving the main objectives of the project. The speech interface will be developed on the basis of an integrated concept of automatic speech understanding, using modern multimedia technologies. Intended uses of the results: information retrieval systems of diverse kinds with voice input-output, systems of electronic commerce, systems for interaction with the Internet, systems for spoken machine translation, voice control for technical equipment, training systems, etc.

Grant of the Committee on Science and Higher Education of the Administration of St. Petersburg # 30-04/132 “Development of a model for distant recording of speech of a user of the intellectual kiosk”, 2008

The aim of the project is to improve the quality of recording of a user's speech during interaction with the intellectual multimodal kiosk by applying methods for passive localization of sound sources and subsequent filtering of the useful speech signal arriving from the working zone in front of the kiosk.

Project of the Committee on Science and Higher Education of the Administration of St. Petersburg # 30-04/131 “Development of a bi-modal system for audio-visual recognition of continuous Russian speech”, 2008.

The aim of the project is the development of mathematical models and algorithms for a system of audio-visual recognition of Russian speech that simultaneously uses both a state-of-the-art technology for processing acoustic speech signals and a computer vision technology for automatic lipreading.

Participation in EU-funded research projects


• INTAS Ref. No 05-1000007-426, Introduction of the automatic Russian speech recognition system SIRIUS in telecommunication, 2006-2008.

Local conferences/workshops/seminars

XXXVII International Philological Conference, St. Petersburg State University 11-15 March 2008.

2nd interdisciplinary workshop „Conversational Russian Speech Analysis“, St. Petersburg, Russia, 27-28 August 2007.

(C) Belarus Area

About Belarus

The Republic of Belarus is situated in the centre of Europe. The shortest transport routes connecting the CIS countries with the states of Western Europe pass through its territory. Belarus shares borders with Poland, the Baltic States, Russia and Ukraine. The territory of Belarus is 207,000 sq. km; the population is about 10 million, 70% of whom live in cities. The population of Minsk, the capital of Belarus, is about one fifth of the country's population. Administratively, Belarus consists of six regions. The state languages are Belarusian and Russian. The most popular languages of business communication are Russian, English and German.

Belarus is a country with highly developed science and advanced education. The adult literacy rate in Belarus is 99.7%, and the country has a high primary, secondary and tertiary gross enrolment ratio (the highest in the CIS countries according to the UNESCO Institute for Statistics). The ratio of university students (303 per 10,000 people) is at the level of leading European countries, including Germany, the Netherlands, Sweden, and Finland.

The National Academy of Sciences of Belarus (NAS of Belarus, or NASB) is the highest state scientific organization of the Republic of Belarus. Established on January 1, 1929, the NASB is the leading research center of Belarus, uniting highly skilled scientists of different specialties and dozens of scientific research organizations. As of January 1, 2006, the staff of the Academy included some 15,310 researchers, technicians and supporting personnel. Among them are about 560 Doctors of Sciences (equivalent to Prof.) and some 1,970 Candidates of Sciences (equivalent to Ph.D.).

The main research and development institutions

|Institution |Division (group) |Key person |Domain of Interest |

|United Institute of Informatics Problems|Speech Synthesis and | |High quality text-to-speech (TTS) |

|of the National Academy of Sciences of |Recognition Laboratory |Prof. Boris Lobanov |synthesis; |

|Belarus (UIIP NAS of Belarus) |(SSRLab) | |Personal voice and speaking manner cloning |

| | | |by TTS; |

| | | |Multi-language TTS; |

| | | |Isolated and continuous speech recognition;|

| | | |Computer telephony applications; |

| | | |Speech-based system for the blind. |

|Belarusian State University of |Computer Science department | |Speech and audio coding; |

|Informatics and Radioelectronics | |Prof. Alexander Petrovsky |Noise reduction; |

| | | |Acoustic echo cancellation; |

| | | |Robust speech recognition; |

| | | |Real-time signal processing. |

|Minsk State Linguistic University |English phonetics department | |Experimental phonetics; |

| | |Prof. Elena Karnevskaya |Prosody modeling; |

| | | |TTS synthesis. |

|Sakrament Ltd | |Valery Egorov |Speech synthesis; |

| | |office@ |Speech recognition; |

| | | |Voice identification; |

| | | |Audio indexing. |

|Kvintel Ltd | | |Speech technologies; |

| | |Sergey Nikiforov |Speech recognition and scrambling in |

| | | |communication channels; |

| | | |Bank automated information systems |

| | | |Systems of audioinformation collection, |

| | | |control and storage; |

| | | |Computer-aided systems for tariffing of |

| | | |phone calls. |

|Speech Technology Center |Minsk branch |Vitali Kiselev |Speech Recognition; |

| | | |Speech Synthesis of Russian; |

| | |kiselev-v@ |Keyword Spotting in Speech; |

| | | |Speaker Identification. |

Research projects granted by National Program for R&D

Belarusian Foundation for Fundamental Research Project # F08P-016 „A Model of Audio-Visual Speech Synthesis and Recognition for Intellectual Queuing Devices“, 2008-2010.

This bilateral research project with the Saint-Petersburg Institute for Informatics and Automation of RAS (Russia) is aimed at the research and development of models for audio-visual speech synthesis and recognition of the Russian language.

Local conferences/workshops/seminars

Scientific Readings in memory of Professor V.A. Karpov, Belarusian State University, March 17-18, 2008, Minsk.

The Fifth International Conference on Neural Networks and Artificial Intelligence, Belarusian State University of Informatics and Radioelectronics, May 27-30, 2008, Minsk.

Participation in International Conferences and Editorial Boards of Speech Technologies Journals

1. Prof. Boris Lobanov and Prof. Alexander Petrovsky are members of the editorial board of the newly founded journal “Speech Technology” (Moscow, Russia).

2. Scientific Readings in memory of Professor V.A. Karpov, Belarusian State University, March 17-18, 2008, Minsk

a. B. Lobanov, E. Karnevskaya, L. Tsirulnik, O. Jeliseeva, J. Hetsevitch. Function-Semantical Attributes of Particles as Indicators of Intonation Variants for TTS-synthesis

3. The Fifth International Conference on Neural Networks and Artificial Intelligence ICNNAI’2008, May 27-30, 2008, Minsk, Belarus.

a. A. Karpov, B. Lobanov, A. Ronzhin, L. Tsirulnik. Audio-Visual Russian Speech Recognition and Synthesis for a Multimodal Information Kiosk.

4. Annual International Conference Computational Linguistics and Intellectual Technologies Dialog’2008, June 4-8, 2008, Moscow, Russia.

a. L. Tsirulnik, B. Lobanov, O. Sizonov. An Algorithm of Intonational Tagging of Declarative Sentences for TTS-synthesis

b. B. Lobanov, L. Tsirulnik, O. Sizonov. “IntoClonator” – the Computer System for Personal Prosodic Characteristics Cloning

c. B. Lobanov. An Algorithm of Text Segmentation on Syntactic Syntagmas for TTS Synthesis.

5. International Conference on Audio, Language and Image Processing ICALIP’2008, July 7-9 2008, Shanghai, China.

a. Al. Petrovsky, X. Yu. Auditory Adaptive Frame-based WPD and Its Applications.

b. M. Parfieniuk, A. Petrovsky, W. Wan. Frequency Warping and Subband Merging for Approximating the Critical Bands with Cosine-modulated Filter Banks.

6. 16th European Signal Processing Conference EUSIPCO, August 25-29 2008, Lausanne, Switzerland.

a. A. Petrovsky. Estimation of the Instantaneous Harmonic Parameters of Speech.

7. 2nd interdisciplinary workshop „Conversational Russian Speech Analysis“, August 27-28 2008, St. Petersburg, Russia

a. B. Lobanov, L. Tsirulnik. An Automation of Analysis of Prosodic Characteristics for Experimental Research and TTS-synthesis

8. International Conference Speech Analysis, Synthesis and Recognition, Applications in Systems for Homeland Security SASR’2008, September 8-12, 2008, Piechowice, Poland.

a. B. Lobanov, L. Tsirulnik, A. Ronzhin, A. Karpov. A Model of Personalized Audio-Visual TTS-synthesis for Russian.

(D) Slovakia Area

About Slovakia

Slovakia (Slovensko in Slovak) is a landlocked republic in Central Europe. It borders the Czech Republic and Austria in the west, Poland in the north, Ukraine in the east and Hungary in the south. Slovakia is a member of the European Union and has a population of more than five million. The capital city is Bratislava.

The majority of the inhabitants of Slovakia are ethnically Slovak (86 %). Hungarians are the largest ethnic minority (10 %) and are concentrated in the southern and eastern regions of the country. Other ethnic groups include Roma, Czechs, Ruthenians, Ukrainians and Germans.

The official state language is Slovak, a member of the Slavic Language Family, but Hungarian is also widely spoken in the south of the country and enjoys a co-official status in some municipalities. Many people also speak Czech.

Slovakia is a member state of the European Union, NATO, OECD, WTO, and other international organizations. It joined the European Union in 2004 and will join the Eurozone on 1 January 2009.

About research

Research and development in „classical“ linguistics and phonetics has a long tradition in Slovakia. It has been concentrated predominantly in the Slovak Academy of Science (SAS), Comenius University in Bratislava and Safarik University in Kosice.

Research in computer linguistics, computer based speech processing and human-computer spoken language communication has been extended also to technically oriented universities and institutes of SAS (Technical University in Košice, Institute of Informatics SAS, Slovak Technical University in Bratislava, University of Žilina, ...).

There are no private institutes doing research and/or development in speech and language technology on a commercial basis.

The main research groups/institutions

|Institution |Domain of Interest |

|Technical University in Kosice |speech recognition, natural language processing, speech corpora, |

|Letna 9, Kosice 042 00 |spoken language dialogue systems |

|Department of Electronics and Multimedia Communications, | |

|Department of Cybernetics and Artificial Intelligence | |

|Slovak Technical University in Bratislava, Department of |speech recognition, speech synthesis |

|Telecommunications, Ilkovičova 3, Bratislava 812 19 | |

|University of Zilina |speech recognition, semantic analysis of speech and audio, speech |

|Univerzitna 1, 010 26 Zilina |synthesis |

|Department of Telecommunications, | |

|Department of Information Networks | |

|Institute of Informatics, Slovak Academy of Science, Dubravska cesta|speech analysis, speech synthesis |

|9, 845 07 Bratislava | |

|Ľudovít Stur Institute of Linguistics, Slovak Academy of Science, |linguistics, computer linguistics, corpora and resources |

|Panska 26, Bratislava 813 64 | |

|Comenius University, Institute of Applied Informatics, Mlynska |speech recognition, natural language processing |

|Dolina, Bratislava 842 48 | |

|University of Presov, Department of Linguistics and phonetics, Ul. |linguistics, phonetics |

|17. novembra 1, Presov 080 78 | |

National research grant resources/agencies

VEGA (VEdecká Grantová Agentúra) – In English: Research Grant Agency. The VEGA supports basic research.

APVV (Agentúra na Podporu Vedy a Výskumu) – In English: Agency for Support of Science and Research. The APVV supports basic research and applied research and development.

The Slovak Ministry of Education supports research and development through other specialized research programs, such as the National Program for R&D, the program for support of international bilateral collaboration, etc.

National research projects fully or partially devoted to speech and language

National research projects

Project title: The Slovak national corpus – 2nd Phase

Project type: National Program for R&D funded by Slovak Ministry of Education

Project duration: 2007-2011

Project leader: PhDr. Mária Šimková, Ludovit Stur Institute of Linguistics, Slovak Academy of Science

Project title: Speech technologies for advanced telecommunication and information services in Slovak language

Acronym: SPEETIS

Contract number: APVV-0369-08

Project duration: 2008-2010

Project leader: Juhár, Jozef, doc., Ing., PhD., Technical University in Košice

Abstract: The project focuses on speech technologies for advanced, voice-operated telecommunication and information systems and services, with potential impact of the research results on other areas such as automatic speech transcription, searching in speech and audio record databases, speech-to-speech translation, the semantic web, etc. The goals of the project are aimed strictly at the Slovak language and at research on robust, large vocabulary continuous speech recognition, concatenative and corpus speech synthesis, and advanced dialogue modeling and management for spoken dialogue systems. The originality and innovativeness of the proposed tasks lie in the technical (speech and text corpora), theoretical (design of new algorithms and procedures) and knowledge (linguistic, acoustic, phonetic, phonological, prosodic and psychoacoustic) prerequisites, which will further improve the naturalness and reliability of the man-machine speech interface in Slovak.

Project title: Robust speech technologies for information systems in Slovak and their diagnostics (Robustné rečové technológie pre informačné systémy v slovenčine a ich diagnostika)

Contract number: VEGA 2/0138/08

Project duration: 2008-2010

Project leader: Cerňak Miloš, Ing., PhD., Institute of Informatics, Slovak Academy of Science

Project title: Automatic analysis, recognition and transcription of audio recordings

Contract number: AV 4/2016/08

Project duration: 2008-2010

Project leader: Juhár, Jozef, doc., Ing., PhD., Technical University in Košice

Project title: Intelligent terminal - IQ Kiosk

Contract number: AV 4/0020/07

Project duration: 2007-2009

Project type: Applied research funded by Slovak Ministry of Education

Project leader: Rozinaj Gregor, doc., Ing., PhD., Slovak Technical University in Bratislava

Project title: Automated voice interactive telecommunication systems and applications

Contract number: AV 4/0006/07

Project duration: 2007-2009

Project type: Applied research funded by Slovak Ministry of Education

Project leader: Juhár, Jozef, doc., Ing., PhD., Technical University in Košice

Project title: Knowledge based system for cognitive robots

Project duration: 2007-2009

Contract number: VEGA 1/4060/07

Project leader: Dusan Guller, RNDr., PhD., Comenius University, Bratislava

Project title: Nonlinear processing of multimedia signals for telecommunication applications

Project duration: 2006-2008

Contract number: VEGA 1/3110/06

Project leader: Rozinaj Gregor, doc., Ing., PhD., Slovak Technical University in Bratislava

Project title: Evaluation of automated voice interactive telecommunication systems and applications

Project duration: 2007

Project type: bilateral cooperation with Slovak Telecom

Project leader: Juhár, Jozef, doc., Ing., CSc., Technical University of Košice

Project title: The Creating of Parallel Corpora (Slovak-Croatian and Slovak-Russian Corpus)

Project type: basic research funded by national scientific agency VEGA

Contract number: VEGA 2/5053/25

Project duration: 2005-2007

Project leader: Radovan Garabík, Ľudovít Stur Institute of Linguistics SAS in Bratislava

Project title: Mobile Multimodal Telecommunications Systems and Services


Contract number: APVT-20-029004

Project type: basic research funded by national scientific agency APVV

Project duration: 2005-2007

Project leader: Anton Čižmár, prof., Ing., PhD., Technical University of Košice

Project title: Modern speech processing technologies in Slovak utilising speech corpora

Project type: basic research funded by national scientific agency VEGA

Contract number: VEGA 2/5124/27

Project duration: 2005-2007

Project leader: Milan Rusko, Ing., Institute of Informatics, Slovak Academy of Science

Project title: The smart speech communication interface

Project type: National Program for R&D by Slovak Ministry of Education

Project duration: 2003-2006

Project leader: Juhár, Jozef, doc., Ing., PhD., Technical University in Košice

Abstract: The first Slovak spoken language dialogue system (SLDS) has been developed within the framework of the project. The SLDS enables multi-user interaction in the Slovak language. The dialogue system is based on the DARPA Communicator architecture. The proposed system consists of the Galaxy hub and telephony, automatic speech recognition, text-to-speech, backend, and VoiceXML dialogue management modules. The SpeechDat-compatible MobilDat-Sk speech database has been created for training acoustic models. The functionality of the SLDS is demonstrated and tested via two pilot applications, „Weather forecast for Slovakia“ and „Timetable of Slovak Railways“. The required information is retrieved from Internet resources in multi-user mode through PSTN, ISDN, GSM and VoIP networks.

Project title: The Slovak national corpus

Project type: National Program for R&D by Slovak Ministry of Education

Project duration: 2002-2005

Project leader: Mária Šimková, Ľudovít Stur Institute of Linguistics SAS in Bratislava

Abstract: The Slovak national corpus is a database of contemporary Slovak language texts, covering a broad range of language styles, with additional linguistic information and a powerful query system. The Corpus is offered to the public for research, educational, and other strictly non-commercial purposes. It will be expanded continuously.

Project title: Use of a speech communication interface in children's education in Slovak

Project duration: 2006

Project leader: Marek Nagy, RNDr., Comenius University, Bratislava

Project title: Advanced Data Driven Methods for Speech Processing

Project type: basic research funded by national scientific agency VEGA

Contract number: VEGA 2/2087/22

Project duration: 2002-2004

Project leader: Milan Rusko, Ing., Institute of Informatics, Slovak Academy of Science

Project title: Continuous Speech Recognition System Based on Hybrid Neural Networks

Project type: basic research funded by national scientific agency VEGA

Contract number: VEGA 1/3205/96

Project duration: 1996-1998

Project leader: Dušan Krokavec, prof., Ing., PhD., Technical University of Košice

Participation in international and/or EU-funded research projects

Project title: Intelligent Language Tutoring System (ILTS) with multimodal feedback functions


Project number: 135379-LLP-1-2007-1-DE-KA2-KA2MP

Project start: 01-11-2007

Project end: 31-10-2009

Local coordinator: Milan Rusko, Ing., Institute of Informatics, Slovak Academy of Sciences

Abstract: The main objective of the EURONOUNCE project is the development of a language tutoring system supporting the acquisition of the correct pronunciation of different target languages. The project focuses on the Slavonic target languages Russian, Polish, Slovak and Czech.

Project title: Mainstreaming on AMbient Intelligence

Acronym: MonAMI

Contract number: IP-035147

Project type: An Integrated Project under the European Commission’s FP6

Duration: 09/2006 – 08/2009

Local coordinator: Dušan Šimšík, prof., Ing., PhD., Technical University in Košice

Abstract: The overall objective of MonAMI is to mainstream accessibility in consumer goods and services, including public services, through applied research and development, using advanced technologies to ensure equal access, independent living and participation for all in the Information Society. The usability of speech technologies is tested within the project.

Project title: Intelligent Information System Supporting Observation, Searching and Detection for Security of Citizens in Urban Environment

Acronym: INDECT

Contract number: The project has been approved for financing

Project type: An Integrated Project under the European Commission’s FP7

Duration: 4 years

Local coordinator: Ľubomír Doboš, doc., Ing., PhD., Technical University in Košice

Abstract: The goal of the project is to create an intelligent information system that will provide a high level of security for the population in urban areas. The system will collect and analyze various kinds of information from a monitored area, detect unnatural behavior and situations, and inform the crew manning the system of occurring activities. The analysis of the environment will be performed, among others, by a network of sensors that will be able to detect changes in the location of objects, sudden changes in temperature, density of dangerous substances, etc. Among other things, the system will also be able to analyze audio information from microphones placed in the protected area.

Project title: Cross-Modal Analysis of Verbal and Non-verbal Communication

Project duration: 2006-2009

Project type: COST Action 2102

Local coordinator: Anton Čižmár, prof., Ing., PhD., Technical University of Košice

Project title: Grapheme- and Phoneme Frequency Occurrence in Slovak

Project duration: 2007-2008

Project type: bilateral cooperation with Karl-Franzens University of Graz, Austria

Local coordinator: Milan Rusko, Ing., Institute of Informatics, Slovak Academy of Sciences

Project title: Spoken language interaction in telecommunications

Project duration: 2001-2005

Project type: COST Action 278

Local coordinator: Anton Čižmár, prof., Ing., PhD., Technical University of Košice

Project title: Nonlinear speech processing

Project duration: 2001-2005

Project type: COST Action 277

Local coordinator: Jozef Juhár, doc., Ing., PhD., Technical University of Košice

Project title: Eastern European Speech Databases for Creation of Voice Driven Teleservices

Acronym: SpeechDat(E)

Project duration: 1998-2000

Project type: INCO-Copernicus

Contract number: Nr 977017

Local coordinator: Milan Rusko, Ing., Institute of Informatics, Slovak Academy of Sciences

Project title: Continuous speech recognition over the telephone

Project duration: 1994-2000

Project type: COST Action 249

Local coordinator: Anton Čižmár, prof., Ing., PhD., Technical University of Košice

Project title: Spoken Queries in European Languages

Acronym: SQEL

Project duration: 1994-1996

Project type: Copernicus

Contract number: Nr 1634

Local coordinator: Dušan Krokavec, prof., Ing., PhD., Technical University of Košice

Local conferences/workshops/seminars on speech and language technology

Speech processing and acoustics

The 1st National Seminar on Speech and Acoustics

November 29-30, 2007

Slovak Academy of Sciences, Bratislava, Slovakia

Slovko 2007

The 4th International Seminar on „Natural language processing, Computational Lexicography and Terminology“

October 25-27, 2007

Bratislava, Slovakia


Slovko 2005

The Third International Seminar on „Computer Treatment of Slavic and East European Languages“

November 10-12, 2005

Bratislava, Slovakia


Slovko 2003

The Second International Seminar "Computer Treatment of Slavonic Languages"

October 24-25, 2003

Bratislava, Slovakia

Slovko 2001

The International Seminar "Computer Processing of Czech and Slovak language"

October 26-27, 2001

Bratislava, Slovakia

International conferences with speech/language processing sessions


15th International Conference on Systems, Signals and Image Processing

June 25-28, 2008

Bratislava, Slovakia

Sessions: Speech and Audio I, II, III



18th International Conference

April 23-25, 2008

Prague, Czech Republic

Acoustics High Tatras 06

33rd International Acoustic Conference – EAAA Symposium

October 4-6, 2006

Štrbské Pleso, High Tatras, Slovakia



17th International Conference

April 25-26, 2006

Bratislava, Slovakia

(E) Serbia Area

The last Report provided a brief overview of the research conducted in Speech Communication in Serbia. These and some new activities have been developed further this year. This Report presents a review of current research activities.

Basic research on Speech and Language

Many basic research projects on the Serbian language are funded by grants from the Ministry of Science. The two projects directly related to Speech Communication are listed here:

1. „Theoretical and Methodological Framework for the Modernization of the Description of the Serbian Language“, 2006-2010: Project supported by the Ministry of Science of Serbia (id=148021). Team from Faculty of Philology, Belgrade University.

The project comprises the compilation of a series of collective monographs and the publication of a number of papers in prominent international journals. For the language fields concerned, these are expected to provide: a critical overview of the current state of conceptualization, terminology, methodology and language resources; solutions that enable the modernization and better systematization of the conceptual apparatus, standardization of terminology, introduction of new methods in linguistic research and the development of language resources that meet contemporary standards. The project will also involve the creation of up-to-date IT resources that will make it possible to approach language research in a modern way.

2. „Interdisciplinary investigations of speech and language resource in Serbian language“, 2006-2010: Project supported by the Ministry of Science of Serbia (id=148028). Several Faculties and Institutions joined.

The project consists of the following six subprojects: research on notification fields of prosodic features in the speech communication signal of Serbian, synthesis of prosodic features from text, and the importance of prosody in the perception and understanding of speech; research on the statistical features of Serbian language and speech, statistical modeling of language, and research on speech and language driven by the needs of telecommunication technology; research on linguistic and paralinguistic variations in normal and emotional speech expression with respect to inter- and intra-speaker differences; research on the quality of verbal communication in different environments and communicative situations, and on speech perception models in persons with pathology of verbal communication; bioinformation research on prenatal sound and speech perception; cognitive research on consciousness, thinking and language, and new contextual, cognitive and psychosomatic techniques.

Research and Development (R&D) in Speech and Language Technologies

A few significant technology development projects have been completed or are under way.

1. „Human-Machine Speech Communication“, 2008-2010: Project supported by the Ministry of Science of Serbia (id=TR11001). Experts from several institutions joined the R&D team of engineers at Faculty of Technical Sciences – University of Novi Sad.

Building on the R&D results in speech technologies for Serbian achieved in project id=TR6144, this project aims to continue improving the quality of synthesized speech (TTS) and the robustness and accuracy of automatic speech recognition (ASR). R&D will include both speaker and emotion recognition, toward better human-machine speech communication. Finally, the goal is to expand the potential and resources for the development and application of speech technologies in Serbian and for the development of innovative products based on modern speech and communications technologies.

2. „E-medicine system for evaluation of hearing quality“, 2008-2010: Project supported by the Ministry of Science of Serbia (id=TR13011). Innovative Life Activities Advancement Center, originated from Institute for Experimental Phonetics and Speech Pathology, Faculty for Special Education and Rehabilitation, and School of Electrical Engineering, Belgrade.

Research and development of an Internet-based system for evaluation of hearing quality that uses HINT (Hearing In Noise Test).

3. „Intelligent Telephone E-Mail Access – iTEMA“, 2006-2009: EUREKA project (id=3864) supported by the Ministry of Science of Serbia. Partners: Faculty of Technical Sciences – University of Novi Sad and AlfaNum (Serbia), as well as Alpineon (Slovenia).

Listening to e-mails via telephone is enabled by speech technologies and a user-friendly menu. The multilingual nature of iTEMA is characterised by language recognition at the sentence level and activation of a TTS synthesis engine in the recognised language. The iTEMA system will support reading e-mail messages in several widely used European languages by using off-the-shelf speech synthesizers, as well as in most South Slavic languages, such as Slovenian, Serbian, Croatian, and Macedonian, by applying speech synthesisers developed by the iTEMA team.

4. „Development of speech technologies in Serbian and their application in Telekom“, 2005-2008: Project supported by the Ministry of Science of Serbia (id=TR6144A) and Telecommunications Company „Telekom Srbija”, Belgrade. Team from Faculty of Technical Sciences – University of Novi Sad and its spin-off company AlfaNum.

R&D activities have improved ASR and TTS for Serbian, launched their first applications in “Telekom Srbija”, as well as produced new resources for Serbian, Croatian and Macedonian.

5. “System for objective evaluation of articulation quality”, 2005-2007: Project supported by the Ministry of Science of Serbia (id=TR6134). Institute for Experimental Phonetics and Speech Pathology, and Faculty of Defectology, and School of Electrical Engineering, Belgrade.

The developed system for articulation quality evaluation is a stand-alone system based on specific speech therapy tests optimized for Serbian (it can be optimized for other languages).

Speech Technology applications

Applications in the countries of the former Yugoslavia (ex-YU) are mostly developed within AlfaNum (FTS-UNS spin-off).

• „anReader“ – a TTS system with a SAPI5 interface configured as an aid for blind computer users; launched in 2005, it is now in wide use in all WB countries.

• „Audio Library for the Blind“ – a client-server application used for Internet access to libraries, enabling users to select and listen to books easily; in use since 2006.

• „Contact“ – a voice portal for the visually impaired that provides access to news and other content of interest via phone in a verbal human-machine dialogue, launched in 2007.

• „Sastanak“ – a popular ASR-based phone dating system enabling users to leave their own profiles and search for matching profiles of other users, launched in 2006.

• „Advertising Monitor“ – the best technological innovation in Serbia in 2006, aimed at audio information retrieval from radio and TV broadcast recordings.

• „iTEMA“ – a commercial CTI application based on multilingual TTS developed for personalized e-mail access via phone. Free access will be provided for the blind.

SLT resources

➢ A SpeechDat-compliant speech database in Serbian has been produced for ASR development and training. It can be used in ASR training for other South Slavic languages because of their strong phonetic similarities. About 10 million people speak Serbian, and the South Slavic languages together have nearly 30 million speakers.

➢ TTS speech databases have been recorded by a Serbian and a Croatian speaker and semi-automatically processed for the TTS engines. These two languages are very similar, as are the languages spoken in Bosnia and Herzegovina and Montenegro.

➢ A Serbian emotional speech database (GEES) has been produced for investigation of variations in speech production and for application in TTS systems and forensic science.

➢ An Accentuation-Morphological Dictionary has been produced for each of Serbian and Croatian, containing about 3 million inflected words each. A similar resource was produced for Macedonian, but on a smaller scale.

➢ A 170,000-word tagged text corpus in Serbian has been produced. Words are tagged for their morphological categories as well as accentuation, enabling the use of the corpus in speech applications as well. An extremely large quantity of untagged text is always available over the Internet from newspaper archives.

➢ Software resources are developed for SLT database processing (labeling, POS and accentuation tagging...). Open source segments of such resources may be found on , as well as more information about the AlfaNum project.

Conferences and lecturer program

Here are some of the national and international conferences and workshops of the last two years:

➢ 52nd annual national conference ETRAN (Electronics, Telecommunications, Computers, Automation, Nuclear Engineering); organized by several faculties and institutions with the IEEE Section for Serbia and Montenegro; the largest national conference in the field of electrical engineering, with special sessions on speech processing; Palic, Serbia, June 2008.

➢ 16th Telecommunication Forum TELFOR 2008, a national and international regional conference; organized by several national institutions with the IEEE S&M Section and IEEE S&M COM Chapter; takes place in November 2008, Belgrade, Serbia. Among other communication fields, it covers a review of speech R&D activities.

➢ Biennial national conference DOGS (Digital Speech and Image Processing); organized by several faculties and institutions; offers an overview of the current state and research directions in digital speech and image processing. The 7th conference takes place in Palic, Serbia, October 2008.

➢ International Congress of Applied Linguistics APPLIED LINGUISTICS TODAY: Between Theory and Practice; organized by Association of Applied Linguistics of Serbia and Faculty of Philosophy, University of Novi Sad, to be held from 31 October to 1 November 2009.

➢ First European Congress on Prevention, Detection and Diagnostics of Verbal Communication Disorders, Patras, Greece, December 2007; organized by PALO (Greece) and IEFPG (Serbia).

➢ II European Congress of Early Prevention in Children with Verbal Communication Disorders, August 2008, Sofia, Bulgaria; organized by NBU (Bulgaria), PALO (Greece) and IEFPG (Serbia).

➢ Biennial international meeting 9th Symposium on Neural Network Applications in Electrical Engineering NEUREL 2008; organized by several faculties and institutions with the IEEE Section for Serbia and Montenegro, September, Belgrade, Serbia.

➢ A distinguished lecturer program is under development at the University of Belgrade – School of Electrical Engineering, and at the University of Novi Sad – Faculty of Technical Sciences and Laboratory for Acoustics and Speech Technologies.

(F) Bulgaria Area

About Bulgaria

Bulgaria (България in the Bulgarian Cyrillic alphabet) is a state in Southeastern Europe that borders five other countries: Romania to the north (mostly along the River Danube), Serbia and the Republic of Macedonia to the west, and Greece and Turkey to the south. The Black Sea defines the extent of the country to the east. Bulgaria has been a member of the European Union since 2007 and of NATO since 2004. It has a population of approximately 7.7 million, with Sofia as its capital.

Bulgaria's population consists mainly of ethnic Bulgarians (83.9%), with two sizable minorities, Turks (9.4%) and Roma (4.7%). Of the remaining 2.0%, 0.9% comprises some 40 smaller minorities, the most numerous being the Russians, Armenians, Vlachs, Jews, Crimean Tatars and Sarakatsani (historically known also as Karakachans). 96.3% of the population speak Bulgarian as their mother tongue. Bulgarian, a member of the Slavic language group, remains the only official language, but the numbers of speakers of other languages (such as Turkish and Romany) correspond closely to ethnic proportions. The Bulgarian alphabet is Cyrillic, so in 2007 Cyrillic became the third official alphabet of the European Union.

The main research groups/institutions

|Institution |Domain of Interest |

|Bulgarian Language Department at the Sofia University St. Kliment Ohridski |phonetics; computational phonology |

|Laboratory of Applied Linguistics at the Konstantin Preslavsky University of Shumen |phonetics, speech production and perception; speech prosody analysis; speech recognition; speaker identification. |

|Department of Computer Sciences at the Plovdiv University “Paisii Hilendarski” |computer informatics; natural language processing for educational resources; text-to-speech. |

|Department for Computer Modeling of the Bulgarian Language at the Institute for Bulgarian Language at the Bulgarian Academy of Sciences |natural language processing; text-to-speech. |

|Linguistic Modelling Department at the Central Laboratory for Parallel Processing at the Bulgarian Academy of Sciences |machine translation; knowledge based natural language processing; compilation and standardization of linguistic resources. |

|Department "Pattern Recognition and Biometrics" at the Institute for Information Technologies at the Bulgarian Academy of Sciences |segmentation and recognition of half-tone and color images, text and graphics processing; 2-D and 3-D objects recognition and scene analysis; speaker recognition; neural networks and systems; biometric based identification; content based image/object retrieval (CBIR/CBOR). |

International Conferences/Workshops

|Event |Workshops |

|International Conference RANLP – 2007 /Recent Advances in Natural Language Processing/ September 27-29, 2007, Borovets, Bulgaria |A Common Natural Language Processing Paradigm For Balkan Languages; Computational phonology; Natural Language Processing and Knowledge Representation for eLearning Environments; NLP for Educational Resources |

|International Conference "Cognitive Modeling in Linguistics-2005", September, Varna, Bulgaria |Speech Perception and Production; Language Processing, Memory and Thought |

|International Conference "Cognitive Modeling in Linguistics-2007", July-August, Sofia, Bulgaria |Speech Perception and Production; Language Processing, Memory and Thought |

(G) Poland Area

Research projects granted by National Program for R&D


1. Popowski K., Szpilewski E. „New Language-Dependent Prosody Processor Module for Polish Text-To-Speech Synthesis Application”, SPECOM 2007, 12th International Conference Speech and Computer, Proceedings, 15 – 19 October, 2007, Moscow, Russia, pp. 609 – 614.

2. Popowski K., Szpilewski E. „Voiced/Unvoiced speech classification using Back-Propagation Neural Networks”, 14th International Conference Cybernetics and Systems of WOSC, Proceedings, 09 – 12 September 2008, Wroclaw, Poland.

3. Rafalko J. „The Choice of Acoustic Units In Polish Voice Synthesis”, SPECOM’2007, The XII International Conference Speech and Computer, 15 – 18 October 2007, Moscow, Russia, pp. 585 – 589.


