“Working together for a green, competitive and inclusive Europe”

Project NORDTRANS

The main goal of the project is to improve the quality and usability of state-of-the-art automatic speech recognition (ASR) technology for Swedish and Norwegian. In contrast to existing solutions, the newly developed technology will operate with high accuracy in a wide variety of applications, including online broadcast monitoring (TV, internet podcasts, etc.), transcription of speeches in parliaments and similar public institutions, and spoken archive mining. Incorporating the developed technology into the existing multilingual speech processing solutions of NEWTON Technologies will open new market and cooperation opportunities for the company in Northern Europe and, at the same time, bring new or better ASR-based services closer to people living in this part of Europe.

Project duration: 1 Jan 2021 – 30 Apr 2024

Partner Organizations

Project Providers

The project “NORDTRANS – Technology for automatic speech transcription in selected Nordic languages” benefits from a €1,244,000 grant from Norway and the Technology Agency of the Czech Republic.

News

Scandinavian language identification at the Text, Speech and Dialogue 2021 Conference

10 Sep 2021

The team of Petr Červa and Jan Nouza presented a paper called “Identification of Scandinavian Languages from Speech Using Bottleneck Features and X-vectors” at the Text, Speech and Dialogue Conference in Olomouc.

The paper deals with the identification of the three main Scandinavian languages (Swedish, Danish and Norwegian) from spoken data. The best-performing approaches take advantage of multilingual bottleneck features (BTNs) and make it possible to identify the target language in speech segments lasting only 5 seconds with an error rate of around 1%. This opens up many practical applications, such as systems for transcribing Scandinavian TV and radio programs, in which different speakers may use any of the target languages.

Norwegian now available in Beey

11 Aug 2021

The first version of automatic Norwegian transcription is now available in Beey, a web-based platform designed for automatic transcription of recordings and fast editing of the resulting text. Norwegian thus becomes the first commercially available Nordic language in the portfolio of NEWTON Technologies. The acoustic model is trained on 210 hours of recordings collected from various publicly available sources, such as the Norwegian TV NRK, Norwegian radio and transcripts of parliamentary proceedings. The automatic transcription can handle both written variants of Norwegian, Bokmål and Nynorsk, even when speakers switch between the two.

Try Beey out

Interview with Lenka Weingartová in the Meltingpot forum

28 May 2021

The principal investigator of the NordTrans project, linguist Lenka Weingartová, was a guest of the Meltingpot Forum. In the interview, she and moderator Vladimír Piskala talked about how to teach a computer to understand human speech, the projects NEWTON Technologies is currently involved in, including NordTrans, and the pitfalls of Scandinavian language recognition.

Contact information

Lenka Weingartová, Ph.D.
Principal Investigator, NEWTON Technologies
lenka.weingartova@newtontech.cz

Petr Červa, Ph.D.
Lead Researcher, TUL
petr.cerva@tul.cz

Prof. Giampiero Salvi, PhD
Lead Researcher, NTNU
giampiero.salvi@ntnu.no