"Working together for a green, competitive and inclusive Europe"

Project NORDTRANS

The main goal of the project is to advance the state of the art in the quality and usability of automatic speech recognition (ASR) technology for Swedish and Norwegian. In contrast to existing solutions, the newly developed technology will operate with high accuracy across a wide range of applications, including monitoring of online broadcasts (TV, internet podcasts, etc.), transcription of speeches in parliaments and similar public institutions, and spoken archive mining. Incorporating the developed technology into the existing multilingual speech processing solutions of NEWTON Technologies will open new market and cooperation opportunities for NEWTON Technologies in Northern Europe and, at the same time, bring new or better ASR-based services closer to people living in this part of Europe.

Project duration: 1 Jan 2021 – 30 Apr 2024

Partner Organizations

Project Providers

The project "NORDTRANS – Technology for automatic speech transcription in selected Nordic languages" benefits from a €1,244,000 grant from Norway and the Technology Agency of the Czech Republic.

News

NORDTRANS Press conference

24 Nov 2022

NEWTON Technologies held a press conference on 24 November 2022 to introduce the project and the development of speech technologies for Nordic languages, including a demonstration of its use in practice in the Beey application.

The meeting took place at the NEWTON Technologies offices in Prague. Lenka Weingartová, Principal Investigator of the project, presented on behalf of the organizers. Petr Červa and Jan Nouza from the Technical University of Liberec, and our guests from NTNU, Janine Rugayan and Torbjørn Svendsen, also presented their research activities. The entire presentation was in English and the recording is available here.

Presentation of the NORDTRANS project
Presenting the NORDTRANS project as an example of good practice

20 Sep 2022

On 20 September, NEWTON Technologies attended a meeting about our implementation of the EEA and Norway Grants. Lenka Weingartová introduced the NORDTRANS project, in which, together with the Technical University of Liberec, we are developing technology for automatic speech-to-text transcription for Norwegian (and eventually Swedish).

We are proud to have been selected over twenty other projects in the Research category to present the Norwegian recognition module as an example of good practice in the use of Norway Grants. We were also pleased that the Ambassador of the Kingdom of Norway and other representatives of the Norwegian Embassy in Prague attended the meeting and saw a live demonstration of Norwegian transcription in the Beey app.

Scandinavian language identification at TSD
Scandinavian language identification at the Text, Speech and Dialogue 2021 Conference

10 Sep 2021

The team of Petr Červa and Jan Nouza presented a paper called “Identification of Scandinavian Languages from Speech Using Bottleneck Features and X-vectors” at the Text, Speech and Dialogue Conference in Olomouc.

The paper deals with the identification of the three main Scandinavian languages (Swedish, Danish and Norwegian) from spoken data. The best-performing approaches take advantage of multilingual bottleneck features (BTNs) and make it possible to identify the target language in speech segments lasting only 5 seconds with an error rate of around 1%. Such identification has many practical applications, for example in systems for transcribing Scandinavian TV and radio programs, in which speakers of any of the target languages may appear.
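To make the idea of segment-level language identification more concrete, here is a minimal sketch in Python. It is not the method from the paper: it assumes each speech segment has already been turned into a fixed-length embedding (a stand-in for an x-vector) and uses a simple nearest-centroid classifier with cosine similarity. The three-dimensional vectors, the `TRAIN` and `TEST` data and all function names are hypothetical and chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroids(train):
    """Average the embeddings of each language (dict: language -> vectors)."""
    result = {}
    for lang, vecs in train.items():
        dim = len(vecs[0])
        result[lang] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return result

def identify(embedding, cents):
    """Return the language whose centroid is most similar to the embedding."""
    return max(cents, key=lambda lang: cosine(embedding, cents[lang]))

def error_rate(test, cents):
    """Fraction of (embedding, true_language) pairs that are misclassified."""
    errors = sum(1 for emb, lang in test if identify(emb, cents) != lang)
    return errors / len(test)

# Hypothetical toy embeddings; real segment embeddings have hundreds of dimensions.
TRAIN = {
    "swedish": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "danish": [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]],
    "norwegian": [[0.0, 0.1, 0.9], [0.1, 0.2, 0.8]],
}
TEST = [
    ([0.85, 0.15, 0.05], "swedish"),
    ([0.15, 0.85, 0.05], "danish"),
    ([0.05, 0.15, 0.85], "norwegian"),
    ([0.5, 0.45, 0.1], "danish"),  # ambiguous segment, ends up closer to Swedish
]

if __name__ == "__main__":
    cents = centroids(TRAIN)
    print(identify([0.05, 0.15, 0.85], cents))
    print(error_rate(TEST, cents))
```

The error rate here is simply the share of test segments assigned to the wrong language, which is the kind of figure the "around 1%" result refers to; the toy data deliberately includes one ambiguous segment so the metric is not trivially zero.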

Norwegian now available in Beey

11 Aug 2021

The first version of automatic Norwegian transcription is now available in Beey. Beey is a web-based platform designed for automatic transcription of recordings and fast editing of the resulting transcripts.

Norwegian thus becomes the first commercially available Nordic language in the portfolio of NEWTON Technologies.

The acoustic model is trained on 210 hours of recordings collected from various publicly available sources, such as the Norwegian broadcaster NRK, Norwegian radio and transcripts of parliamentary proceedings.
The automatic transcription of Norwegian can handle both written standards, Bokmål and Nynorsk, even when speakers switch between the two.

Interview in the Meltingpot forum
Interview with Lenka Weingartová in the Meltingpot forum

28 May 2021

The principal investigator of the NORDTRANS project, linguist Lenka Weingartová, was a guest of the Meltingpot Forum. In the interview, she and moderator Vladimír Piskala discussed how to teach a computer to understand human speech, the projects NEWTON Technologies is currently involved in (including NORDTRANS), and the pitfalls of Scandinavian language recognition.

Contact information

Lenka Weingartová, Ph.D.
Principal Investigator, NEWTON Technologies
lenka.weingartova@newtontech.cz

Petr Červa, Ph.D.
Lead Researcher, TUL
petr.cerva@tul.cz

Prof. Giampiero Salvi, Ph.D.
Lead Researcher, NTNU
giampiero.salvi@ntnu.no