Frequently asked questions




Voice technologies


The most frequently asked questions about voice technologies.

What are neural networks and how are they useful in voice technologies?

An artificial neural network is a computational model used in machine learning, a branch of artificial intelligence. It currently yields the best results in a variety of artificial intelligence applications (object recognition in images, speech recognition, translation from one language to another, and others). Its structure is inspired by biological neural networks, such as those in the brain. Accordingly, an artificial neural network consists of a large number of small computational units (neurons) that are interconnected both in series and in parallel. Much like the brain, the network is capable of learning: during the learning process, the weights of the connections between individual neurons are adjusted. A given neuron in the network then sends a signal to the subsequent layers only if the sum of its input signals, each multiplied by its learned weight, exceeds a certain threshold value (this is similar to the workings of neurons in the human brain).
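
The threshold behaviour described above can be illustrated with a minimal Python sketch. It is a simplified picture of a single neuron, not the code used in our products; the function name and all numeric values are hypothetical and chosen only for illustration.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted sum of the inputs exceeds the threshold."""
    # Multiply each input signal by its connection weight and add them up.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # The neuron passes a signal on only if the sum is above the threshold.
    return weighted_sum > threshold


# Hypothetical example values (assumptions, not taken from any real network).
inputs = [1.0, 0.5, 0.0]
weights = [0.4, 0.6, 0.9]  # in a real network these are adjusted during learning
print(neuron_fires(inputs, weights, threshold=0.5))  # weighted sum is about 0.7, so the neuron fires
```

Learning then amounts to changing the values in `weights` so that the network as a whole produces the desired outputs.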

What are the advantages of using neural networks in speech recognition, compared to the previous system?

Machine learning based on neural networks offers significantly higher accuracy in speech recognition. This is especially apparent under difficult conditions, e.g. when transcribing a compressed recording, a recording with excessive background noise, or sound recorded from a greater distance. In cases such as these, the neural network is much more robust and the quality of speech recognition is retained more successfully than with the previous version.

Can your system for speech recognition learn on its own?

From the point of view of machine learning theory, one advantage of neural networks is that a sufficiently deep network can essentially create its own internal abstract features between layers, and these are far better than what a person can derive from the processed signal through various sophisticated transformations and algorithms. However, the previous algorithms also had to be trained; this is not a novelty of neural networks. It is important to remember that no system can learn entirely on its own: there always has to be a teacher for learning to take place.

How significant are the improvements brought by neural networks and where do they manifest?

Under ideal conditions, where even the previous system worked well, we can expect a relative decrease in the error rate of 10% – 20%. For example, this can increase the accuracy from 90% to 91% or even 92%. Under difficult conditions, where the previous system achieved an accuracy of only 40% – 60%, for example, we can now expect significantly better results of around 80%.
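
The difference between a relative decrease in the error rate and the resulting accuracy is easy to misread, so the arithmetic behind the 90% example is sketched here in Python. The function name is hypothetical; the sketch only assumes that accuracy and error rate add up to 100%.

```python
def accuracy_after_relative_reduction(accuracy, relative_reduction):
    """Return the new accuracy after the error rate drops by a relative amount."""
    error_rate = 1.0 - accuracy                                # 90% accuracy means a 10% error rate
    new_error_rate = error_rate * (1.0 - relative_reduction)   # shrink the error rate, not the accuracy
    return 1.0 - new_error_rate


for reduction in (0.10, 0.20):
    new_accuracy = accuracy_after_relative_reduction(0.90, reduction)
    print(f"{reduction:.0%} relative reduction: {new_accuracy:.1%}")
# prints:
# 10% relative reduction: 91.0%
# 20% relative reduction: 92.0%
```

A 10% error rate reduced by a relative 10% – 20% becomes 9% – 8%, which corresponds to an accuracy of 91% – 92%.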

What does the process of “learning” in neural networks look like and how long does it take?

Learning from several hundred hours of voice recordings takes approximately 24 hours on one powerful graphics card.


NEWTON Dictate


Everything you need to know about the program for the automatic recognition of dictated speech.

What is the difference between the NEWTON Dictate program and the service for transcribing recordings?

NEWTON Dictate is most appreciated by those who want to take notes, make log entries or dictate a text previously written by hand. In contrast, the service for transcribing recordings is best suited for recognizing previously recorded audio files (such as recordings of interviews, meetings etc.). Recordings are best transcribed using our program NEWTON SpeechGrid.

What can I dictate using the program NEWTON Dictate?

NEWTON Dictate is designed for dictating general texts in standard language. It is available in Czech, Slovak, Polish and Croatian. To transcribe spontaneous speech or to dictate professional texts, it is necessary to use the corresponding general or specialized dictionary.

What is the minimum recommended computer configuration needed in order for the program to work properly?

The program requires a computer with at least an Intel Core i5 processor (1.7 GHz or faster) and 4 GB of RAM.

Supported OS: Microsoft Windows 10, 8 and 7 (32-bit or 64-bit). Installation: Microsoft .NET 4 is required (included in the package or available for download at http://www.microsoft.com/net/). Sufficient space on the hard drive is needed (up to 600 MB for the general dictionary), as well as a standard sound card supporting a sampling rate of 16 kHz at 16-bit resolution. The program will also work on less powerful computers, but in that case there will be a delay in the recognition process.

Can I use any microphone for dictating?

For dictation, it is advisable to use a so-called directional microphone, which, unlike the computer’s internal microphone, will only capture sounds in its immediate vicinity. A high-quality microphone is also included in the NEWTON Dictate package.

What will the program write if I dictate a word which is not in the NEWTON Dictate dictionary?

The application always tries to recognize the entire dictation. Therefore, unknown words are not left as blank spaces, but are replaced with what is judged to be the phonetically most similar variant. If you need to dictate an unknown word repeatedly, you can add it to the user dictionary. The application will learn the word and recognize it in the next dictation.

Which formats can I use for saving the resulting text?

The recognized text can be saved in the standard RTF or TXT format. The application also retains the audio recording of your dictation, which you can then export in MP3, WAV or SPX formats. If you want to continue working with the text and sound recording in NEWTON Dictate, the program allows you to save the entire document in TTAX format.

What if I need my dictated text to be written directly into another program?

If you want NEWTON Dictate to transcribe your dictation directly into another program, you can use the “MINI” feature, which writes the dictated text into the current location of the mouse cursor. This allows you to dictate into any application, information system or Internet browser.

What should I do if the program does not understand me?

If the program has trouble recognizing your speech, first check whether your microphone is selected in the settings and correctly positioned in front of your mouth. The introductory tutorial to the program will take you through the microphone settings step by step. An incorrectly set microphone is the most frequent cause of problems with the program’s functionality.

Can I dictate if I have a minor speech defect?

Yes. The program automatically adapts to the voice of every new user and can compensate for minor speech defects, such as the inability to pronounce a particular sound correctly.


NEWTON SpeechGrid


The most frequently asked questions regarding solutions for the transcription of recordings.

How can I try out the SpeechGrid technology?

NEWTON SpeechGrid can be easily tried out thanks to the NTeX program. The program can be downloaded here.

How is the NTeX program controlled?

The user manual can be found here: NTeX User Manual

Didn’t find an answer to your question? Do you need advice regarding the selection or settings of our products? Leave us a message or call us on (+420) 225 540 120.
Jan Papoušek
Customer support

Leave us a message or call us. We will reply to your queries as soon as possible.


Na Pankráci 1683/127,
140 00 Praha 4
Czech Republic

E: podpora@newtontech.cz
P: (+420) 225 540 120