How does it work? | Speech synthesis

Date: 2017-07-13 17:30:05

We have already talked about speech recognition; today we will discuss the inverse problem. How does speech synthesis, that is, the conversion of arbitrary text into voice, actually work? That is the subject of today's issue.

Http://www.youtube.com/watch?v=a_OeS-ORWQQ

The task of speech synthesis is solved in several stages. First, a special algorithm prepares the text so that the robot can read it comfortably: it writes out all numbers as words and expands abbreviations. Then the text is broken down into individual phrases that should be read with a continuous intonation; for this the system relies on punctuation marks and set constructions.
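The preparation stage described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abbreviation table, the 0 to 999 number range, and the function names are invented for the example and are not taken from any real synthesizer.

```python
import re

# Hypothetical abbreviation table; a real system would use a far larger one.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 999 (a deliberately small range)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def normalize(text):
    """Expand abbreviations, then replace 1-3 digit numbers with words."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    return re.sub(r"\b\d{1,3}\b",
                  lambda m: number_to_words(int(m.group())), text)


def split_phrases(text):
    """Split on punctuation, the cue for phrases read with one intonation."""
    return [p.strip() for p in re.split(r"[.,;:!?]+", text) if p.strip()]


prepared = normalize("Dr. Smith lives at 221 Baker St., flat 5.")
phrases = split_phrases(prepared)
```

Abbreviations are expanded before splitting so that the period in "Dr." is not mistaken for the end of a phrase.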

Next, a phonetic transcription is produced for every word. To understand how to read a word and where to place the stress, the system consults its built-in, hand-compiled dictionaries. If the word is missing, the computer builds the transcription on its own, relying on academic rules. If those are insufficient, statistical rules come into play: the system goes through recordings of speakers and determines which syllable they most often stressed.
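The three-level fallback (dictionary, then rules, then statistics) can be shown with a small sketch. Everything here is hypothetical: the two-word lexicon, the letter-to-sound table standing in for the "academic rules", and the `observed_stress` mapping standing in for statistics gathered from speaker recordings.

```python
from collections import Counter

# Hypothetical hand-compiled dictionary:
# word -> (phonemes, index of the stressed syllable).
LEXICON = {
    "robot": (["r", "ow", "b", "aa", "t"], 0),
    "speech": (["s", "p", "iy", "ch"], 0),
}

# Toy letter-to-sound rules; real rule sets are far more elaborate.
LETTER_RULES = {"a": "ae", "e": "eh", "i": "ih", "o": "aa", "u": "ah"}


def rule_based(word):
    """Transcribe letter by letter; these toy rules cannot place stress."""
    phonemes = [LETTER_RULES.get(ch, ch) for ch in word]
    return phonemes, None


def statistical_stress(word, observed_stress):
    """Pick the stress position that recorded speakers used most often."""
    counts = Counter(observed_stress.get(word, []))
    return counts.most_common(1)[0][0] if counts else None


def transcribe(word, observed_stress):
    """Dictionary first, then rules, then statistics for the stress."""
    word = word.lower()
    if word in LEXICON:
        return LEXICON[word]
    phonemes, stress = rule_based(word)
    if stress is None:
        stress = statistical_stress(word, observed_stress)
    return phonemes, stress


# Hypothetical observations: stressed-syllable index heard in each recording.
recordings = {"vocoder": [0, 0, 1]}
known = transcribe("Robot", recordings)
unknown = transcribe("vocoder", recordings)
```

The stress stays `None` when neither the dictionary, the rules, nor the recordings say anything about a word.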

When the transcription is ready, the computer calculates how many frames, that is, fragments 25 milliseconds long, the utterance will contain. Each frame is then described by many parameters: which phoneme it belongs to and what place it occupies in the syllable that includes this phoneme. The description also records whether the phoneme is stressed or unstressed, if it is a vowel. In addition, the system recreates the correct intonation using information about the phrase and the sentence.
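As a rough illustration of this step, the sketch below cuts each phoneme into 25-millisecond frames and attaches a few of the parameters mentioned above. The phoneme durations and the `Frame` fields are assumptions made for the example; the text does not say where durations come from or list the full parameter set.

```python
import math
from dataclasses import dataclass
from typing import Optional

FRAME_MS = 25  # frame length given in the text


@dataclass
class Frame:
    phoneme: str               # phoneme this frame belongs to
    index_in_phoneme: int      # position of the frame inside the phoneme
    syllable: int              # index of the syllable containing the phoneme
    stressed: Optional[bool]   # stress flag for vowels, None for consonants


def frames_for(phoneme, duration_ms, syllable, is_vowel, stressed):
    """Cover one phoneme with 25 ms frames, rounding the count up."""
    count = math.ceil(duration_ms / FRAME_MS)
    flag = stressed if is_vowel else None
    return [Frame(phoneme, i, syllable, flag) for i in range(count)]


def build_frames(segments):
    """segments: (phoneme, duration_ms, syllable, is_vowel, stressed) tuples."""
    frames = []
    for segment in segments:
        frames.extend(frames_for(*segment))
    return frames


# Hypothetical durations in milliseconds for the first two phonemes of "robot".
frames = build_frames([
    ("r", 50, 0, False, False),
    ("ow", 120, 0, True, True),
])
```

With these made-up durations, "r" takes 2 frames and "ow" takes 5, since 120 ms does not divide evenly into 25 ms pieces and the count is rounded up.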

The system then uses an acoustic model to read the prepared text. The model establishes the correspondence between phonemes with certain characteristics and sounds. It knows how to pronounce a phoneme correctly and how to give a sentence the right intonation thanks to machine learning. The more data the model is trained on, the better the result it produces.
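The text does not say which kind of model is used, so the following is only a toy stand-in, not a real acoustic model: it "learns" an average pitch for each (phoneme, stressed) pair from examples. The class name, the feature key, and the default pitch are all assumptions for illustration. It does show the basic idea of mapping described phonemes to sound parameters learned from data.

```python
from collections import defaultdict


class ToyAcousticModel:
    """Maps a (phoneme, stressed) key to the mean pitch seen in training."""

    def __init__(self):
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def fit(self, examples):
        """examples: iterable of ((phoneme, stressed), pitch_hz) pairs."""
        for key, pitch_hz in examples:
            self._sums[key] += pitch_hz
            self._counts[key] += 1

    def predict(self, key, default_hz=120.0):
        """Return the learned mean pitch, or a default for unseen keys."""
        if self._counts.get(key, 0) == 0:
            return default_hz
        return self._sums[key] / self._counts[key]


model = ToyAcousticModel()
# Hypothetical training data: pitch measured on recorded frames.
model.fit([
    (("a", True), 200.0),
    (("a", True), 220.0),
    (("a", False), 180.0),
])
stressed_pitch = model.predict(("a", True))
unseen_pitch = model.predict(("o", True))
```

Each additional example refines the average for its key, which is a very simplified picture of why more training data tends to give a better result.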

As for the voice itself, what makes it recognizable is first of all its timbre, which depends on the structure of the speaker's vocal apparatus. The timbre of any voice can be modeled, that is, its characteristics can be described; it is enough to record a small amount of text in a studio. After that, the timbre can be used to synthesize speech in any language. When the system needs to say something, it uses a generator of sound waves, the vocoder. It is fed the frequency characteristics of the phrase obtained from the acoustic model, together with the timbre data that gives the voice its recognizable color.
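A very simplified picture of what a vocoder does: take a pitch value per frame and a description of the timbre, and generate a sound wave. In the sketch below the timbre is reduced to a list of harmonic amplitudes, and the 16 kHz sampling rate is an assumption; real vocoders use much richer parameters.

```python
import math

SAMPLE_RATE = 16000  # assumed sampling rate in Hz
FRAME_MS = 25        # frame length given in the text


def synthesize(pitches_hz, harmonic_amps):
    """Generate samples from per-frame pitch and a fixed harmonic mix.

    pitches_hz: one fundamental frequency per 25 ms frame.
    harmonic_amps: amplitude of the 1st, 2nd, 3rd... harmonic (the "timbre").
    """
    samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000
    samples = []
    phase = 0.0  # carried across frames so the wave has no jumps
    for f0 in pitches_hz:
        step = 2 * math.pi * f0 / SAMPLE_RATE
        for _ in range(samples_per_frame):
            phase += step
            samples.append(sum(amp * math.sin((k + 1) * phase)
                               for k, amp in enumerate(harmonic_amps)))
    # Scale so the loudest sample has magnitude 1 (skip if all are zero).
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    return [s / peak for s in samples]


# Two frames with slightly rising pitch and a hypothetical three-harmonic timbre.
wave = synthesize([120.0, 130.0], [1.0, 0.5, 0.25])
```

Changing only `harmonic_amps` while keeping the same pitches gives the same "melody" with a different color, which loosely mirrors how one timbre can be reused for any text.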

It is worth noting that modern speech synthesis technology has some problems. The first of these is artificiality. A person perceives any synthesized speech with difficulty and has to spend additional resources to understand it. As a result, people can comfortably listen to synthesized speech for only about 20 minutes. Synthesized speech also usually lacks emotional coloring and has low noise immunity: even slight noise interferes with a listener's perception of it.
