How does it work? | Machine translation

The idea of using electronic computers to translate texts was first voiced in the United States in 1947, soon after the first computers appeared. The first public demonstration of machine translation took place in 1954. That system was very primitive: it had a vocabulary of only 250 words and 6 grammar rules, and it could translate only a few simple phrases. Even so, the experiment attracted considerable attention, and research began in countries around the world, including the Soviet Union. How a modern machine translation system works is the subject of today's issue.

Modern systems are built around a translation algorithm that relies on the formal grammar of languages and on statistics. To learn a language, the system compares thousands of parallel texts, which contain the same information in different languages. For each text it studies, the system builds a list of distinctive features, for example rarely used words and special characters that occur in the text with a certain frequency.

A machine translation system usually has three main parts: a translation model, a language model, and a decoder. The translation model is a table that lists, for every word and phrase in one language, its possible translations into another language, along with the probability of each translation. The system matches not only single words but also phrases of several consecutive words. The translation model for each language pair contains millions of word and phrase pairs. The language model, for its part, is built by the system while it studies the texts.
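The translation model described above can be pictured with a small sketch. It assumes that phrase pairs have already been extracted from parallel texts and estimates each probability as a simple relative frequency; the article does not say how real systems compute these values, so `build_phrase_table`, `lookup`, and the sample pairs are hypothetical names and data used only for illustration.

```python
from collections import Counter, defaultdict

def build_phrase_table(aligned_pairs):
    """Estimate p(target | source) as a relative frequency over aligned pairs.

    Assumption: aligned_pairs is a list of (source_phrase, target_phrase)
    tuples already extracted from parallel texts.
    """
    pair_counts = Counter(aligned_pairs)
    source_counts = Counter(src for src, _ in aligned_pairs)
    table = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return dict(table)

def lookup(table, phrase):
    """Return candidate translations sorted by descending probability."""
    return sorted(table.get(phrase, {}).items(),
                  key=lambda kv: kv[1], reverse=True)

# Invented toy data: both single words and multi-word phrases are stored.
pairs = [
    ("morgen", "tomorrow"), ("morgen", "tomorrow"), ("morgen", "tomorrow"),
    ("morgen", "morning"),
    ("guten morgen", "good morning"),
]
table = build_phrase_table(pairs)
candidates = lookup(table, "morgen")
```

With this toy data, `lookup(table, "morgen")` ranks "tomorrow" ahead of "morning", because it was seen three times out of four. A production table would hold millions of such entries per language pair.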

The translation itself is done by the decoder. It performs morphological and syntactic analysis of the text and, for each sentence, selects all candidate translations, sorted in descending order of probability. The decoder then evaluates every candidate against the language model to see how frequently it is used, and chooses the sentence with the best combination of probability and frequency.
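A minimal sketch of that final selection step follows. It assumes the language model is a bigram model with add-one smoothing and that the two scores are combined by adding their logarithms; the article specifies neither, so these choices, the function names, and the toy corpus are illustrative assumptions rather than a description of any real system.

```python
import math
from collections import Counter

def train_bigram_model(sentences):
    """Count bigrams in target-language text (assumed whitespace-tokenized)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])          # words that have a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def lm_log_prob(sentence, model):
    """Log-probability of a sentence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split()
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )

def decode(candidates, model):
    """Pick the candidate with the best combined score.

    candidates: list of (sentence, translation_probability) tuples.
    """
    return max(candidates,
               key=lambda c: math.log(c[1]) + lm_log_prob(c[0], model))

# Invented toy corpus and candidates.
model = train_bigram_model([
    "good morning", "good morning everyone", "see you tomorrow",
])
best = decode([("good tomorrow", 0.5), ("good morning", 0.4)], model)
```

Here "good tomorrow" has the higher translation probability, but "good morning" wins because the language model has seen that word sequence in real text. This is the trade-off between probability and frequency that the decoder resolves.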

Machine translation systems can be used not only for texts but also for translating single words. They contain full dictionaries with detailed entries for words and expressions. The system compiles these entries from statistical data, relying on the rules of the language. For the machine dictionary it selects only the dictionary forms of words and expressions. The system carries out morphological and syntactic analysis, determines the part of speech and the lexical form of each word, and sets the boundaries of phrases. This information helps to weed out incomplete phrases. To avoid errors and typos, a machine-learning algorithm examines all potential translation pairs and discards the unreliable ones.

Translations with similar meanings are grouped using dictionaries of synonyms. These contain words that are often translated into the other language in the same way or that form phrases with the same words. As a result, the machine dictionary gets everything it needs to know about each word and expression: its base form, part of speech, meanings, and synonyms. For clarity, some systems add usage examples taken from parallel texts to the translations.
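One way to picture the grouping of synonyms is to treat two words as related when they share most of their translations. The sketch below uses Jaccard overlap of translation sets with an arbitrary threshold; this is an assumed stand-in for whatever measure a real system applies, and `find_synonyms`, `min_overlap`, and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def find_synonyms(translations, min_overlap=0.5):
    """Link words whose translation sets overlap strongly.

    translations: dict mapping a word to the set of its translations.
    min_overlap: assumed Jaccard threshold (shared / combined translations).
    """
    synonyms = defaultdict(set)
    for a, b in combinations(sorted(translations), 2):
        shared = translations[a] & translations[b]
        combined = translations[a] | translations[b]
        if combined and len(shared) / len(combined) >= min_overlap:
            synonyms[a].add(b)
            synonyms[b].add(a)
    return {word: sorted(others) for word, others in synonyms.items()}

# Invented toy data: English words with their German translations.
translations = {
    "big": {"groß", "riesig"},
    "large": {"groß", "riesig", "weit"},
    "tiny": {"winzig"},
}
groups = find_synonyms(translations)
```

In this toy data "big" and "large" share two of three translations and are linked, while "tiny" shares none and stays on its own.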

The use of statistics lets machine translation systems change along with the language. If people start writing a word differently, the system notices as soon as it receives new texts. To improve translation quality, the system is updated and checked regularly. Fully high-quality machine translation of texts nevertheless remains elusive, but it already makes translators' work considerably easier and faster.

Hi-News.ru