How Skype Translator works
Table of contents:
- The technology that makes it possible
- From one spoken language to another in a few seconds
- The test program as a starting point
Science fiction is full of technologically advanced devices whose workings, to paraphrase the famous saying, are indistinguishable from magic. Because they spring from the imagination of their authors, it is hard to picture when such inventions might reach our hands, and we end up accepting that they will not exist within our lifetime. But every once in a while one of them slips into our lives ahead of schedule. That is the case with the real-time translation that Microsoft and Skype are about to make possible.
"The task is anything but simple. It involves Skype's video conferencing capabilities, Microsoft Azure's vast network of cloud servers, Microsoft Research's technological innovations, and recent advances in multiple areas such as statistics and machine learning. All of this is put at your service so that, as soon as you say a sentence in your language, the system recognizes what you say, translates it and transmits it to your contact in a different language. How is it possible?"
The technology that makes it possible
Skype Translator, the name by which the new functionality is known, is not an overnight achievement, nor even the work of a single year. It is the result of decades of research in speech recognition, machine translation, and machine learning techniques. The system rests on all of these areas and would not have been possible without the latest advances in them.
The first of these is speech recognition, a technology that has been under investigation for a long time but whose adoption has always been held back by the high error rates and excessive sensitivity of existing systems. A second of hesitation, a small variation in accent or the slightest noise was enough to confuse the computer and make it understand whatever it pleased. That was the case until the boom in 'deep learning' techniques and artificial neural networks, a field Microsoft Research knows well. Thanks to them, it has been possible to considerably reduce the error rate and improve the reliability and robustness of speech recognition, a necessary first step for Skype Translator to work.
Machine translation is the other obvious pillar on which Skype Translator rests. Here Microsoft once again relies on in-house technology, using the Bing translation engine to translate text from one language to another. That system combines syntax recognition techniques with statistical models to refine the result. In addition, for this occasion the engine has been specially trained to recognize the kind of language that occurs in spoken conversation, far from the correctness and neatness usually expected of writing. Thus, Skype Translator combines the large language knowledge base of Bing Translator with an extensive layer of words and phrases commonly used in colloquial language.
But speech and language are complicated terrain. They change constantly, they come in multiple flavors and varieties, and each person has their own particular style. Skype Translator has to keep up with all this, which requires constant training and optimization of both speech recognition and machine translation. To do this, the system has been built on a robust 'machine learning' platform. Machine learning is a branch of artificial intelligence that aims to develop techniques allowing machines and algorithms to learn by training on sample data. The use of these techniques, common in the field of statistics, allows the service to improve as it is used, taking advantage of the data generated during use to further refine speech recognition and automatic translation.
Some of the training data is generated automatically from a wide variety of sources, including social networks such as Facebook, translated web pages, videos with subtitles, or even conversations created on purpose and then transcribed and translated manually. But another part of the data comes from actual conversations held through the service. This is important to know because, as Microsoft notifies you on each call, Skype Translator may record conversations, keeping them anonymous, so that they can later be analyzed by its algorithms and fed into the training of its statistical models.
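The idea of a statistical model that gets better as it sees more examples can be illustrated with a toy sketch. What follows is not how Bing's engine or Skype Translator work internally; the class name, its methods and the sample word pairs are all assumptions made up for illustration. It simply counts how often a source word has been seen with each translation and answers with the most frequent one, so every new example can change the answer.

```python
from collections import Counter, defaultdict


class ToyTranslationModel:
    """Hypothetical word-level model: counts observed translations per word."""

    def __init__(self):
        # Maps each source word to a counter of the translations seen for it.
        self.counts = defaultdict(Counter)

    def learn(self, source_word, target_word):
        # Each observed pair is one more piece of evidence.
        self.counts[source_word][target_word] += 1

    def best(self, source_word):
        # Return the translation seen most often, or None if never seen.
        options = self.counts.get(source_word)
        if not options:
            return None
        return options.most_common(1)[0][0]


model = ToyTranslationModel()
model.learn("cool", "frío")      # formal, written usage (invented sample)
model.learn("cool", "genial")    # colloquial usage (invented sample)
model.learn("cool", "genial")    # seen again in conversation
print(model.best("cool"))
```

With these invented samples the colloquial sense wins as soon as it has been observed more often than the written one, which is the kind of shift the article describes when conversational data is added to the training.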
Skype Translator can only function properly if it is able to learn through a process based on its use in real human conversations.
"The system could not function without this learning process. As humans speak we pause and repeat things, make mistakes and change our minds as we go, throwing in ahs, ehms and uhms. Only by learning from actual use can it get better."
From one spoken language to another in a few seconds
Supported by all these advances, the key is that Skype Translator is able to perform the entire recognition and translation process quickly and transparently for the user. Every time we speak, the system must recognize what we are saying, translate it into the recipient's language and convey it in a way that remains faithful to what we originally meant. The less we notice the intermediate steps, the better.
As soon as the system detects that we are speaking, it begins to record what we say and starts the speech recognition process. This is not only about recognizing each word we pronounce, but also about eliminating everything superfluous: deleting meaningless expressions and noise, dividing the text into sentences, adding punctuation marks and capital letters, and giving it a context that helps with its interpretation. When you think about it a bit, you realize how difficult it is to determine all this from spoken language.
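To get a feel for the clean-up step, here is a minimal sketch that assumes the recognizer has already produced a list of words. The filler list and the function name are hypothetical, and a real system would learn these patterns from data rather than hard-code them. It drops fillers and immediately repeated words, then adds a capital letter and a final period.

```python
# Hypothetical filler list; not taken from Skype Translator.
FILLERS = {"ah", "eh", "ehm", "uh", "uhm", "um"}


def clean_transcript(words):
    """Drop fillers and immediate repetitions, then capitalize and punctuate."""
    kept = []
    for word in words:
        token = word.lower()
        if token in FILLERS:
            continue  # meaningless expression, skip it
        if kept and kept[-1] == token:
            continue  # the speaker stuttered or repeated the word
        kept.append(token)
    if not kept:
        return ""
    sentence = " ".join(kept)
    return sentence[0].upper() + sentence[1:] + "."


print(clean_transcript("uhm i i want to ehm book a table".split()))
```

Even this toy version shows why the task is hard: removing every immediate repetition would also damage legitimate phrases such as "that that", which is one reason hand-written rules fall short and trained models are needed.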
Skype Translator needs this speech recognition to be as accurate as possible, because the next step is to prepare the collected information and compare it with the statistical models that have been improving through its 'machine learning' system. Here the process consists of finding similarities between what the system understood us to say and the words and contexts contained in the models, and then applying previously learned transformations that convert the audio into text and translate it into the foreign language.
In the final step, Skype has prepared a pair of bots, with female and male voices, that act as interpreters in the call. Once the user selects one, it takes charge of communicating our translated message to the recipient, so that not only do the written transcriptions and translations appear on screen, but the recipient can also hear them aloud, as if a third person were mediating between us. These bots are able to relay the message quickly, so that whoever is listening on the other side of the screen receives it a few seconds after we have spoken.
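The whole chain described in this section, recognition, translation and a spoken rendition by the chosen voice, can be pictured as three functions called in sequence. The sketch below uses toy stand-ins for every stage: the function names, the two-entry phrase table and the text standing in for synthesized audio are all assumptions for illustration, not Skype's actual components.

```python
from dataclasses import dataclass


@dataclass
class TranslatedTurn:
    transcript: str    # what the recognizer understood
    translation: str   # text shown to the other party
    spoken: str        # stand-in for the audio read by the interpreter bot


# Hypothetical phrase table standing in for the translation engine.
TOY_PHRASES = {"hello": "hola", "good morning": "buenos días"}


def recognize(audio_words):
    # Stand-in for speech recognition: the "audio" is already a word list.
    return " ".join(audio_words).lower()


def translate(text):
    # Look the phrase up; mark it clearly when the toy table has no entry.
    return TOY_PHRASES.get(text, f"[untranslated: {text}]")


def speak(text, voice):
    # Stand-in for speech synthesis with the selected bot voice.
    return f"[{voice} voice] {text}"


def handle_turn(audio_words, voice="female"):
    transcript = recognize(audio_words)
    translation = translate(transcript)
    return TranslatedTurn(transcript, translation, speak(translation, voice))


turn = handle_turn(["Good", "morning"])
print(turn.translation)
print(turn.spoken)
```

Keeping the transcript and the translation alongside the spoken output mirrors what the article describes: the recipient sees the written versions on screen and also hears the interpreter's voice.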
The test program as a starting point
The presence of bots as third-party speakers in the conversation is precisely one of the details that still needs polishing. Microsoft acknowledges that adapting to them is easy for people used to speaking through an interpreter, but for others it requires a learning period. The fact is that Microsoft and Skype may be determined to create the best real-time translation experience there is, but to do so they need both us and the machines to learn. The Skype Translator preview is just one more step in that process.
The test program went live in mid-December, introducing spoken translation between two languages, English and Spanish, and written translation in more than 40. To access it an invitation is required, which we can request by registering on the program's website. If we are lucky enough to receive one, we can try Skype Translator from the Skype applications for Windows 8.1 or Windows 10 Technical Preview. Otherwise we will have to wait for the service to be extended and officially made public.
"Anyway, Skype Translator has kicked off just as we are about to say goodbye to 2014. Before finishing, stop here for a second and think about the year you just read: two thousand fourteen."
Via | Skype Blogs I, II