Bing

Microsoft Research creates a system capable of generating "smart" captions automatically



You have surely come across a caption that is confusing, incorrect, or says little about the image it accompanies. And if you publish your own articles, you may well find filling in that field tedious. The people in Redmond have created a tool that aims to make this easier.

The work, published by Microsoft Research, describes itself as a “caption generation system” capable of mimicking the narrative characteristics of human language; that is, a technology that can describe images as one of us would, with the corresponding context. Companies such as Facebook, Microsoft and Google have been working on this kind of technology for some time, but this time the result exceeds expectations.

What it consists of

He had a great time

The system can even tell a complete story from several images, describing and narrating it as if it were a book. According to experts, this could become a feature that gives a more human touch to certain applications, such as voice recognition tools, and that automatically generates descriptions in other areas, among other uses.

The tool is not limited to stating briefly what it "sees". Instead, it provides a broader context for the situation shown in the image, achieving a "narrative context and unique style of narration", explained Frank Ferraro, one of the authors of this work. To illustrate, he offers a clear example.

His mother was proud of him

He proposes the following case: “Let's imagine we have a photo album of some friends who celebrated a birthday at a pub. Some of the early images show people ordering beer and drinking it, while the later ones show someone asleep on a sofa,” he says.

A conventional system “could simply say something like 'there is a person lying on a sofa', while our system could add that the person is probably in that situation because they are drunk after having a few drinks”. This addition provides understanding and a certain emotional charge, which is also reflected in the images and photo captions included in this article.

Via | MIT Technology Review

In Xataka Windows | Microsoft launches an app that determines your dog's breed
