News

Dropbox adds text search inside PDF documents and images

Anonymous

For the second time in just two months, Dropbox has improved its search system so that it is now able to search for text within PDF documents and even image files such as PNG or JPG.

Dropbox: find what you want where you want

That seems to be the goal of Dropbox, the popular cloud storage platform, which in recent months has focused on improving its search system. Last month the company rolled out a new machine-learning-based search engine, and it has now announced improved optical character recognition (OCR) capabilities that let users search for text in both PDF and image files.

“Image formats (such as JPEG, PNG, or GIF) are generally not indexable because they have no text content, while text-based document formats (such as TXT, DOCX, or HTML) are generally indexable. PDF files fall in between, as they can contain a mix of text and image content. Automatic image text recognition is able to intelligently distinguish between all these documents to categorize the data they contain.”
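The distinction drawn in the quote can be sketched in a few lines of Python. This is an illustration only, not Dropbox's implementation: the function name `indexing_strategy`, the `pdf_has_text_layer` flag, and the extension sets are assumptions made for this example, using just the formats the quote names.

```python
from pathlib import Path

# Hypothetical extension groups, limited to the formats named in the quote.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif"}
TEXT_EXTS = {".txt", ".docx", ".html"}


def indexing_strategy(filename: str, pdf_has_text_layer: bool = False) -> str:
    """Return how a file's text could be made searchable.

    "direct": the text can be read from the file itself.
    "ocr":    the text must first be recognized from pixels.
    "skip":   the format is not covered by this sketch.
    """
    ext = Path(filename).suffix.lower()
    if ext in TEXT_EXTS:
        return "direct"
    if ext in IMAGE_EXTS:
        return "ocr"
    if ext == ".pdf":
        # PDFs sit in between: some carry an embedded text layer,
        # others are just scanned page images. The flag is assumed
        # to be supplied by whoever inspected the file.
        return "direct" if pdf_has_text_layer else "ocr"
    return "skip"
```

Under these assumptions, `indexing_strategy("receipt.PNG")` returns `"ocr"`, while a PDF is routed one way or the other depending on whether it already contains selectable text.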

Despite the good news, the improvement is, for the moment, limited in two respects. On the one hand, it seems to be limited to the English language:

So now, when a user searches for English text that appears in one of these files, the file will show up in the search results.

On the other hand, as Jon Porter reports in The Verge, the feature is limited to the most expensive subscription tiers.

The new feature is available now for Dropbox Business Advanced and Enterprise users, and should be available to Dropbox Professional subscribers in the coming months.

The feature works much like the technology added to the Dropbox mobile app last year, which lets users photograph a document with the app while running OCR at the same time to extract the text. However, that only worked with a small subset of documents.

By implementing OCR capabilities directly in the search engine, Dropbox is now able to search for text within all of your PDF files and images, no matter how they were scanned or photographed.

Sources: Dropbox, The Verge
