Enhancing In-Video Discovery with Intelligent Visual Text Recognition

Written by VIDIZMO Team | June 17, 2019

Enterprises are continuously searching for solutions that increase workforce productivity, reduce manual labor and automate business processes. As digital content becomes a major asset for organizations, there is growing demand to reduce human involvement in content management workflows. This is where visual text recognition – more commonly known as Optical Character Recognition (OCR) – plays a major role in automating how enterprises adopt digital assets.

OCR is a technology that scans physical or digital files and converts the visual text they contain (printed or handwritten) into a digital format. Over time, OCR has made its way into many organizational processes for digitizing the textual information found in images and documents. These processes include performing data entry by scanning paperwork, automatically populating databases from hundreds of applicant forms, maintaining scribbled meeting notes digitally and extracting license plate numbers from traffic camera images, among others.

Extracting visual text from videos is a newer and more demanding problem: it involves not only running OCR on the video frames (images) but also applying intelligent algorithms that extract the textual information without losing its context. As an example, let’s consider a video recording of a conference where the speaker is constantly moving back and forth on the stage, partially obscuring parts of the conference’s title that is displayed in the background (see images below).

This causes the OCR results to be something like:

web
Summ|
web summ'
it

Thus, applying plain OCR to videos is not enough for useful text extraction. It requires intelligent processing that discards gibberish and repeated text, amalgamates separate strings of letters into relevant words and sentences, and provides improved video OCR results. Performing intelligent OCR on the same video would therefore give the result:

web
summit
web summit
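
The clean-up described above can be illustrated with a minimal Python sketch. It is not VIDIZMO's actual algorithm, which this post does not describe. It assumes a small vocabulary of expected words is available (here just "web" and "summit"), and the function name `clean_ocr_lines` and the `min_prefix` parameter are hypothetical. Each raw OCR line is normalized, partial words are completed when they are an unambiguous prefix of a vocabulary word, very short unknown fragments are dropped as gibberish, and repeated lines are discarded.

```python
import re


def clean_ocr_lines(raw_lines, vocabulary, min_prefix=3):
    """Clean noisy per-frame OCR lines (illustrative sketch, hypothetical helper).

    vocabulary is an assumed set of expected words; a real system would
    need a much larger dictionary or a language model.
    """
    vocab = {word.lower() for word in vocabulary}
    cleaned = []
    seen = set()
    for raw in raw_lines:
        # Lowercase and strip stray symbols such as "|" or "'".
        tokens = re.sub(r"[^a-z0-9 ]", "", raw.lower()).split()
        words = []
        for token in tokens:
            if token in vocab:
                words.append(token)
                continue
            # Complete a truncated word only if exactly one vocabulary
            # word starts with it; shorter fragments are dropped.
            if len(token) >= min_prefix:
                matches = [w for w in vocab if w.startswith(token)]
                if len(matches) == 1:
                    words.append(matches[0])
        line = " ".join(words)
        # Skip empty lines and lines already emitted for earlier frames.
        if line and line not in seen:
            seen.add(line)
            cleaned.append(line)
    return cleaned


raw = ["web", "Summ|", "web summ'", "it"]
print(clean_ocr_lines(raw, {"web", "summit"}))
# ['web', 'summit', 'web summit']
```

With the four noisy fragments from the example, the sketch returns the three clean lines shown above. The stray "it" is dropped because it is too short to match against the assumed vocabulary.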

Building on this advancement in video technology, VIDIZMO offers its users the ability to intelligently extract textual information from their uploaded videos. All the text identified in videos and other media assets is indexed, auto-tagged and made searchable, allowing you to search your content for words that were otherwise hidden and inaccessible.

Words from different languages can be identified and extracted using VIDIZMO OCR. These insights are then used with VIDIZMO's other AI capabilities such as translation, allowing you to access and understand text that appears in your content in foreign languages. This increases the global accessibility of your content and lets VIDIZMO’s customers in different industries take video OCR to the next level. Now let's focus on how OCR is revolutionizing the use of videos in these industries.

There is a growing flood of video content in law enforcement and surveillance, and manual analysis of such videos is becoming more and more cumbersome. With intelligent video OCR, these agencies can automatically extract and index useful information, for example by reading license plates to identify vehicles, identifying street signs, validating parking by analyzing no-parking signs and much more.

In the field of training and learning, intelligent OCR can be applied to video content to extract lecture notes scribbled on the board or appearing in a presentation slide. In healthcare, patient medical documents are scanned with OCR and a digital database is automatically maintained to provide a centralized repository of patient health records.

In games and sports, facial detection can be difficult because the players move around the field a lot, so OCR is used to recognize the numbers on players’ shirts, making identification a lot smoother. Moreover, visual text recognition in these industries is becoming an assistive technology: the extracted text allows visually impaired individuals to interact with the enterprise application with ease.

All the textual information identified by intelligent video OCR is highlighted against the video timeline, allowing users to navigate to the point where a specific word appears and making in-video search much easier. With visual text recognition in videos as well as images and documents, VIDIZMO is automating content management, discovery and accessibility while offering a quick solution that is not cost or resource intensive.
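
To make the idea of timeline navigation concrete, here is a minimal Python sketch of a word-to-timestamp index. It is an illustration only, not VIDIZMO's implementation. It assumes the OCR stage yields `(seconds, text)` pairs in playback order, and the function names `build_text_index` and `find_word` are hypothetical.

```python
from collections import defaultdict


def build_text_index(detections):
    """Map each recognized word to the timestamps (in seconds) where it appears.

    detections is an assumed list of (seconds, text) pairs in playback order.
    """
    index = defaultdict(list)
    for seconds, text in detections:
        for word in text.lower().split():
            times = index[word]
            # Avoid listing the same timestamp twice for one word.
            if not times or times[-1] != seconds:
                times.append(seconds)
    return dict(index)


def find_word(index, word):
    """Return the timestamps where a word appears, or an empty list."""
    return index.get(word.lower(), [])


detections = [(12.0, "web summit"), (12.5, "web summit"), (95.0, "summit")]
index = build_text_index(detections)
print(find_word(index, "Summit"))
# [12.0, 12.5, 95.0]
```

A player could use the returned timestamps to place markers on the timeline and seek to the chosen one. A production system would likely merge nearby timestamps into ranges rather than listing every frame.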

To learn more about VIDIZMO's visual analysis features, see the helpful links below:

Understanding Video Insights
Introducing Artificial Intelligence to Power Your Video Content

Or check out the press release announcing the latest AI features in VIDIZMO:

VIDIZMO’s Artificial Intelligence bridges the gap between unstructured and structured video data like never before
