Optical Character Recognition: Fundamentals and Applications
Ebook, 99 pages, 1 hour


About this ebook

What Is Optical Character Recognition


OCR, also known as optical character recognition, is the process of electronically or mechanically converting images of typed, handwritten, or printed text into machine-encoded text. This can be done from a scanned document, a photo of a document, a scene photo, or from subtitle text that is superimposed on an image.


How You Will Benefit


(I) Insights and validations about the following topics:


Chapter 1: Optical character recognition


Chapter 2: Typeface


Chapter 3: Handwriting recognition


Chapter 4: Image scanner


Chapter 5: Optical mark recognition


Chapter 6: Computer font


Chapter 7: Intelligent character recognition


Chapter 8: Tesseract (software)


Chapter 9: Comparison of optical character recognition software


Chapter 10: OCR Systems


(II) Answers to the public's top questions about optical character recognition.


(III) Real-world examples of the use of optical character recognition in many fields.


(IV) 17 appendices that briefly explain 266 emerging technologies in each industry, for a 360-degree understanding of optical character recognition technologies.


Who This Book Is For


Professionals, undergraduate and graduate students, enthusiasts, hobbyists, and those who want to go beyond basic knowledge or information for any kind of optical character recognition.

Language: English
Release date: Jul 6, 2023

    Book preview

    Optical Character Recognition - Fouad Sabry

    Chapter 1: Optical character recognition

    Optical character recognition (OCR) is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source may be a scanned document, a photo of a document, a scene photo (such as the text on signs and billboards in a landscape photo), or subtitle text superimposed on an image (for example, from a television broadcast).

    OCR is widely used for data entry from printed records such as passport documents, invoices, bank statements, computerized receipts, business cards, mail, and printouts of static data. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing, machine translation, and (extracted) text-to-speech. OCR draws on pattern recognition, artificial intelligence, and computer vision.

    Early versions had to be trained with images of each character and supported only a single typeface. Modern systems accept many image file formats and achieve high recognition accuracy for most typefaces. Some implementations can reproduce the page with its formatting preserved, including graphics, columns, and other non-textual elements.

    The origins of modern optical character recognition may be traced to two earlier technologies: telegraphy and reading aids for the blind.

    In the 1920s and 1930s, Emanuel Goldberg developed what he called a "Statistical Machine" for searching microfilm archives using an optical code recognition system. In 1931, he was granted U.S. Patent 1,838,389 for the invention. The patent was later acquired by IBM.

    In 1974, Ray Kurzweil founded Kurzweil Computer Products, Inc. and continued developing omni-font OCR, which could recognize text printed in almost any typeface. (Kurzweil is often credited with inventing omni-font OCR, but it was in use by companies, including CompuScan, in the late 1960s and 1970s.) Kurzweil decided that the best application of this technology would be a reading machine for the blind, which would let visually impaired people have a computer read text aloud to them. Two key technologies, the CCD flatbed scanner and the text-to-speech synthesizer, had to be developed for this device to become a reality. The finished product was unveiled on January 13, 1976, at a press conference headed by Kurzweil and the leaders of the National Federation of the Blind. Kurzweil Computer Products released the first commercial version of its OCR software in 1978. LexisNexis was one of the first customers and bought the software to load legal briefs and news articles into its fledgling online databases. Two years later, Kurzweil sold his company to Xerox, which wanted to further commercialize paper-to-computer text conversion. Xerox eventually spun the business off as Scansoft, which later merged with Nuance Communications.

    In the 2000s, OCR became available online as a service (WebOCR), in the cloud, and in mobile applications such as real-time translation of foreign-language signs on a smartphone. With the rise of internet-connected mobile devices like smartphones and smartwatches, OCR is used in apps that extract text from images captured with the device's camera. Devices without OCR built into the operating system typically send the image file to an OCR API, which extracts the text. The API returns the extracted text to the app, together with the location of the recognized text in the original image, so that the app can process it further (for example, convert it to speech) or display it.
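The shape of such an API response can be sketched as follows. This is only an illustration: the class names `OcrWord` and `OcrResult`, the pixel-based bounding-box fields, and the sample data are assumptions made for this example, not the format of any particular OCR service.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OcrWord:
    """One recognized word plus where it was found in the image (pixels)."""
    text: str
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class OcrResult:
    """Hypothetical response shape: the words an OCR service recognized."""
    words: List[OcrWord]

    def full_text(self) -> str:
        # Join words in the order the service returned them.
        return " ".join(w.text for w in self.words)

    def words_in_region(self, left: int, top: int,
                        right: int, bottom: int) -> List[OcrWord]:
        # Keep only words whose bounding box lies entirely inside the region.
        return [
            w for w in self.words
            if w.left >= left and w.top >= top
            and w.left + w.width <= right and w.top + w.height <= bottom
        ]


# Made-up sample data standing in for a real API response.
result = OcrResult(words=[
    OcrWord("EXIT", left=40, top=20, width=120, height=40),
    OcrWord("12", left=170, top=20, width=40, height=40),
    OcrWord("Airport", left=40, top=80, width=200, height=40),
])

print(result.full_text())
print([w.text for w in result.words_in_region(0, 0, 300, 70)])
```

Because each word carries its position, an app can read the whole text aloud or act only on the words inside the part of the photo the user tapped.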

    Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali (Bangla), Devanagari, Tamil, Chinese, Japanese, and Korean characters are all supported by a wide variety of commercial and open source OCR systems.

    Receipt OCR, Invoice OCR, Check OCR, and Legal Billing Document OCR are just some of the numerous types of domain-specific OCR applications that have been built on top of OCR engines.

    They can be used for:

    Data entry from business documents such as checks, passports, invoices, bank statements, and receipts

    Automatic license plate recognition

    Passport scanning and data extraction at airports

    Automatic extraction of key data from insurance documents

    Traffic sign recognition

    Extracting contact details from business cards into a contact list

    Producing text versions of printed documents more quickly, such as scanning books for Project Gutenberg

    Making digital copies of printed books and magazines searchable, as in Google Books

    Converting handwriting in real time to control a computer (pen computing)

    Defeating CAPTCHA anti-bot systems, or testing their robustness, even though these systems are specifically designed to prevent OCR

    Assistive technology for blind and visually impaired users

    Writing instructions for vehicles by identifying CAD images in a database that match the vehicle design as it changes

    Converting scanned documents into searchable PDFs

    Optical character recognition (OCR) targets typewritten text, one glyph or character at a time.

    Optical word recognition targets typewritten text, one word at a time (for languages that use a space as a word divider). It is usually also referred to simply as OCR.

    Intelligent character recognition (ICR) also targets handwritten printscript or cursive text, one glyph or character at a time.

    Intelligent word recognition (IWR) also targets handwritten printscript or cursive text, one word at a time. This is particularly useful for languages in which glyphs are not separated in cursive script.

    Optical character recognition is generally an offline process that analyzes a static document. Some cloud-based services offer an online OCR API. Handwriting movement analysis can also be used as input to handwriting recognition. Instead of relying only on the shapes of glyphs and words, this technique captures motion, such as the order in which segments are drawn, the direction of each stroke, and the pattern of putting the pen down and lifting it. This additional information can make the process more accurate. The technology is also known as online character recognition, dynamic character recognition, real-time character recognition, and intelligent character recognition.
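As a rough illustration of the extra information that pen movement provides, the sketch below stores each stroke as timestamped points and derives stroke order, direction, and pen-up time from them. The `Stroke` class, its `(x, y, t_ms)` point layout, and the sample capture are assumptions made for this example; real digitizers and recognizers use their own formats.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stroke:
    """Points sampled between one pen-down and the next pen-up: (x, y, t_ms)."""
    points: List[Tuple[float, float, float]]

    def direction_degrees(self) -> float:
        # Overall direction from the first to the last sample; y grows downward.
        (x0, y0, _), (x1, y1, _) = self.points[0], self.points[-1]
        return math.degrees(math.atan2(y1 - y0, x1 - x0))

    def duration_ms(self) -> float:
        # How long the pen stayed down for this stroke.
        return self.points[-1][2] - self.points[0][2]


def pen_up_gaps(strokes: List[Stroke]) -> List[float]:
    # Time the pen spent lifted between consecutive strokes.
    return [
        nxt.points[0][2] - cur.points[-1][2]
        for cur, nxt in zip(strokes, strokes[1:])
    ]


# Made-up capture of the letter "T": horizontal bar first, then the stem.
bar = Stroke([(0, 0, 0), (5, 0, 40), (10, 0, 80)])
stem = Stroke([(5, 0, 200), (5, 6, 260), (5, 12, 320)])
strokes = [bar, stem]

print([round(s.direction_degrees()) for s in strokes])
print(pen_up_gaps(strokes))
```

A static image of the finished "T" would contain the same ink, but not which stroke came first, which way each was drawn, or how long the pen was lifted in between.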

    OCR software often pre-processes images to improve the chances of successful recognition. Techniques include:

    De-skew: if the document was not aligned properly when it was scanned, it may need to be rotated slightly so that the lines of text are perfectly horizontal or vertical.

    Despeckle: to get rid of
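The de-skew step listed above can be sketched with a projection-profile search, one common approach and not necessarily the one any given OCR engine uses. The sketch assumes the page has already been reduced to a list of ink-pixel coordinates; the function names, the whole-degree search range, and the synthetic test page are all choices made for this example.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def rotate(points: Iterable[Point], degrees: float) -> List[Point]:
    """Rotate points about the origin by the given angle."""
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in points]


def row_profile_score(points: Iterable[Point]) -> int:
    """Sum of squared ink counts per pixel row; high when ink piles into few rows."""
    rows = Counter(round(y) for _, y in points)
    return sum(n * n for n in rows.values())


def estimate_skew_degrees(ink: List[Point], max_angle: int = 10) -> int:
    """Try whole-degree angles and return the skew whose removal scores best."""
    return max(
        range(-max_angle, max_angle + 1),
        key=lambda angle: row_profile_score(rotate(ink, -angle)),
    )


def deskew(ink: List[Point]) -> List[Tuple[int, int]]:
    """Undo the estimated skew and snap the result back to the pixel grid."""
    angle = estimate_skew_degrees(ink)
    return [(round(x), round(y)) for x, y in rotate(ink, -angle)]


# Synthetic "scan": two 100-pixel text lines at rows 10 and 30, skewed by 5 degrees.
straight = [(x, row) for row in (10, 30) for x in range(100)]
skewed = [(round(x), round(y)) for x, y in rotate(straight, 5)]

print(estimate_skew_degrees(skewed))
```

The score rewards angles at which the ink collapses into a few dense rows, which is what horizontal text lines look like. A production implementation would work on real bitmaps, rotate about the page center, and search finer angle steps.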
