Introduction

Optical Character Recognition (OCR) is a technology that recognizes and extracts text from digital images and scanned documents. It is also used to scrape data on remote or virtual machines. OCR converts typed, handwritten, or printed text into machine-encoded text, so the extracted data can be used in electronic business processes without manual capture. OCR engines such as the Google Cloud OCR engine, the MODI OCR engine, and the Tesseract OCR engine extract PDF data, PDF text, or plain text, and are also used to find text positions, identify documents, and validate data.
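As a rough illustration of what "finding a text position" and "validating data" can mean once an engine has produced its output, here is a minimal Python sketch. It is not how any of the engines or activities named here are implemented. The `OcrWord` shape, the helper names, the hard-coded sample words, and the `INV-` plus four digits pattern are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrWord:
    """One recognized word plus its bounding box in pixels (hypothetical shape)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_text_position(words: List[OcrWord], target: str) -> Optional[OcrWord]:
    """Return the first word whose text equals target, ignoring case."""
    for word in words:
        if word.text.lower() == target.lower():
            return word
    return None


def is_valid_invoice_number(value: str) -> bool:
    """Check a value against an assumed pattern: 'INV-' followed by 4 digits."""
    return re.fullmatch(r"INV-\d{4}", value) is not None


# Hard-coded sample standing in for real engine output (assumption).
sample_words = [
    OcrWord("Invoice", 40, 20, 90, 18),
    OcrWord("INV-0042", 140, 20, 110, 18),
    OcrWord("Total", 40, 60, 60, 18),
]

match = find_text_position(sample_words, "total")
print(match.left, match.top)  # 40 60
print(is_valid_invoice_number(sample_words[1].text))  # True
```

In a real workflow the word list would come from whichever OCR engine is configured, and the validation rule would depend on the document type being processed.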

Activities

Create Google Cloud OCR Engine

Create MODI OCR Engine

Create Tesseract OCR Engine

Extract PDF Data With OCR

Extract PDF Text With OCR

Extract Text With OCR

Find OCR Closest Text Position

Find OCR Text Position

Read QR Code

Identify Document Using OCR

Read Barcode

Validate OCR Data
