There are times when we may want to extract text from scanned documents. Optical character recognition (OCR) is a pattern recognition technology that identifies text in an image and turns it into an editable digital document. Scanned documents are generally saved in image formats (.jpeg, .jpg, .png, etc.) if not as PDF. Instead of retyping the text from a scanned document, we can use OCR software to extract it. Here are some software and online tools you can use to extract text from images.
#1) FreeOCR (Works Offline)
If you are looking for software that extracts text from images as well as PDF files, try FreeOCR. FreeOCR is free optical character recognition software for Windows. Besides performing OCR, it supports scanning from most TWAIN scanners and opens popular image file formats. You can scan documents and extract text from images or multi-page PDF documents in one go. FreeOCR outputs plain text and can also export directly to Microsoft Word format.
- Plain Text Extraction – recognize the characters and words but ignore the formatting.
- Batch OCR – scan several pages and extract text.
- Zone OCR: Extract the text from a certain area in a document.
- Multiple Input Formats: TIF, JPG, PNG, BMP, GIF, PDF.
- Multiple Language Recognition: Supports multiple languages.
- Scanner: Besides performing OCR, it can scan documents too.
If you are looking for OCR software that can be used offline, this is a good option. For accurate results, make sure the text in the scanned document (image) is clear, without background ink, smudges, etc.
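The Batch OCR feature above boils down to gathering every supported file and processing them in order. Here is a minimal Python sketch of the gathering step only. It is not how FreeOCR works internally: the function name `collect_ocr_inputs` is hypothetical, and the extension set is an assumption based on the input formats listed above, with the common `.tiff` and `.jpeg` spellings added.

```python
from pathlib import Path

# Assumed extension set, based on the input formats listed above
# (TIF, JPG, PNG, BMP, GIF, PDF) plus the .tiff and .jpeg spellings.
SUPPORTED_EXTENSIONS = {
    ".tif", ".tiff", ".jpg", ".jpeg", ".png", ".bmp", ".gif", ".pdf",
}


def collect_ocr_inputs(folder):
    """Return the supported files directly inside `folder`, sorted by name.

    Hypothetical helper: it only builds the work list for a batch run,
    it does not perform any OCR itself.
    """
    return sorted(
        path
        for path in Path(folder).iterdir()
        # Skip sub-folders and compare extensions case-insensitively.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Each returned path can then be handed to whichever OCR tool you use, one file at a time.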
#2) SimpleOCR (Offline)
If you are looking for offline OCR software that you can install on your computer, you may also check out SimpleOCR. SimpleOCR works on any version of Windows, from Windows 95 to Windows 10. SimpleOCR is free for all commercial and non-commercial purposes. Some features of SimpleOCR include:
- Plain Text Extraction – recognize the characters and words but ignore the formatting.
- Image Retention – ability to capture and retain pictures from the document.
- Batch OCR – scan several pages and extract text.
- Zone OCR: Extract the text from a certain area in a document.
- Error Highlight: Highlights errors for easier correction.
- Multiple Input Formats: Accepts input from TWAIN scanners and from TIFF files.
- Multiple Language Recognition: Supports English and French.
- Scanner: Besides performing OCR, it can scan documents too.
#3) Using Google Drive / Google Docs
Google Drive can also be used to extract text from images or PDF documents, and the results are pretty satisfactory. If you have a Google Account, you can use the built-in OCR feature in Google Docs. To extract text using Google Docs, do the following:
- Sign in to Google Drive with your Google Account
- Click the ‘+’ sign and select File Upload to upload your image or PDF.
- Once the image is uploaded to Google Drive, right-click the uploaded image.
- Click Open With, and then click Google Docs.
- Google Docs will attempt to open the image, and its built-in OCR will extract the text in the image.
- A new Google Docs document is created, containing both the image and the text extracted from it. The new document is saved in the same folder as the image.
You can copy the text and use it elsewhere. As with any OCR tool, the quality of the result depends on the quality of the source file.
NOTE: You can also extract text from PDF files in the same way with Google Docs. Just upload the PDF file to Google Drive and open it with Google Docs. The text will be extracted below each page of the PDF.
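Text copied out of an OCR result often keeps the hard line breaks of the scanned page, including words hyphenated across lines. The following is a minimal Python sketch of a clean-up pass you might run on the copied text before reusing it. The function name `tidy_ocr_text` is hypothetical, and the rules are assumptions about typical OCR output rather than anything specific to Google Docs.

```python
import re


def tidy_ocr_text(raw):
    """Rejoin words split across lines and reflow paragraphs of OCR text."""
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    # Caveat: a genuine compound split at a line end ("well-\nknown")
    # loses its hyphen under this rule.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines mark paragraph breaks; join the lines inside each paragraph
    # and collapse runs of whitespace to single spaces.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(paragraph for paragraph in cleaned if paragraph)
```

Always proofread the result against the original image, since no clean-up rule can repair characters that were misrecognized in the first place.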
#4) Convert images to text online
There are many OCR tools available online that you can use to extract text from images on any device. All you need is a web browser and an internet connection to start using these tools on a desktop computer or mobile device. However, certain websites may have limitations, such as the number of times you can use their service, the number of pages, etc. Here are some online services you can use to convert images to text.
These online image-to-text conversion tools are just some of the services available, listed for your reference. There are many others, and some may have certain limitations. You can explore more on Google and use one that fits your requirements.
#5) Browser Extensions for extracting text from images
If you do not want to install offline OCR software, you can also use browser extensions for extracting text from images. Some browser extensions also support taking a screenshot of part of a webpage, or of whatever is on the screen, and extracting text from it. Here are some browser extensions with high ratings that may be useful for extracting text from images.
To find Google Chrome extensions related to OCR, search for "OCR" in the Chrome Web Store using the Google Chrome browser.
Copyfish can extract text from any image captured from your screen into an editable format, which you can copy and later reuse in digital documents, emails, reports, etc. Using this Chrome browser extension, you can take a screenshot of whatever is displayed on your browser screen, be it webpages, pictures or even videos, and convert it into text that you can copy and use. You can verify the results at a glance with the extracted text overlay.
The Image Reader (OCR) extension can be used to extract words from any image. It uses an open-source OCR library called Tesseract. To work with this addon, simply open the addon’s interface and load your image via the file selector (top section).
This extension adds a toolbar button to the browser. You can then select a region of the currently active window. The extension captures the area and tries to recognize the text inside this region.
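Tesseract, the open-source engine mentioned above, also ships a command-line program, so it can be used outside the browser. The following is a minimal Python sketch that calls it, assuming the `tesseract` program is installed and on your PATH. The helper names `build_tesseract_command` and `extract_text` are hypothetical, and `eng` is assumed as the default language.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, language="eng"):
    """Build the argument list for the tesseract command-line program."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognised text instead of writing a .txt file; -l selects the
    # language data to use.
    return ["tesseract", str(image_path), "stdout", "-l", language]


def extract_text(image_path, language="eng"):
    """Run tesseract on one image and return the recognised text."""
    # Assumption: tesseract is installed separately; fail clearly if not.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,  # collect the printed text
        text=True,            # decode output as str rather than bytes
        check=True,           # raise if tesseract reports an error
    )
    return result.stdout
```

For example, `extract_text("scan.png")` would return the text found in a file named `scan.png` (a placeholder name), and `extract_text("scan.png", language="fra")` would use the French language data if it is installed.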