FairScan 2.0 released
Version 2.0 of the FairScan document-scanning app for Android has been released. The headline feature for this release is the addition of optical-character-recognition (OCR) support using Tesseract to produce PDFs with searchable text from scans. FairScan developer Pierre-Yves Nicolas has written a detailed blog post about adding the feature that also explains why it had not been added previously.
That looks nice, so why didn't FairScan have it before? That's because FairScan wasn't ready for it: I wouldn't be comfortable if FairScan was giving you wrong text half of the time. To get good results from an OCR engine, you need to provide it a readable image. If it's hard to read for a human, it's certainly also hard to read for an OCR engine.
Over the past year, I worked on different parts of FairScan's automatic processing to transform photos of documents into PDFs that are easy for humans to read:
- document detection
- perspective correction
- shadow reduction
- brightness and contrast enhancement
All this work on image processing helped FairScan produce clean PDFs and can now also contribute to making text recognition effective.
FairScan is available via Google Play or F-Droid.