Confidential OCR

John

from John D. Cook on 2024-11-20 18:00 (#6SC79)

A client emailed me a screenshot of a table rather than pasting the table as text into an email.

I thought about using an LLM to convert it to text, but the table is confidential client information and so I shouldn't upload it anywhere.

I searched for a command line utility to do OCR and found tesseract. I installed it with

 sudo apt install tesseract-ocr libtesseract-dev tesseract-ocr-eng

and ran it with the default settings

 tesseract screenshot.png textfile

It worked remarkably well. I had to change a C to a U, but otherwise I didn't have to add or change any text, but I did have to delete a few extraneous parentheses generated by the software.

I work locally in part out of habit; it was the only way to work when I started using a computer. It has numerous advantages, such as being able to keep working when a hurricane knocks out my internet connection, but above all it is private.

I pay more attention to privacy than is convenient because I work in data privacy. And aside from my privacy, I have to protect our clients' privacy.

Update: According to the comments, ChatGPT uses tesseract. Assuming that's true, using tesseract directly is better than ChatGPT because it does exactly what you want. No ambiguity as far as what expected. No potential for tinkering with your results before you see them.

The post Confidential OCR first appeared on John D. Cook.

Source	RSS or Atom Feed
Feed Location	http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title	John D. Cook
Feed Link	https://www.johndcook.com/blog