Hacker News
Reed-Solomon for OCR: error correction for messy printed codes
16 points by chasangchual
2 comments
Dwedit
Do we have any screenshots of what the text looks like, and what the error correction looks like?
chasangchual
I built a small Python Reed-Solomon codec aimed at OCR workflows where printed codes may be degraded by dot-matrix printers, missing pins, fading ribbons, low resolution, dirt, or partial character damage.
It uses GF(256) Reed-Solomon correction plus an OCR-safe parity representation: a reduced alphabet that leaves out easily confused characters such as 0/O, 1/I/l, 5/S, and 8/B. Each parity byte is encoded as two OCR-safe characters, so printed labels or IDs can be scanned, checked, and corrected when OCR makes a limited number of symbol mistakes.
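To make the GF(256) part concrete, here is a minimal sketch of computing Reed-Solomon parity bytes and checking a codeword with syndromes. It is not taken from the repo: the field polynomial 0x11D, the generator roots alpha^0..alpha^(nsym-1), and the names `rs_parity` and `syndromes` are all assumptions chosen for illustration. It only detects errors; actual correction additionally needs an error-locator step (for example Berlekamp-Massey) that is left out here.

```python
# Assumed field: GF(256) with primitive polynomial 0x11D and alpha = 2.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]  # duplicate so LOG sums never need a modulo


def gf_mul(a, b):
    """Multiply two field elements via log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def poly_mul(p, q):
    """Multiply two polynomials (highest-degree coefficient first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out


def generator_poly(nsym):
    """Product of (x + alpha^i) for i in 0..nsym-1 (plus equals minus here)."""
    g = [1]
    for i in range(nsym):
        g = poly_mul(g, [1, EXP[i]])
    return g


def rs_parity(data: bytes, nsym: int) -> bytes:
    """Hypothetical helper: remainder of data(x) * x^nsym modulo the generator."""
    gen = generator_poly(nsym)
    rem = list(data) + [0] * nsym
    for i in range(len(data)):
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):
                rem[i + j] ^= gf_mul(gen[j], coef)
    return bytes(rem[len(data):])


def syndromes(codeword: bytes, nsym: int):
    """Hypothetical helper: evaluate the codeword at each generator root."""
    out = []
    for i in range(nsym):
        acc = 0
        for c in codeword:
            acc = gf_mul(acc, EXP[i]) ^ c  # Horner's rule
        out.append(acc)
    return out


data = b"LOT-4821"
codeword = data + rs_parity(data, 4)
print(syndromes(codeword, 4))            # all zeros for an undamaged codeword
damaged = b"L0T-4821" + codeword[8:]     # OCR read the letter O as a zero
print(any(syndromes(damaged, 4)))        # nonzero syndromes flag the damage
```

With `nsym` parity bytes, a full decoder can correct up to `nsym // 2` wrong symbols at unknown positions, which is the sense of "a limited number of symbol mistakes" above.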
The repo includes a ReedSolomonForOcr class, OCR-safe parity helpers, a demo, and unit tests with deterministic correction scenarios.
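For a sense of what the OCR-safe parity helpers could look like, here is a small sketch of the two-characters-per-byte idea. Two characters can carry a full byte only if the alphabet has at least 16 symbols, so this example maps each 4-bit half of the byte to one symbol. The 16-character alphabet and the function names `parity_to_text` and `text_to_parity` are hypothetical, not the project's actual choices.

```python
# Hypothetical 16-symbol alphabet with the look-alikes 0/O, 1/I/l, 5/S, 8/B
# left out. The real project's alphabet may differ.
OCR_SAFE = "234679ACDEFHJKMN"
assert len(set(OCR_SAFE)) == 16


def parity_to_text(parity: bytes) -> str:
    """Encode each byte as two symbols: high nibble first, then low nibble."""
    return "".join(OCR_SAFE[b >> 4] + OCR_SAFE[b & 0x0F] for b in parity)


def text_to_parity(text: str) -> bytes:
    """Decode pairs of symbols back into bytes; unknown symbols raise ValueError."""
    if len(text) % 2:
        raise ValueError("expected an even number of symbols")
    idx = [OCR_SAFE.index(ch) for ch in text]
    return bytes((hi << 4) | lo for hi, lo in zip(idx[0::2], idx[1::2]))


printed = parity_to_text(bytes([0x00, 0xFF, 0x3A]))
print(printed)                        # 22NN6F
print(text_to_parity(printed).hex())  # 00ff3a
```

A symbol outside the alphabet is itself useful information: the decoder knows which position is bad, which makes it an erasure rather than an error at an unknown location.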