OT: OCR / PDF Parsing
Jose Lerebours
fpgroups at gmail.com
Wed Jan 12 17:27:06 PST 2022
I have a GSA that wants data extracted from PDF documents, most of
which are scanned documents saved as PDF, which in essence makes them
images wrapped in a PDF.
I have written code in PHP to convert each PDF to PNG and extract the
text from the PNG, but this is not proving to be reliable: lots of
characters are read wrong or not read at all.
It is like pulling teeth: an "I want this done, but do not ask me to
get you 'true' PDFs, the scanned documents are all I can get" type of
scenario.
So, my question is: is anyone here successfully extracting data from
scanned documents and, if so, what are you using?
Regards,
--
Jose Lerebours
954-559-7186
https://www.asisuites.com
Accounting - Retail - Wholesale - Distribution
Manufacturing - Warehousing - Transportation - eCommerce - Web Development