OT: OCR / PDF Parsing

Jose Lerebours fpgroups at gmail.com
Thu Jan 13 16:11:01 PST 2022


So, from download to first parse took less than 10 minutes (of course, I 
already had the parsing code written) - I FREAKING LOVE IT!

+100 on your solution!  ;-)
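
For anyone finding this thread later, here is a rough sketch of the kind of pipeline the suggestion implies: run the scanned PDF through pdfsandwich to add a text layer, then pull the text out with pdftotext. This is an illustration in Python, not the PHP parsing code mentioned above. It assumes pdfsandwich and pdftotext (from poppler-utils) are installed and on the PATH, and the function names are made up for the example.

```python
import shutil
import subprocess
from pathlib import Path


def ocr_output_path(pdf):
    """Mirror pdfsandwich's default output name: <stem>_ocr.pdf beside the input."""
    return pdf.with_name(pdf.stem + "_ocr.pdf")


def build_commands(pdf, lang="eng"):
    """Return the two command lines: OCR the scan, then dump its text layer."""
    ocr_pdf = ocr_output_path(pdf)
    txt = ocr_pdf.with_suffix(".txt")
    return [
        # -lang is passed through to tesseract; -o names the output file
        ["pdfsandwich", "-lang", lang, "-o", str(ocr_pdf), str(pdf)],
        # -layout tries to preserve the physical layout of the page
        ["pdftotext", "-layout", str(ocr_pdf), str(txt)],
    ]


def extract_text(pdf, lang="eng"):
    """Run both tools and return the extracted text (assumes both are installed)."""
    for cmd in build_commands(pdf, lang):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(cmd[0] + " is not installed or not on PATH")
        subprocess.run(cmd, check=True)
    txt = ocr_output_path(pdf).with_suffix(".txt")
    return txt.read_text(errors="replace")
```

Calling extract_text(Path("scan.pdf")) would leave scan_ocr.pdf and scan_ocr.txt next to the original. How good the text is still depends on the scan quality, so expect to tune things per document type.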

@Laura,

While reading up on it, I came across your own posting where you 
commented about your experience ... it is true what they say, the 
internet is forever ...


Now on to look for ways to wrap up my project!


Thank you all!!!


On 1/12/22 8:37 PM, Cesar Baquerizo wrote:
> Look for pdfsandwich. That should do what you want. Lots of info at the site.
>
> Regards
>
>> On Jan 12, 2022, at 8:28 PM, Jose Lerebours via Filepro-list <filepro-list at lists.celestial.com> wrote:
>>
>> I have a GSA that wants data extracted from PDF documents, most of which are scanned
>> documents saved as PDF, which in essence makes them images saved as PDF.
>>
>> I have written code in PHP to convert the PDF to PNG and extract text from the PNG, but this is
>> not proving to be reliable, since lots of characters are read wrong or not read at all.
>>
>> It is like pulling teeth: an "I want this done, but do not ask me to get you 'true' PDFs, the
>> scanned documents are all I can get" type of scenario.
>>
>> So, my question is: is anyone here successfully extracting data from scanned documents and if so,
>> what are you using?
>>
>> Regards,
>>
>>
>> -- 
>> Jose Lerebours
>> 954-559-7186
>> https://www.asisuites.com
>> Accounting - Retail - Wholesale - Distribution
>> Manufacturing - Warehousing - Transportation - eCommerce - Web Development

-- 
Jose Lerebours
954-559-7186
https://www.asisuites.com
Accounting - Retail - Wholesale - Distribution
Manufacturing - Warehousing - Transportation - eCommerce - Web Development


