OT: Help getting PDF to OCR or searchable form

Laura Brody laura.k.brody at gmail.com
Mon Sep 9 19:35:30 PDT 2019


I found a list of Linux flavors that pdfsandwich has been ported to, and
Raspbian was on the list!

I will be working on this project tomorrow. Thank you so much for this
lead. I don't think I would have found it by myself.

Laura Brody
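
A minimal sketch of the workflow discussed in this thread: OCR each scanned
PDF with pdfsandwich, then search every PDF in a directory for a keyword.
It assumes pdfsandwich is installed, and it also assumes pdftotext (from
poppler-utils, which nobody in the thread mentioned) is available for text
extraction. The function names are hypothetical, chosen for illustration.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def ocr_command(pdf: Path) -> List[str]:
    """Build the pdfsandwich command line for one scanned PDF."""
    # With no output option given, pdfsandwich writes <name>_ocr.pdf
    # next to the input file.
    return ["pdfsandwich", str(pdf)]


def ocr_directory(directory: Path) -> None:
    """Run pdfsandwich on every PDF that is not already an OCR result."""
    for pdf in sorted(directory.glob("*.pdf")):
        if pdf.stem.endswith("_ocr"):
            continue  # produced by an earlier run, skip it
        subprocess.run(ocr_command(pdf), check=True)


def pages_with_keyword(text: str, keyword: str) -> List[int]:
    """Return 1-based page numbers whose text contains the keyword."""
    # pdftotext separates pages with a form feed character.
    pages = text.split("\f")
    needle = keyword.lower()
    return [n for n, page in enumerate(pages, start=1)
            if needle in page.lower()]


def search_directory(directory: Path, keyword: str) -> Dict[str, List[int]]:
    """Map each PDF file name to the pages where the keyword appears."""
    hits = {}
    for pdf in sorted(directory.glob("*.pdf")):
        # The trailing "-" sends the extracted text to stdout.
        result = subprocess.run(["pdftotext", str(pdf), "-"],
                                capture_output=True, text=True, check=True)
        found = pages_with_keyword(result.stdout, keyword)
        if found:
            hits[pdf.name] = found
    return hits
```

Calling ocr_directory(Path("scans")) and then
search_directory(Path("scans"), "lease") would be one way to use it, where
"scans" is a placeholder directory. OCR text is rarely perfect, so a keyword
can be missed when the scan is poor or a word is split across lines.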

On Mon, Sep 9, 2019 at 10:27 PM Laura Brody <laura.k.brody at gmail.com> wrote:

> This is very interesting.
>
> The only Linux box I have running at the moment is a Raspberry Pi 3 B+. I
> have a 64GB SD card available, so space isn't an issue. Any idea if it will
> work on it?
>
> Laura Brody
>
> On Mon, Sep 9, 2019 at 9:54 PM Cesar Baquerizo <ces at cescom.com> wrote:
>
>> Look up Tesseract and pdfsandwich. They may help you.
>>
>> ------------------------------
>> *From:* Filepro-list <filepro-list-bounces+ces=
>> cescom.com at lists.celestial.com> on behalf of Laura Brody via
>> Filepro-list <filepro-list at lists.celestial.com>
>> *Sent:* Monday, September 9, 2019 9:50 PM
>> *To:* Filepro_List
>> *Cc:* Laura Brody
>> *Subject:* Re: OT: Help getting PDF to OCR or searchable form
>>
>> Additional information....
>>
>> I talked to the user and got some history...
>>
>> The user scanned in legal documents and saved the images as pages in a PDF.
>> That is why I can't search on keywords for most of the files. A few files
>> were typed up and then exported as PDF, but most are images of the pages.
>> That means that OCR has to be part of the solution.
>>
>> I discovered that Adobe Acrobat Reader has a setting to search all PDFs in
>> a directory for keywords. The problem is that these files don't contain
>> text. They contain images of text. Adobe can't search images and find
>> keywords.
>>
>> Laura Brody
>>
>> On Mon, Sep 9, 2019 at 8:03 PM Laura Brody <laura.k.brody at gmail.com>
>> wrote:
>>
>> > I am hoping that one of you has solved this problem before.....
>> >
>> > I have over a thousand pages of text in a dozen or so PDF files. Most
>> > files are "read-only" and I cannot use Ctrl-F to search for keywords. I
>> > would like to be able to OCR the files and put everything into one file
>> > that is searchable. Or is there a utility that will search all of the
>> > PDFs in a directory for a keyword?
>> >
>> > Suggestions anyone?
>> >
>> > Laura Brody
>> >
>