distinction between 2 different PDF files

mgroen
Posts: 147
Joined: 07.02.2020, 15:03

distinction between 2 different PDF files

#1 Post by mgroen » 01.03.2021, 13:11

Hi,
I have a question (which might lead to a feature request), hopefully this will be taken seriously.

See screenshot attached.

Explanation:
I have 2 PDF files: test1.pdf and test2.pdf.
The first one, test1.pdf, is a searchable PDF (meaning that if you open it you can search its contents with Ctrl+F); the other is just a flat (non-searchable) PDF.

Of course, this is just a test case with 2 files.
Is it possible for me (the user) to distinguish in one view between searchable and non-searchable PDF files? For example, an attribute in a column that shows whether a file is searchable (or something similar)?

The main goal behind this question is that I have lots of PDF files and I want to make them all searchable, but to do that I first need an overview of which PDF files are already searchable and which are not.

I hope it's clear what I mean; otherwise I will be happy to explain more.

Image

Odamn-Ete
Posts: 270
Joined: 28.06.2017, 07:10

Re: distinction between 2 different PDF files

#2 Post by Odamn-Ete » 03.03.2021, 08:20

Hi mgroen,

I think you are mainly talking about PDFs rendered as images versus PDFs containing text.
I'm not familiar with a practical way of differentiating between the two types.

You might want to look into tools such as Ghostscript.

Best regards,

mgroen
Posts: 147
Joined: 07.02.2020, 15:03

Re: distinction between 2 different PDF files

#3 Post by mgroen » 03.03.2021, 11:21

Odamn-Ete wrote: 03.03.2021, 08:20 Hi mgroen,

I think you are mainly talking about PDFs rendered as images versus PDFs containing text.
I'm not familiar with a practical way of differentiating between the two types.

You might want to look into tools such as Ghostscript.

Best regards,
Yes, I am talking about PDFs which have been OCRed (processed with optical character recognition) vs. image-type PDFs.
I was hoping there is a way to display which type a file is: OCRed (searchable) or image-type PDF.

I just noticed: Total Commander (another file manager) has a plugin that does seem to be able to do that. See https://totalcmd.net/plugring/pdfOCR.html
I have not tested it yet. Might it be possible to have a plugin for FreeCommander with similar functionality? Or (even better) to have it built into FreeCommander itself?

I will post a feature request so that this might be implemented.

Odamn-Ete
Posts: 270
Joined: 28.06.2017, 07:10

Re: distinction between 2 different PDF files

#4 Post by Odamn-Ete » 04.03.2021, 02:31

mgroen wrote: 03.03.2021, 11:21 ...
I just noticed: Total Commander (another file manager) has a plugin that does seem to be able to do that. See https://totalcmd.net/plugring/pdfOCR.html
I have not tested it yet. Might it be possible to have a plugin for FreeCommander with similar functionality? Or (even better) to have it built into FreeCommander itself?

I will post a feature request so that this might be implemented.
Reading about the plugin, I feel confirmed that it isn't so easy. The plugin you mentioned doesn't support Unicode file names. The features described sound very practical, but no Unicode, no want ;-)


Best Regards,

mgroen
Posts: 147
Joined: 07.02.2020, 15:03

Re: distinction between 2 different PDF files

#5 Post by mgroen » 04.03.2021, 11:18

Odamn-Ete wrote: 04.03.2021, 02:31
mgroen wrote: 03.03.2021, 11:21 ...
I just noticed: Total Commander (another file manager) has a plugin that does seem to be able to do that. See https://totalcmd.net/plugring/pdfOCR.html
I have not tested it yet. Might it be possible to have a plugin for FreeCommander with similar functionality? Or (even better) to have it built into FreeCommander itself?

I will post a feature request so that this might be implemented.
Reading about the plugin, I feel confirmed that it isn't so easy. The plugin you mentioned doesn't support Unicode file names. The features described sound very practical, but no Unicode, no want ;-)


Best Regards,
Ok, thanks for your reply.

I hope this functionality can be built into FC itself (I am hesitant to use plugins); that's why I have created a feature request for it. Hopefully it will be picked up soon! :)

Dreamer
Site Admin
Posts: 6113
Joined: 19.08.2007, 23:40

Re: distinction between 2 different PDF files

#6 Post by Dreamer » 04.03.2021, 22:10

Perhaps you could use the Search and Converter: search for any text or letter (for example "*" or "a") as the containing text, and for files *.pdf. The files in the results panel should contain text; the other PDF files should not.

For searching of the text in my Office files (docx, xlsx) I need the converter. Where can I find the converter and how can I use it in FreeCommander?

Odamn-Ete
Posts: 270
Joined: 28.06.2017, 07:10

Re: distinction between 2 different PDF files

#7 Post by Odamn-Ete » 05.03.2021, 07:10

Dreamer wrote: 04.03.2021, 22:10 Perhaps you could use the Search and Converter: search for any text or letter (for example "*" or "a") as the containing text, and for files *.pdf. The files in the results panel should contain text; the other PDF files should not.

For searching of the text in my Office files (docx, xlsx) I need the converter. Where can I find the converter and how can I use it in FreeCommander?
I tried this; it doesn't produce the desired results. The search function also lists non-OCRed PDFs that have the search text in their metadata. This would mislead one into thinking they are OCRed.
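
One possible way around the metadata problem, outside FreeCommander, is to look only at page content streams: visible or OCR text is drawn there with the PDF text operators Tj/TJ inside a BT block, while title and keywords are stored elsewhere. The Python sketch below is only a rough heuristic under stated assumptions: `has_page_text` is a hypothetical helper name, it handles only uncompressed or Flate-compressed streams, it does not really parse the PDF, and encrypted files or other stream filters will be misreported. A proper PDF library would be more reliable.

```python
import re
import sys
import zlib

# A text-showing operator: a string, hex string or array, then Tj or TJ,
# somewhere after a BT ("begin text") operator.
TEXT_SHOW = re.compile(rb"\bBT\b[^\x00]*?[)\]>]\s*T[jJ]\b")


def has_page_text(pdf_bytes: bytes) -> bool:
    """Heuristic (assumption, not a real parser): True if any non-image
    stream in the file appears to contain text-drawing operators."""
    for chunk in pdf_bytes.split(b"endobj"):
        if b"stream" not in chunk:
            continue  # plain dictionaries such as the Info metadata
        header, _, rest = chunk.partition(b"stream")
        if b"/Image" in header:
            continue  # scanned page images are not text
        data = rest.rpartition(b"endstream")[0].lstrip(b"\r\n")
        if b"/FlateDecode" in header:
            try:
                # decompressobj tolerates trailing bytes after the data
                data = zlib.decompressobj().decompress(data)
            except zlib.error:
                continue
        if TEXT_SHOW.search(data):
            return True
    return False


if __name__ == "__main__":
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            kind = "text found" if has_page_text(fh.read()) else "no text found"
        print(f"{name}: {kind}")
```

Because only streams are inspected, a search term that occurs just in the document title does not count as page text here.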

Best Regards,

mgroen
Posts: 147
Joined: 07.02.2020, 15:03

Re: distinction between 2 different PDF files

#8 Post by mgroen » 05.03.2021, 13:46

To make it clearer what I need, I made a screenshot.

In short again: I need an overview of files with their filenames and a mark/indicator showing whether each PDF file is searchable or not.

Here is what I need:

Image
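
As a stopgap outside FreeCommander, an overview of this kind (filename plus a searchable/not-searchable mark) could be sketched in Python. Everything here is an assumption for illustration: `pdf_overview` and `mentions_font` are hypothetical names, and the default check is deliberately crude. It only looks for a `/Font` entry in the raw bytes, so it will miss fonts stored in compressed object streams and will also flag image PDFs that merely carry a stamped page number. The check is passed in as a parameter so a better one can be swapped in.

```python
import re
import sys
from pathlib import Path
from typing import Callable, List, Tuple


def mentions_font(pdf_bytes: bytes) -> bool:
    """Crude placeholder check: does the raw file mention a /Font entry?"""
    return re.search(rb"/Font\b", pdf_bytes) is not None


def pdf_overview(
    folder: str,
    classify: Callable[[bytes], bool] = mentions_font,
) -> List[Tuple[str, str]]:
    """Return (filename, label) pairs for every *.pdf directly in folder."""
    rows = []
    for path in sorted(Path(folder).glob("*.pdf")):
        found = classify(path.read_bytes())
        rows.append((path.name, "likely searchable" if found else "likely image-only"))
    return rows


if __name__ == "__main__":
    # Folder to scan: first command-line argument, else the current folder.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, label in pdf_overview(target):
        print(f"{name:<40} {label}")
```

The labels say "likely" on purpose, since the result is a guess rather than a definitive classification.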
