Geekzone: technology news, blogs, forums




612 posts

Ultimate Geek

Subscriber

# 146971 4-Jun-2014 18:14

Hi Guys.

On a regular basis I get emails with PDF attachments. These attachments are scans of proofs of delivery for goods delivered to my clients.
I'm trying to figure out if there is any way of saving these attachments in a folder and then having a search function that I can use to find a specific file. Each scan has a unique number, e.g. S123456, as part of the text within the document. I am using OS X.

I trust this makes sense and explains what I'm trying to achieve. 

Thanks.

5654 posts

Uber Geek


  # 1059295 4-Jun-2014 19:04

Spotlight in OS X should search file contents automatically. What is the content of the PDFs though - do they contain actual text, or just an image of a page?

2547 posts

Uber Geek


  # 1059310 4-Jun-2014 19:21

How the PDFs are generated will be a big factor. As RunningMan pointed out, if the content is just a scanned image rather than text data, what you are asking would require some fairly in-depth OCR functionality. Would it be more practical to save the documents in such a way that their unique identifier is part of the filename? This would then be easily searchable.

 
 
 
 




612 posts

Ultimate Geek

Subscriber

  # 1059319 4-Jun-2014 19:34

RunningMan: Spotlight in OS X should search file contents automatically. What is the content of the PDFs though - do they contain actual text, or just an image of a page?


It's just an image of a page.



612 posts

Ultimate Geek

Subscriber

  # 1059320 4-Jun-2014 19:36

Inphinity: How the PDFs are generated will be a big factor. As RunningMan pointed out, if the content is just a scanned image rather than text data, what you are asking would require some fairly in-depth OCR functionality. Would it be more practical to save the documents in such a way that their unique identifier is part of the filename? This would then be easily searchable.


I have just saved some with the unique identifier as part of the file name. It works, but if I have 20 scans with S123456 etc. as the unique identifiers, the file name will get very long, and saving them in this manner is time-consuming. lol.
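
If one PDF ends up covering many identifiers, one possible alternative to a very long file name is a short file name plus a small index that maps each identifier to the file it appears in. The sketch below is only an illustration: the identifier pattern is an assumption based on the S123456 example in this thread, and the CSV index, file names, and function names are all hypothetical. The text passed to `extract_ids` could be typed by hand or come from an OCR step.

```python
import csv
import re
from pathlib import Path

# Assumption: identifiers look like the thread's example "S123456"
# (capital S followed by exactly six digits). Adjust to the real format.
ID_PATTERN = re.compile(r"\bS\d{6}\b")


def extract_ids(text):
    """Return the unique identifiers found in text, in order of first appearance."""
    return list(dict.fromkeys(ID_PATTERN.findall(text)))


def add_to_index(index_path, pdf_name, ids):
    """Append one (identifier, file name) row per identifier to a CSV index."""
    with Path(index_path).open("a", newline="") as handle:
        writer = csv.writer(handle)
        for identifier in ids:
            writer.writerow([identifier, pdf_name])


def find_files(index_path, identifier):
    """Return the file names recorded against an identifier."""
    with Path(index_path).open(newline="") as handle:
        return [row[1] for row in csv.reader(handle) if row and row[0] == identifier]
```

With this layout the PDF keeps a short name (a date, for instance) and a lookup such as `find_files("index.csv", "S123456")` returns every file recorded against that number.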

5654 posts

Uber Geek


  # 1059324 4-Jun-2014 19:45

What about the email subject line or body - does that have anything useful?



612 posts

Ultimate Geek

Subscriber

  # 1059341 4-Jun-2014 19:54

RunningMan: What about the email subject line or body - does that have anything useful?


No.

Subject line always says... POD's

Body always says... Hello Dxxxxx. Please find POD's attached. Thanks Bxxxxxxx.

22898 posts

Uber Geek

Trusted
Subscriber

  # 1059345 4-Jun-2014 20:05

You need a proper document storage system, not just a bunch of files in email. As a makeshift option, I think Acrobat on its own can OCR things, which should then make them searchable by the OS search tools.




Richard rich.ms

 
 
 
 


5654 posts

Uber Geek


  # 1059346 4-Jun-2014 20:07

Unless you can change the format that it gets sent in (either so the PDF isn't just a scanned image, or so the email has some useful info in it), you'll need some software to OCR the PDF so it is searchable.



612 posts

Ultimate Geek

Subscriber

  # 1059365 4-Jun-2014 20:11

RunningMan: Unless you can change the format that it gets sent in (either so the PDF isn't just a scanned image, or so the email has some useful info in it), you'll need some software to OCR the PDF so it is searchable.


It's an automated process from the carrier. They load the signed waybills into their scanner, and when it's finished the email is sent out automatically.
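
Because the emails arrive in a fixed, automated form, saving the attachments to a folder is something that could be scripted. A minimal sketch using only the Python standard library, assuming the messages have first been exported from the mail client as `.eml` files (how to export them depends on the client and is not covered here). The function name and folder layout are hypothetical.

```python
from email import message_from_bytes, policy
from pathlib import Path


def save_pdf_attachments(eml_path, out_dir):
    """Save every PDF attachment of one .eml message into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    message = message_from_bytes(Path(eml_path).read_bytes(), policy=policy.default)
    saved = []
    for part in message.iter_attachments():
        name = part.get_filename() or ""
        is_pdf = part.get_content_type() == "application/pdf" or name.lower().endswith(".pdf")
        if not is_pdf:
            continue
        # Keep only the base name so a crafted file name cannot escape out_dir.
        target = out_dir / (Path(name).name or "attachment.pdf")
        target.write_bytes(part.get_payload(decode=True))
        saved.append(target)
    return saved
```

Note that two emails carrying attachments with the same file name would overwrite each other in this sketch; adding the message date to the saved name is one way around that.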

2691 posts

Uber Geek

Trusted
Lifetime subscriber

  # 1059537 5-Jun-2014 08:33

Ask the carrier to look at whether they can OCR the PDFs.  If they are scanning via a photocopier to PDF, some of the copier vendors have basic document management solutions which include the OCRing of files.

3rd party options would monitor a folder for PDFs, OCR them, and then move the result into a different folder.  You could also do this at your end.  If there are high volumes you can script the detaching of PDFs to a folder.
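
The monitor, OCR, and move workflow described above can be sketched roughly as follows. This is an illustration, not a recommendation of a particular product: it assumes the open-source `ocrmypdf` command-line tool is installed for the OCR step, and the folder names and function names are hypothetical. The OCR step is passed in as a function so it can be swapped for whatever tool is actually available.

```python
import shutil
import subprocess
import time
from pathlib import Path


def ocr_with_ocrmypdf(src, dst):
    """Assumption: the `ocrmypdf` command-line tool is installed and on PATH."""
    subprocess.run(["ocrmypdf", str(src), str(dst)], check=True)


def process_inbox(inbox, searchable, originals, ocr=ocr_with_ocrmypdf):
    """OCR each PDF in inbox into searchable, then move the source to originals."""
    inbox, searchable, originals = Path(inbox), Path(searchable), Path(originals)
    searchable.mkdir(parents=True, exist_ok=True)
    originals.mkdir(parents=True, exist_ok=True)
    done = []
    for pdf in sorted(inbox.glob("*.pdf")):
        ocr(pdf, searchable / pdf.name)
        # Moving the source out of the inbox stops it being processed twice.
        shutil.move(str(pdf), str(originals / pdf.name))
        done.append(pdf.name)
    return done


def watch(inbox, searchable, originals, interval_seconds=60):
    """Poll the inbox forever; a simple stand-in for a real folder monitor."""
    while True:
        process_inbox(inbox, searchable, originals)
        time.sleep(interval_seconds)
```

Once the PDFs in the output folder carry a text layer, the operating system's own search should be able to find them by their contents.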




"4 wheels move the body.  2 wheels move the soul."

“Don't believe anything you read on the net. Except this. Well, including this, I suppose.” Douglas Adams
