Geekzone: technology news, blogs, forums

506 posts

Ultimate Geek
+1 received by user: 12

Subscriber

Topic # 146971 4-Jun-2014 18:14

Hi Guys.

On a regular basis I get emails with PDF attachments. These attachments are scans of proofs of delivery for goods delivered to my clients.
I'm trying to figure out if there is any way of saving these attachments in a folder and then using a search function to find a specific file. Each scan has a unique number, e.g. S123456, as part of the text within the document. I am using OS X.

I trust this makes sense and explains what I'm trying to achieve. 

Thanks.

4843 posts

Uber Geek
+1 received by user: 1504


  Reply # 1059295 4-Jun-2014 19:04

Spotlight in OS X should search file contents automatically. What is the content of the PDFs though - do they contain actual text, or just an image of a page?
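For anyone who would rather run that Spotlight search from a script, here is a minimal sketch. It assumes the `mdfind` command-line tool that ships with OS X, and it will only find PDFs that contain real text (or have already been OCR'd). The function names and the folder path in the usage note are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_mdfind_command(folder: str, term: str) -> list[str]:
    """Build an mdfind call that searches only inside `folder` for `term`."""
    return ["mdfind", "-onlyin", str(Path(folder).expanduser()), term]


def spotlight_search(folder: str, term: str) -> list[str]:
    """Run the search and return the matching file paths (OS X only)."""
    result = subprocess.run(
        build_mdfind_command(folder, term),
        capture_output=True,
        text=True,
        check=True,
    )
    # mdfind prints one matching path per line.
    return [line for line in result.stdout.splitlines() if line]
```

Something like `spotlight_search("~/Documents/PODs", "S123456")` would then list the files Spotlight has indexed with that number in them, assuming the PDFs were saved in that (hypothetical) folder.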

2503 posts

Uber Geek
+1 received by user: 928

Subscriber

  Reply # 1059310 4-Jun-2014 19:21

How the PDFs are generated will be a big factor. As RunningMan pointed out, if the content is just a scanned image rather than text data, what you are asking would require some fairly in-depth OCR functionality. Would it be more practical to save the documents in such a way that their unique identifier is part of the filename? This would then be easily searchable.




Windows 7 x64 // i5-3570K // 16GB DDR3-1600 // GTX660Ti 2GB // Samsung 830 120GB SSD // OCZ Agility4 120GB SSD // Samsung U28D590D @ 3840x2160 & Asus PB278Q @ 2560x1440
Samsung Galaxy S5 SM-G900I w/Spark



506 posts

Ultimate Geek
+1 received by user: 12

Subscriber

  Reply # 1059319 4-Jun-2014 19:34

RunningMan: Spotlight in OS X should search file contents automatically. What is the content of the PDFs though - do they contain actual text, or just an image of a page?


It's just an image of a page.



506 posts

Ultimate Geek
+1 received by user: 12

Subscriber

  Reply # 1059320 4-Jun-2014 19:36

Inphinity: How the PDFs are generated will be a big factor. As RunningMan pointed out, if the content is just a scanned image rather than text data, what you are asking would require some fairly in-depth OCR functionality. Would it be more practical to save the documents in such a way that their unique identifier is part of the filename? This would then be easily searchable.


I have just saved some with the unique identifier as part of the file name. It works, but if I have 20 scans with S123456 etc. as the unique identifiers, the file name will get very long, and saving them this way is time consuming. lol.
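One possible way around the long file names is to leave the files as they are and keep a small index from identifier to file instead. The sketch below assumes the text of each scan has already been extracted somehow (the scans are images, so that means OCR), and that every identifier looks like the S123456 example: a capital S followed by six digits. Both are assumptions, and the function names are made up.

```python
import re
from collections import defaultdict

# Assumed format: capital S followed by exactly six digits, as in S123456.
POD_ID = re.compile(r"\bS\d{6}\b")


def find_pod_ids(text: str) -> list[str]:
    """Return the distinct POD numbers found in `text`, sorted."""
    return sorted(set(POD_ID.findall(text)))


def build_index(texts_by_file: dict[str, str]) -> dict[str, list[str]]:
    """Map each POD number to the files whose text mentions it."""
    index = defaultdict(list)
    for filename, text in texts_by_file.items():
        for pod_id in find_pod_ids(text):
            index[pod_id].append(filename)
    return dict(index)
```

Looking up `index["S123456"]` then gives the file or files to open, however many identifiers each PDF holds.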

4843 posts

Uber Geek
+1 received by user: 1504


  Reply # 1059324 4-Jun-2014 19:45

What about the email subject line or body - does that have anything useful?



506 posts

Ultimate Geek
+1 received by user: 12

Subscriber

  Reply # 1059341 4-Jun-2014 19:54

RunningMan: What about the email subject line or body - does that have anything useful?


No.

Subject line always says... POD's

Body always says....Hello Dxxxxx. Please find POD's attached. Thanks Bxxxxxxx.

21121 posts

Uber Geek
+1 received by user: 4210

Trusted
Subscriber

  Reply # 1059345 4-Jun-2014 20:05

You need a proper document storage system, not just a bunch of files in email. As a makeshift option, I think Acrobat on its own can OCR things, which should then make them searchable by the OS search tools.




Richard rich.ms

4843 posts

Uber Geek
+1 received by user: 1504


  Reply # 1059346 4-Jun-2014 20:07

Unless you can change the format that it gets sent in - so that either the PDF isn't just a scanned image, or the email has some useful info in it - then you'll need some software to OCR the PDF so it is searchable.
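As a rough idea of what that OCR step could look like when scripted, here is a sketch that runs every PDF in one folder through an OCR tool and writes the searchable copy to another folder. It assumes the open-source `ocrmypdf` command-line tool is installed (nobody in this thread has named a specific product, so that choice is mine), and the function and folder names are hypothetical. The OCR call can be swapped for any other tool.

```python
import subprocess
from pathlib import Path
from typing import Callable


def run_ocrmypdf(src: Path, dst: Path) -> None:
    """OCR one PDF with the ocrmypdf command-line tool (installed separately)."""
    subprocess.run(["ocrmypdf", str(src), str(dst)], check=True)


def ocr_folder(
    inbox: Path,
    searchable: Path,
    ocr: Callable[[Path, Path], None] = run_ocrmypdf,
) -> list[Path]:
    """OCR every PDF in `inbox` that has no counterpart in `searchable` yet."""
    searchable.mkdir(parents=True, exist_ok=True)
    done = []
    for src in sorted(inbox.glob("*.pdf")):
        dst = searchable / src.name
        if dst.exists():
            continue  # already processed on an earlier run
        ocr(src, dst)
        done.append(dst)
    return done
```

Because files that already exist in the output folder are skipped, the function can be re-run on a schedule without redoing old scans. Once the output PDFs have a text layer, the OS search tools should be able to find the identifiers inside them.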



506 posts

Ultimate Geek
+1 received by user: 12

Subscriber

  Reply # 1059365 4-Jun-2014 20:11

RunningMan: Unless you can change the format that it gets sent in - so that either the PDF isn't just a scanned image, or the email has some useful info in it - then you'll need some software to OCR the PDF so it is searchable.


It's an automated process from the carrier. They load the signed waybills into their scanner, and when it's finished the email is sent out automatically.

2382 posts

Uber Geek
+1 received by user: 694

Trusted
Lifetime subscriber

  Reply # 1059537 5-Jun-2014 08:33

Ask the carrier to look at whether they can OCR the PDFs.  If they are scanning via a photocopier to PDF, some of the copier vendors have basic document management solutions which include the OCRing of files.

3rd party options would monitor a folder for PDFs, OCR them, and then move the result into a different folder.  You could also do this at your end.  If there are high volumes you can script the detaching of PDFs to a folder.
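To give an idea of what scripting the detaching could involve, here is a minimal sketch using only Python's standard `email` package. It assumes each email is available as raw message bytes (for example a saved .eml file); how you get the messages out of your mail client is not covered, and the function name is made up.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def save_pdf_attachments(raw_message: bytes, out_dir: Path) -> list[Path]:
    """Write every PDF attachment in one raw email message to `out_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    message = BytesParser(policy=policy.default).parsebytes(raw_message)
    saved = []
    for part in message.iter_attachments():
        name = part.get_filename() or "attachment.pdf"
        is_pdf = (
            part.get_content_type() == "application/pdf"
            or name.lower().endswith(".pdf")
        )
        if not is_pdf:
            continue
        target = out_dir / Path(name).name  # drop any directory part of the name
        target.write_bytes(part.get_payload(decode=True))
        saved.append(target)
    return saved
```

Note that this overwrites a file if two attachments share a name, so if the carrier's scanner reuses file names you would want to add a date or counter to `target`. The output folder can then be the one your OCR tool monitors.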




"4 wheels move the body.  2 wheels move the soul."

“Don't believe anything you read on the net. Except this. Well, including this, I suppose.” Douglas Adams
