OpenNTF.org - Extract data from PDF files - Advanced
   OpenNTF Code Bin
About This Code
Brief Description:
Extract data from PDF files - fields, text, pages, bookmarks 
Rating:
Not Rated Yet 
Contributor:
Jason T Johnston 
Category:
Lotusscript 
Type:
Utilities 
Document Release:
1.0 
Notes Version:
R8.x, R7.x 
Last Modified:
30 Apr 2008 
OpenNTF Disclaimer

All of the program code and information presented in the OpenNTF.org Code Bin are provided "as-is", and should be used at your own risk. OpenNTF.org makes no express or implied warranty about anything in the Code Bin, and OpenNTF.org will not be responsible or liable for any damage caused by the use or misuse of anything from this site. OpenNTF.org makes no guarantees about anything. Please thoroughly test all of the knowledge and code you find here before you attempt to use them in your production environment.

Code / Description
This database was created for the purpose of helping people who want to parse values from fields on a PDF form, extract text from PDF files, pull the bookmark names, and extract individual pages from a PDF file.


I pulled many of these examples from all over the web via Google, many of them from posts on www.notes.net, and I'm sure I did some searching on openntf.org. I am in no way taking credit for all of this code; it came from so many sources, and I did not keep track of them, so I am just saying it's not all mine. I simply put about 20 examples into one place so that we could all share.

Requirements: Built using Notes 7.0.2. You also need Adobe Professional (I used Adobe Pro 7.x), and your PDF file must have bookmarks. Again, you MUST HAVE ADOBE PROFESSIONAL.
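
Because the code depends on Adobe Professional, it presumably drives Acrobat from LotusScript through OLE automation. The following is a minimal sketch of the page-extraction part, not the code shipped in the database. It assumes that "Adobe Professional" means Adobe Acrobat Professional and that Acrobat registers the AcroExch.PDDoc OLE class; the routine name SplitPdfPages and the output file naming are hypothetical.

```lotusscript
Sub SplitPdfPages(srcPath As String, outDir As String)
	' Sketch only: assumes Acrobat Professional is installed and
	' registers the AcroExch.PDDoc OLE class.
	Const PD_SAVE_FULL = 1   ' value of Acrobat's PDSaveFull save flag

	Dim srcDoc As Variant
	Dim pageDoc As Variant
	Dim pageCount As Long
	Dim i As Long

	Set srcDoc = CreateObject("AcroExch.PDDoc")
	If srcDoc.Open(srcPath) = False Then
		Print "Could not open " & srcPath
		Exit Sub
	End If

	pageCount = srcDoc.GetNumPages()
	For i = 0 To pageCount - 1
		' Build a new, empty PDF and copy a single page into it
		Set pageDoc = CreateObject("AcroExch.PDDoc")
		Call pageDoc.Create()
		' -1 inserts before the first page; i is the zero-based source page;
		' 1 is the number of pages to copy; 0 means do not copy bookmarks
		Call pageDoc.InsertPages(-1, srcDoc, i, 1, 0)
		Call pageDoc.Save(PD_SAVE_FULL, outDir & "\page" & Cstr(i + 1) & ".pdf")
		Call pageDoc.Close()
	Next

	Call srcDoc.Close()
End Sub
```

Calling SplitPdfPages("c:\temp\form.pdf", "c:\temp") would, under these assumptions, write one file per page (page1.pdf, page2.pdf, and so on) into c:\temp. The input file name here is only an example.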

If you have questions, comments or suggestions, please feel free to contact me at jason@ciaresearch.com.

Be warned that this code is in no way complete. I guessed at many of the things I did in here because I could not find good and complete examples. Please share your thoughts, comments, and code improvements with all of us. I will update the code if I get any great ideas from people. Thanks for taking a look, and I hope this helps many of you.

Jason

Usage / Example
In the default view, create a document with a PDF attachment that has bookmarks, then click the "Parse PDF" button. Your results will be stored in the c:\temp directory.
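
Before Acrobat can open the PDF, the attachment has to be written to disk. The sketch below shows one way to do that with the standard Notes classes. It is not taken from the database: the function name DetachPdfs and the rich text field name "Body" are assumptions.

```lotusscript
Function DetachPdfs(doc As NotesDocument, outDir As String) As Integer
	' Sketch only: writes every PDF attachment found in the "Body" field
	' (assumed field name) to outDir and returns how many were written.
	Dim rtitem As Variant

	DetachPdfs = 0
	Set rtitem = doc.GetFirstItem("Body")
	If rtitem Is Nothing Then Exit Function
	If rtitem.Type <> RICHTEXT Then Exit Function
	' EmbeddedObjects is not an array when the field holds no objects
	If Not Isarray(rtitem.EmbeddedObjects) Then Exit Function

	Forall o In rtitem.EmbeddedObjects
		If o.Type = EMBED_ATTACHMENT Then
			' For file attachments, Source holds the file name
			If Lcase(Right(o.Source, 4)) = ".pdf" Then
				Call o.ExtractFile(outDir & "\" & o.Source)
				DetachPdfs = DetachPdfs + 1
			End If
		End If
	End Forall
End Function
```

A button or agent could call DetachPdfs(doc, "c:\temp") on the selected document and then hand each extracted file to the PDF-processing code.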
Code Attachments
PDFParser.nsf (596 Kbytes)
 Comments
Posted by Michael Marcavage on 05/01/2008 07:43:20 AM
Nicely Done
Very nicely done; however, you are relying on the user having Adobe Professional installed on the PC.
Adobe Professional costs a lot of money. Maybe you could look into using Java to do the same thing?
There is an open source JAR file that should let you do this without needing Adobe Professional. The JAR file is iText; do a Google search for it. I have been using it for years now and it works great, with no need for Adobe Professional.
Please understand I am by no means bashing what you have done here....
I think you did an amazing job on this.
Just offering you an alternative to needing Adobe Professional...
Posted by Jason T Johnston on 05/01/2008 04:25:47 PM
Sample DB please
Thanks, Michael, for the comments. I have seen iText in the past, but was never able to get it working. I would love to have a Notes database that had some sample code in it. If you have something like this that would loop through the fields, extract pages, loop through bookmark names and grab the text off a page, I would love to use that instead (or just a small sample that loops through the fields, and then I could figure out the rest). I am not great at Java, but if I had something to start with, I could make it work.
Thanks,
Jason