Package org.jpedal.examples.text
Class ExtractOutline
java.lang.Object
org.jpedal.examples.text.ExtractOutline
Outline Object Data Extraction from PDF files
This class provides a simple Java API to extract the data in the Outline Data object (if present) from a PDF file as a Document object. It also provides a static convenience method for dumping all the outlines from a single PDF file or a directory containing PDF files.
See our Support Pages for more information on Text Extraction.
Constructor Summary
Constructor - Description
ExtractOutline(byte[] byteArray) - Sets up an ExtractOutline instance to open a PDF file contained as a BLOB within a byte[] stream
ExtractOutline(String fileName) - Sets up an ExtractOutline instance to open a PDF file
Method Summary
Modifier and Type, Method - Description
void closePDFfile() - Ensures the PDF file is closed once it is no longer needed and all resources are released
int getPageCount() - Number of pages in the PDF file (starting at 1)
Document getPDFTextOutline() - Gets the Document Outline object (if present) as a Document structure
static void main(String[] args) - Extracts any Outline data via the command line from a single PDF file or a directory of PDF files
boolean openPDFFile() - Opens the PDF file so that it can be accessed
void setPassword(String password)
static void writeAllOutlinesToDir(String input, String outputDir) - Convenience method to write all the Outlines in a directory of PDF files
static void writeAllOutlinesToDir(String input, String password, String outputDir) - Convenience method to write all the Outlines in a directory of PDF files
Constructor Details

ExtractOutline
public ExtractOutline(String fileName)
Sets up an ExtractOutline instance to open a PDF file
Parameters:
fileName - full path to a single PDF file

ExtractOutline
public ExtractOutline(byte[] byteArray)
Sets up an ExtractOutline instance to open a PDF file contained as a BLOB within a byte[] stream
Parameters:
byteArray - PDF file data
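
As a minimal sketch of how the byte[] argument might be prepared, the helper below reads a file into memory and checks that the data starts with the "%PDF-" header before it is handed to the constructor. The names PdfBytes, looksLikePdf and load are hypothetical and are not part of JPedal; the header check is deliberately strict and is only an illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of JPedal): loads PDF data into a byte[]. */
final class PdfBytes {

    // Marker expected at the very start of a PDF file.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with the "%PDF-" marker. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the whole file and rejects data that does not start like a PDF. */
    static byte[] load(Path pdf) {
        try {
            byte[] data = Files.readAllBytes(pdf);
            if (!looksLikePdf(data)) {
                throw new IllegalArgumentException("Not a PDF: " + pdf);
            }
            return data;
        } catch (IOException e) {
            // Wrapped so callers are not forced to handle a checked exception.
            throw new UncheckedIOException(e);
        }
    }
}
```

With JPedal on the classpath, the returned array could then be passed to the documented constructor, for example new ExtractOutline(PdfBytes.load(path)).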
Method Details

main
public static void main(String[] args)
This class allows you to extract any Outline data via the command line from a single PDF file or a directory of PDF files.
The example expects two arguments:
- Value 1 is the file name or directory of PDF files to process
- Value 2 is the directory to write the outline data to
Parameters:
args - The expected arguments are described above.
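
The two expected values can be illustrated with a small argument check. This is only a sketch of one way to validate such a command line, not the logic JPedal itself uses; OutlineArgs and parse are hypothetical names, and the requirement that the input path already exists is an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical holder for the two command-line values described above. */
final class OutlineArgs {

    final Path input;      // Value 1: a PDF file or a directory of PDF files
    final Path outputDir;  // Value 2: directory to write the outline data to

    private OutlineArgs(Path input, Path outputDir) {
        this.input = input;
        this.outputDir = outputDir;
    }

    /** Validates the argument count and that the input path exists. */
    static OutlineArgs parse(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException(
                    "Expected 2 values: <pdf-file-or-directory> <output-directory>");
        }
        Path input = Paths.get(args[0]);
        if (!Files.exists(input)) {
            throw new IllegalArgumentException("Input not found: " + input);
        }
        return new OutlineArgs(input, Paths.get(args[1]));
    }
}
```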

writeAllOutlinesToDir
public static void writeAllOutlinesToDir(String input, String outputDir) throws PdfException
Convenience method to write all the Outlines in a directory of PDF files
Parameters:
input - directory containing PDF files
outputDir - directory for writing out data
Throws:
PdfException - A Pdf Exception

writeAllOutlinesToDir
public static void writeAllOutlinesToDir(String input, String password, String outputDir) throws PdfException
Convenience method to write all the Outlines in a directory of PDF files
Parameters:
input - file or directory containing PDF files
password - password to be used to open files
outputDir - directory for writing out data
Throws:
PdfException - A Pdf Exception

getPDFTextOutline
public Document getPDFTextOutline()
Gets the Document Outline object (if present) as a Document structure
Returns:
Document
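
Once a Document has been obtained, it can be traversed like any other DOM tree. The sketch below assumes the returned type is org.w3c.dom.Document and that a missing outline is reported as null; neither is stated on this page. OutlineWalker is a hypothetical helper, and the element and attribute names in sample() are invented stand-ins, since the real outline schema is not described here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper: prints a DOM tree as indented lines. */
final class OutlineWalker {

    /** Renders each element as one line: indentation, name, then attributes. */
    static String describe(Document outline) {
        StringBuilder out = new StringBuilder();
        // Assumption: null (or an empty document) means no outline is present.
        if (outline != null && outline.getDocumentElement() != null) {
            walk(outline.getDocumentElement(), 0, out);
        }
        return out.toString();
    }

    private static void walk(Element element, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(element.getNodeName());
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            out.append(' ').append(attr.getNodeName()).append('=').append(attr.getNodeValue());
        }
        out.append('\n');
        // Recurse into child elements only; text and comment nodes are skipped.
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                walk((Element) children.item(i), depth + 1, out);
            }
        }
    }

    /** Builds a stand-in document; the names used here are invented. */
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            Element chapter = doc.createElement("title");
            chapter.setAttribute("page", "1");
            root.appendChild(chapter);
            Element section = doc.createElement("title");
            section.setAttribute("page", "3");
            chapter.appendChild(section);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the walk is generic, it makes no assumption about which elements a real outline contains; it simply mirrors the nesting of whatever Document is supplied.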

setPassword
public void setPassword(String password)
Parameters:
password - the USER or OWNER password for the PDF file

getPageCount
public int getPageCount()
Number of pages in the PDF file (starting at 1)
Returns:
page count

openPDFFile
public boolean openPDFFile() throws PdfException
Opens the PDF file so that it can be accessed
Returns:
true if successful
Throws:
PdfException - if there is a problem opening the PDF file

closePDFfile
public void closePDFfile()
Ensures the PDF file is closed once it is no longer needed and all resources are released
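
One way to make sure the file is always closed is to pair the open and close calls in a try/finally block. The sketch below shows that pattern against OutlineSource, a hypothetical stand-in interface that mirrors the documented method names so the example compiles without the JPedal jar. OutlineSession and withOpenPdf are likewise hypothetical, and calling closePDFfile() after a failed open is an assumption about what the library tolerates.

```java
import java.util.function.Function;

/** Hypothetical stand-in that mirrors the documented method names. */
interface OutlineSource {
    boolean openPDFFile() throws Exception;
    int getPageCount();
    void closePDFfile();
}

/** Hypothetical helper: runs a task against an opened source, then closes it. */
final class OutlineSession {

    static <T> T withOpenPdf(OutlineSource source, Function<OutlineSource, T> task) {
        try {
            if (!source.openPDFFile()) {
                throw new IllegalStateException("PDF could not be opened");
            }
            return task.apply(source);
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Checked failures from opening are wrapped for the caller.
            throw new IllegalStateException("Failed to open PDF", e);
        } finally {
            // Runs on success and on failure, so resources are always released.
            source.closePDFfile();
        }
    }
}
```

With JPedal available, a thin adapter that forwards these three calls to an ExtractOutline instance would let the same helper be reused, for example to read the page count or the outline Document.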