Package org.jpedal.examples.acroform
Class ExtractEmbeddedFiles
java.lang.Object
org.jpedal.examples.BaseExample
org.jpedal.examples.acroform.ExtractEmbeddedFiles
public class ExtractEmbeddedFiles
extends org.jpedal.examples.BaseExample
File Extraction from PDF files
This class provides a simple Java API to extract embedded files and file attachments from a PDF file, plus a static convenience method if you just want to dump all files from a PDF file or a directory containing PDF files. All files are extracted to a folder at the given output location, with a name matching the PDF filename.
Example 1 - access API methods
ExtractEmbeddedFiles extract = new ExtractEmbeddedFiles("C:/pdfs/mypdf.pdf");
//extract.setPassword("password");
if (extract.openPDFFile()) {
    // files held in the EmbeddedFiles dictionary
    if (extract.containsEmbeddedFiles()) {
        extract.extractEmbeddedFiles("C:/output/");
    }
    // files held in file attachment annotations
    if (extract.containsFilesAttachments()) {
        extract.extractFileAttachments("C:/output");
    }
}
extract.closePDFfile();
Example 2 - convenience static method
Extract all embedded files and file attachments from a PDF.
ExtractEmbeddedFiles.extractAllFilesFromPdf("C:/pdfs/mypdf.pdf", "C:/output");
Example 3 - Access directly from the Jar
ExtractEmbeddedFiles can be run directly from the jar. The following command extracts all embedded files and file attachments from a PDF file or directory to a defined output directory:
java -cp libraries_needed org/jpedal/examples/acroform/ExtractEmbeddedFiles inputValues
Where inputValues consists of the following values:
- First value: The PDF filename (including the path if needed) or a directory containing PDF files. If it contains spaces it must be enclosed in double quotes (e.g. "C:/Path with spaces/").
- Second value: The location to write out extracted files from the PDF file or files. If it contains spaces it must be enclosed in double quotes (e.g. "C:/Path with spaces/").
-
Constructor Summary
Constructor                             Description
ExtractEmbeddedFiles(byte[] byteArray)
ExtractEmbeddedFiles(String fileName)
-
Method Summary
Modifier and Type          Method                                                   Description
boolean                    containsEmbeddedFiles()                                  Method to flag if the current file contains embedded files.
boolean                    containsFilesAttachments()                               Method to flag if the current file contains file attachments.
static Map<String,byte[]>  extractAllEmbeddedFilesAsMap(String inputFilename)
                           extractAllFileAttachmentFilesOnPage(int page)
static Map<String,byte[]>  extractAllFileAttachmentsAsMap(String inputFilename)
static Map<String,byte[]>  extractAllFileAttachmentsOnPageAsMap(String inputFilename, int page)
static void                extractAllFilesFromPdf(String inputDir, String outputDir)  Static method to extract all embedded files and file attachments from a PDF file or a directory of PDF files.
byte[]                     extractEmbeddedFile(String requestedFile)
void                       extractEmbeddedFiles(String outputDirectory)             Extract embedded files and place them in the output directory specified.
byte[]                     extractFileAttachment(String requestedFile)
void                       extractFileAttachments(String outputDirectory)           Extract files from file attachment annotations in the open file and place them in the output directory specified.
String[]                   getEmbeddedFileNames
String[]                   getFileAttachmentNames
static void                main
void                       setPassword(String password)
void                       showEmbeddedFilesDetails()
void                       showFileAttachmentDetails()
Methods inherited from class org.jpedal.examples.BaseExample
closePDFfile, openPDFFile
-
Constructor Details
-
ExtractEmbeddedFiles
public ExtractEmbeddedFiles(String fileName)
-
ExtractEmbeddedFiles
public ExtractEmbeddedFiles(byte[] byteArray)
-
-
Method Details
-
main
-
setPassword
Parameters:
password - the USER or OWNER password for the PDF file
-
containsFilesAttachments
public boolean containsFilesAttachments()
Method to flag if the current file contains file attachments.
Returns:
true if file attachments are present, otherwise false.
-
extractFileAttachments
Extract files from file attachment annotations in the open file and place them in the output directory specified. A directory named after the PDF is created in the given directory and contains all extracted files. Any existing files of the same name are replaced. This does not extract files contained within the EmbeddedFiles dictionary (such as those found in Portfolios).
Parameters:
outputDirectory - Path where the extracted files should be saved.
-
extractAllEmbeddedFilesAsMap
public static Map<String,byte[]> extractAllEmbeddedFilesAsMap(String inputFilename) throws PdfException
Throws:
PdfException
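The source does not describe the returned map, so the following is only a minimal sketch of one way to save such a Map<String,byte[]> to disk. It assumes each key is a file name and each value is that file's raw content. MapFileWriter and writeAll are hypothetical names for illustration and are not part of JPedal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper for illustration; not part of JPedal.
class MapFileWriter {

    // Writes each map entry to targetDir and returns how many files were written.
    // Assumption: keys are file names, values are the raw file contents.
    static int writeAll(Map<String, byte[]> files, Path targetDir) {
        Path base = targetDir.toAbsolutePath().normalize();
        int written = 0;
        try {
            Files.createDirectories(base);
            for (Map.Entry<String, byte[]> entry : files.entrySet()) {
                Path out = base.resolve(entry.getKey()).normalize();
                // Skip names that would resolve to the directory itself or outside it.
                if (out.equals(base) || !out.startsWith(base)) {
                    continue;
                }
                Files.createDirectories(out.getParent());
                // By default Files.write creates the file or truncates an existing one.
                Files.write(out, entry.getValue());
                written++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

A caller could pass the map returned by extractAllEmbeddedFilesAsMap together with a directory of its choice, for example MapFileWriter.writeAll(files, Paths.get("C:/output")).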
-
extractEmbeddedFile
-
getEmbeddedFileNames
-
extractAllFileAttachmentsAsMap
public static Map<String,byte[]> extractAllFileAttachmentsAsMap(String inputFilename) throws PdfException
Throws:
PdfException
-
extractAllFileAttachmentsOnPageAsMap
public static Map<String,byte[]> extractAllFileAttachmentsOnPageAsMap(String inputFilename, int page) throws PdfException
Throws:
PdfException
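When working with one of the Map<String,byte[]> results, it can help to list what was found before writing anything out. The sketch below is a hedged illustration that assumes the keys are attachment names and the values are their contents, which the source does not state. AttachmentSummary and describe are hypothetical names, not JPedal API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of JPedal.
class AttachmentSummary {

    // Returns one "name (N bytes)" line per entry, sorted by name.
    // Assumption: keys are non-null attachment names, values are the file contents.
    static List<String> describe(Map<String, byte[]> files) {
        // A TreeMap iterates its keys in natural String order.
        Map<String, byte[]> sorted = new TreeMap<>(files);
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : sorted.entrySet()) {
            byte[] data = entry.getValue();
            int size = (data == null) ? 0 : data.length;
            lines.add(entry.getKey() + " (" + size + " bytes)");
        }
        return lines;
    }
}
```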
-
extractFileAttachment
-
extractAllFileAttachmentFilesOnPage
-
getFileAttachmentNames
-
containsEmbeddedFiles
public boolean containsEmbeddedFiles()
Method to flag if the current file contains embedded files.
Returns:
true if embedded files are present, otherwise false.
-
extractEmbeddedFiles
Extract embedded files and place them in the output directory specified. A directory named after the PDF is created in the given directory and contains all extracted files. Any existing files of the same name are replaced. This does not extract files contained within File Attachment annotations.
Parameters:
outputDirectory - Path where the extracted files should be saved.
-
extractAllFilesFromPdf
Static method to extract all embedded files and file attachments from a PDF file or a directory of PDF files.
Parameters:
inputDir - PDF file or directory of PDF files to process
outputDir - directory to write the extracted files to
Throws:
PdfException
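The input may be a single PDF file or a directory containing PDF files. How JPedal resolves that input internally is not described here. The sketch below only illustrates the general idea with the standard library: PdfInputResolver and listPdfs are hypothetical names, and matching on a case-insensitive ".pdf" extension without recursing into sub-directories is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not how JPedal itself resolves its input.
class PdfInputResolver {

    // Returns the PDF files denoted by input: the file itself, or the PDFs directly inside a directory.
    static List<Path> listPdfs(Path input) {
        if (Files.isRegularFile(input)) {
            return isPdf(input) ? Collections.singletonList(input) : Collections.<Path>emptyList();
        }
        if (!Files.isDirectory(input)) {
            return Collections.<Path>emptyList();
        }
        List<Path> result = new ArrayList<>();
        // Files.list is not recursive; the stream must be closed after use.
        try (Stream<Path> entries = Files.list(input)) {
            entries.filter(Files::isRegularFile)
                   .filter(PdfInputResolver::isPdf)
                   .sorted()
                   .forEach(result::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Assumption: a PDF is recognised by its file extension, ignoring case.
    private static boolean isPdf(Path path) {
        return path.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".pdf");
    }
}
```

Each returned path could then be handed to the instance API shown in Example 1, one file at a time.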
-
showEmbeddedFilesDetails
public void showEmbeddedFilesDetails()
-
showFileAttachmentDetails
public void showFileAttachmentDetails()
-