Package org.jpedal.examples.images
Class ExtractImages
java.lang.Object
org.jpedal.examples.images.ExtractImages
Image Extraction from PDF files
This class provides a simple Java API to extract images from a PDF file, and also a static convenience method if you just want to dump all the images from a PDF file or a directory containing PDF files.
Constructor Summary
Constructor and Description
ExtractImages(byte[] byteArray)
    Sets up an ExtractImages instance to open a PDF file contained as a BLOB within a byte[] stream.
ExtractImages(String fileName)
    Sets up an ExtractImages instance to open a PDF file.
Method Summary
Modifier and Type, Method and Description
void closePDFfile()
    Ensures the PDF file is closed and all resources are released once it is no longer needed.
BufferedImage getImage(int page, int imageNumber, boolean imageAsDisplayed)
    Extracts any image from any page. Processing the images on each page in turn is recommended because it is quicker.
int getImageCount(int page)
    Returns the image count for the selected page.
org.jpedal.objects.PdfImageData getImageData(int page)
int getPageCount()
    Number of pages in the PDF file (page numbering starts at 1).
static void main(String[] args)
    This class will allow you to extract images via the command line from a single PDF file or a directory of PDF files.
boolean openPDFFile()
    Opens the PDF file so it can be accessed. The return value needs to be checked, as it will be false if the file cannot be opened for any reason.
void setPassword(String password)
    Sets the owner or user password to use when opening an encrypted PDF file.
static void writeAllImagesToDir(String inputDir, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs)
    Convenience method to extract all the images in a directory of PDF files.
static void writeAllImagesToDir(String inputDir, String password, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs)
    Convenience method to extract all the images in a directory of PDF files.
static void writeAllImagesToDir(String inputDir, String password, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs, ErrorTracker errorTracker)
    Convenience method to extract all the images in a directory of PDF files.
Constructor Details

ExtractImages
public ExtractImages(String fileName)
Sets up an ExtractImages instance to open a PDF file.
Parameters:
fileName - full path to a single PDF file

ExtractImages
public ExtractImages(byte[] byteArray)
Sets up an ExtractImages instance to open a PDF file contained as a BLOB within a byte[] stream.
Parameters:
byteArray - PDF file data

Method Details

writeAllImagesToDir
public static void writeAllImagesToDir(String inputDir, String password, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs) throws PdfException
Convenience method to extract all the images in a directory of PDF files.
Parameters:
inputDir - directory containing PDF files
password - password used to open PDF files
outputDir - directory for writing out images
imageType - 3 letter value for image format to be used
generateMetaData - if true, include an additional XML file with metadata on the image
outputPagesInSepDirs - if true, place images from each page in a separate sub-directory
Throws:
PdfException - if there is a problem processing the PDF files

writeAllImagesToDir
public static void writeAllImagesToDir(String inputDir, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs) throws PdfException
Convenience method to extract all the images in a directory of PDF files.
Parameters:
inputDir - directory containing PDF files
outputDir - directory for writing out images
imageType - 3 letter value for image format to be used
generateMetaData - if true, include an additional XML file with metadata on the image
outputPagesInSepDirs - if true, place images from each page in a separate sub-directory
Throws:
PdfException - if there is a problem processing the PDF files

writeAllImagesToDir
public static void writeAllImagesToDir(String inputDir, String password, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs, ErrorTracker errorTracker) throws PdfException
Convenience method to extract all the images in a directory of PDF files.
Parameters:
inputDir - directory containing PDF files
password - password used to open PDF files
outputDir - directory for writing out images
imageType - 3 letter value for image format to be used
generateMetaData - if true, include an additional XML file with metadata on the image
outputPagesInSepDirs - if true, place images from each page in a separate sub-directory
errorTracker - a custom error tracker
Throws:
PdfException - if there is a problem processing the PDF files

main
public static void main(String[] args)
This class will allow you to extract images via the command line from a single PDF file or a directory of PDF files.
The example expects three parameters:
- Value 1 is the file name or directory of PDF files to process
- Value 2 is the directory to write out the images
- Value 3 is the image type (jpeg, tiff, png). Default is png
Parameters:
args - the expected arguments are described above
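The three positional values described for main could be read along the following lines. This is only a minimal sketch, not the parsing code that ExtractImages.main actually uses: the ExtractArgs class, its field names and the parse method are hypothetical, and treating the third value as optional is an assumption drawn from the "Default is png" note.

```java
/** Hypothetical holder for the three command-line values described above. */
final class ExtractArgs {
    /** Value 1: a PDF file name or a directory of PDF files. */
    public final String input;
    /** Value 2: the directory the images are written to. */
    public final String outputDir;
    /** Value 3: the image type; "png" when the value is omitted. */
    public final String imageType;

    private ExtractArgs(String input, String outputDir, String imageType) {
        this.input = input;
        this.outputDir = outputDir;
        this.imageType = imageType;
    }

    /**
     * Reads the positional values in the documented order.
     * Assumption: the third value may be left out, in which case png is used.
     */
    public static ExtractArgs parse(String[] args) {
        if (args == null || args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                "Expected: <PDF file or directory> <output directory> [image type]");
        }
        String imageType = (args.length == 3) ? args[2] : "png";
        return new ExtractArgs(args[0], args[1], imageType);
    }
}
```

For example, ExtractArgs.parse(new String[] {"in.pdf", "out"}) yields an imageType of "png", while a third value such as "tiff" is passed through unchanged.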

getImage
public BufferedImage getImage(int page, int imageNumber, boolean imageAsDisplayed) throws PdfException
Extracts any image from any page. Processing the images on each page in turn is recommended because it is quicker.
Parameters:
page - logical page number (1 is first page)
imageNumber - image on page (0 is first image)
imageAsDisplayed - if true, return the image as displayed (with scaling/rotation); otherwise use the raw stored image (often but not always the same). Neither is clipped
Returns:
BufferedImage
Throws:
PdfException - if there is a problem extracting the image from the PDF file
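Because getImage returns a BufferedImage, the result can be written to disk with the JDK's javax.imageio.ImageIO. The sketch below shows one way this might be done. The ImageSaver class, the save method and the pageN_imgM file naming are hypothetical and are not taken from the library.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical helper that writes an extracted image to a directory. */
final class ImageSaver {

    /**
     * Writes the image as, for example, page1_img0.png inside dir.
     * The naming scheme is an illustration only.
     */
    public static File save(BufferedImage image, File dir, int page,
                            int imageNumber, String format) throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Cannot create directory: " + dir);
        }
        File out = new File(dir, "page" + page + "_img" + imageNumber + "." + format);
        // ImageIO.write returns false when no writer is registered for the format
        if (!ImageIO.write(image, format, out)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return out;
    }
}
```

With an opened ExtractImages instance named extract (an illustrative variable name), the call might look like ImageSaver.save(extract.getImage(page, i, true), outDir, page, i, "png").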

getImageCount
public int getImageCount(int page) throws PdfException
Returns the image count for the selected page.
Parameters:
page - logical page number
Returns:
int number of images (0 if no images)
Throws:
PdfException - if there is a problem opening the PDF file

getImageData
public org.jpedal.objects.PdfImageData getImageData(int page) throws PdfException
Throws:
PdfException

setPassword
public void setPassword(String password)
Sets the owner or user password to use when opening an encrypted PDF file.
Parameters:
password - the USER or OWNER password for the PDF file

getPageCount
public int getPageCount()
Number of pages in the PDF file (page numbering starts at 1).
Returns:
page count

openPDFFile
public boolean openPDFFile() throws PdfException
Opens the PDF file so it can be accessed. The return value needs to be checked, as it will be false if the file cannot be opened for any reason.
Returns:
true if successful
Throws:
PdfException - if there is a problem opening the file

closePDFfile
public void closePDFfile()
Ensures the PDF file is closed and all resources are released once it is no longer needed.
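Taken together, the instance methods suggest a simple lifecycle: open the file, check the result, walk the pages in turn, then close. The sketch below illustrates that flow against a hypothetical PdfImageSource interface. The interface only mirrors the method signatures documented on this page, with Exception standing in for PdfException, so that the example compiles without the JPedal jar. It is not part of the library, and ImageWalker and countAllImages are likewise illustrative names.

```java
import java.awt.image.BufferedImage;

/** Hypothetical stand-in mirroring the ExtractImages methods documented above. */
interface PdfImageSource {
    boolean openPDFFile() throws Exception;
    int getPageCount();
    int getImageCount(int page) throws Exception;
    BufferedImage getImage(int page, int imageNumber, boolean imageAsDisplayed) throws Exception;
    void closePDFfile();
}

final class ImageWalker {

    /**
     * Visits every image page by page and returns how many were retrieved,
     * or -1 if the file could not be opened.
     */
    public static int countAllImages(PdfImageSource source) throws Exception {
        // openPDFFile returns false on failure, so the result must be checked
        if (!source.openPDFFile()) {
            return -1;
        }
        try {
            int total = 0;
            // pages are numbered from 1, images on a page from 0
            for (int page = 1; page <= source.getPageCount(); page++) {
                int count = source.getImageCount(page);
                for (int i = 0; i < count; i++) {
                    BufferedImage image = source.getImage(page, i, true);
                    if (image != null) {
                        total++;
                    }
                }
            }
            return total;
        } finally {
            // release resources even if extraction throws part-way through
            source.closePDFfile();
        }
    }
}
```

With the real class, the same sequence of calls would be made directly on an ExtractImages instance created with one of the constructors above. Placing closePDFfile in a finally block is a design choice of this sketch, not something the documentation prescribes.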