Package org.jpedal.examples.images
Class ExtractImages
- java.lang.Object
  - org.jpedal.examples.images.ExtractImages
public class ExtractImages extends Object
Image Extraction from PDF files
This class provides a simple Java API to extract images from a PDF file. It also offers static convenience methods if you just want to dump all the images from a PDF file or from a directory containing PDF files.
See our Support Pages for more info on Image Extraction.
-
Constructor Summary
ExtractImages(byte[] byteArray)
Sets up an ExtractImages instance to open a PDF file contained as a BLOB within a byte[] stream
ExtractImages(String fileName)
Sets up an ExtractImages instance to open a PDF file
-
Method Summary
void closePDFfile()
Ensures the PDF file is closed once it is no longer needed and all resources are released
BufferedImage getImage(int page, int imageNumber, boolean imageAsDisplayed)
Extracts any image from any page; processing the images on each page in turn is recommended because it is quicker
int getImageCount(int page)
Returns the image count for the selected page
int getPageCount()
Returns the number of pages in the PDF file (page numbering starts at 1)
static void main(String[] args)
Extracts images via the command line from a single PDF file or a directory of PDF files
boolean openPDFFile()
Opens the PDF file so it can be accessed; check the result, as it is false if the file cannot be opened for any reason
void setPassword(String password)
Sets the Owner or User password to use when opening an encrypted PDF file
static void writeAllImagesToDir(String inputDir, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs)
Convenience method to extract all the images in a directory of PDF files
static void writeAllImagesToDir(String inputDir, String password, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs)
Convenience method to extract all the images in a directory of PDF files
-
Constructor Detail
-
ExtractImages
public ExtractImages(String fileName)
Sets up an ExtractImages instance to open a PDF file.
Parameters:
fileName - full path to a single PDF file
-
ExtractImages
public ExtractImages(byte[] byteArray)
Sets up an ExtractImages instance to open a PDF file contained as a BLOB within a byte[] stream.
Parameters:
byteArray - PDF file data
-
Method Detail
-
writeAllImagesToDir
public static void writeAllImagesToDir(String inputDir, String password, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs) throws PdfException
Convenience method to extract all the images in a directory of PDF files.
Parameters:
inputDir - directory containing PDF files
password - password used to open PDF files
outputDir - directory for writing out images
imageType - 3 letter value for image format to be used
generateMetaData - if true, include an additional XML file with metadata on the image
outputPagesInSepDirs - if true, place images from each page in a separate sub-directory
Throws:
PdfException - if there is a problem processing the PDF files
-
writeAllImagesToDir
public static void writeAllImagesToDir(String inputDir, String outputDir, String imageType, boolean generateMetaData, boolean outputPagesInSepDirs) throws PdfException
Convenience method to extract all the images in a directory of PDF files.
Parameters:
inputDir - directory containing PDF files
outputDir - directory for writing out images
imageType - 3 letter value for image format to be used
generateMetaData - if true, include an additional XML file with metadata on the image
outputPagesInSepDirs - if true, place images from each page in a separate sub-directory
Throws:
PdfException - if there is a problem processing the PDF files
-
main
public static void main(String[] args)
Extracts images via the command line from a single PDF file or a directory of PDF files.
The example expects three parameters:
- Value 1 is the file name or directory of PDF files to process
- Value 2 is the directory to write out the images
- Value 3 is the image type (jpeg, tiff, png). Default is png
Parameters:
args - the expected arguments are described above
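The three command-line values could be interpreted along the following lines. This is a minimal sketch using only the JDK, not JPedal's own argument handling: the `ExtractArgs` class, its field names, and the `parse` method are hypothetical, and treating a missing third value as png is an assumption based on the documented default.

```java
import java.util.Locale;

/** Hypothetical holder for the three command-line values described above. */
final class ExtractArgs {
    final String input;      // value 1: PDF file or directory of PDF files
    final String outputDir;  // value 2: directory to write out the images
    final String imageType;  // value 3: jpeg, tiff or png

    private ExtractArgs(String input, String outputDir, String imageType) {
        this.input = input;
        this.outputDir = outputDir;
        this.imageType = imageType;
    }

    /** Parses the arguments, falling back to png when no image type is given (assumed behaviour). */
    static ExtractArgs parse(String[] args) {
        if (args == null || args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                    "Usage: <pdf file or directory> <output directory> [jpeg|tiff|png]");
        }
        String type = args.length == 3 ? args[2].toLowerCase(Locale.ROOT) : "png";
        if (!type.equals("jpeg") && !type.equals("tiff") && !type.equals("png")) {
            throw new IllegalArgumentException("Unsupported image type: " + type);
        }
        return new ExtractArgs(args[0], args[1], type);
    }
}
```

Validating the values before any PDF is opened means a mistyped image type fails immediately rather than partway through a directory of files.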
-
getImage
public BufferedImage getImage(int page, int imageNumber, boolean imageAsDisplayed) throws PdfException
Extracts any image from any page. Processing the images on each page in turn is recommended because it is quicker.
Parameters:
page - logical page number (1 is first page)
imageNumber - image on page (0 is first image)
imageAsDisplayed - if true, return the image as displayed (with scaling/rotation); otherwise use the raw stored image (often but not always the same). Neither is clipped
Returns:
BufferedImage
Throws:
PdfException - if there is a problem extracting the image from the PDF file
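The advice to process images one page at a time can be sketched as a nested loop: pages run from 1 to `getPageCount()`, and images on each page from 0 to `getImageCount(page) - 1`. `PageImageSource` is a hypothetical interface that mirrors the documented signatures so that the sketch compiles without JPedal on the classpath (it declares `Exception` where the documented methods declare `PdfException`). `ImageWalker` and `collectAll` are likewise illustrative names.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring the documented ExtractImages signatures. */
interface PageImageSource {
    int getPageCount();
    int getImageCount(int page) throws Exception;
    BufferedImage getImage(int page, int imageNumber, boolean imageAsDisplayed) throws Exception;
}

final class ImageWalker {
    /** Visits every image one page at a time and returns them in page order. */
    static List<BufferedImage> collectAll(PageImageSource source, boolean imageAsDisplayed)
            throws Exception {
        List<BufferedImage> images = new ArrayList<>();
        // Pages are numbered from 1; images on a page are numbered from 0.
        for (int page = 1; page <= source.getPageCount(); page++) {
            int count = source.getImageCount(page);
            for (int imageNumber = 0; imageNumber < count; imageNumber++) {
                images.add(source.getImage(page, imageNumber, imageAsDisplayed));
            }
        }
        return images;
    }
}
```

Collecting every image in a list keeps the sketch short. For large documents it is probably better to write or process each image inside the inner loop so that only one is held in memory at a time.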
-
getImageCount
public int getImageCount(int page) throws PdfException
Returns the image count for the selected page.
Parameters:
page - logical page number
Returns:
int number of images (0 if no images)
Throws:
PdfException - if there is a problem opening the PDF file
-
setPassword
public void setPassword(String password)
Sets the Owner or User password to use when opening an encrypted PDF file.
Parameters:
password - the USER or OWNER password for the PDF file
-
getPageCount
public int getPageCount()
Returns the number of pages in the PDF file (page numbering starts at 1).
Returns:
page count
-
openPDFFile
public boolean openPDFFile() throws PdfException
Opens the PDF file so it can be accessed. The return value needs to be checked, as it will be false if the file cannot be opened for any reason.
Returns:
true if successful
Throws:
PdfException - if there is a problem opening the file
-
closePDFfile
public void closePDFfile()
Ensures the PDF file is closed once it is no longer needed and all resources are released.
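Since `openPDFFile()` reports failure through its return value and `closePDFfile()` releases resources, a try/finally block is one way to make sure the close call always happens. The following is a minimal sketch: `PdfSession` is a hypothetical interface mirroring the two documented methods (with `Exception` in place of `PdfException`), `withOpenPdf` is an illustrative helper name, and calling close after a failed open is assumed to be harmless.

```java
/** Hypothetical stand-in mirroring the documented open/close methods. */
interface PdfSession {
    boolean openPDFFile() throws Exception;
    void closePDFfile();
}

final class PdfSessions {
    /**
     * Opens the session, runs the work only if opening succeeded,
     * and always closes the session afterwards.
     *
     * @return true if the file opened and the work ran
     */
    static boolean withOpenPdf(PdfSession session, Runnable work) throws Exception {
        try {
            if (!session.openPDFFile()) {
                return false; // file could not be opened, so there is nothing to process
            }
            work.run();
            return true;
        } finally {
            session.closePDFfile(); // runs on success, on a failed open, and on exceptions
        }
    }
}
```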