Class ExtractClippedImages


  • public class ExtractClippedImages
    extends Object

    Clipped Image Extraction from PDF files


    This class provides a simple Java API for extracting clipped images from a PDF file. It also provides a static convenience method for dumping all the images from a PDF file, or from a directory containing PDF files, at a set of sizes.

    See our support pages for more information on extracting images.
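
The methods documented on this page suggest a simple lifecycle: open the file and check the result, walk the pages (numbered from 1) and the images on each page (numbered from 0), then close the file. The following is a minimal sketch of that loop, not the library's own code. `ImageSource` is a hypothetical stand-in interface that only mirrors the documented signatures so the example compiles without the library, and it declares `Exception` where the documented methods declare `PdfException`. `ClippedImageWalk` and `extractAll` are likewise names invented for illustration.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

final class ClippedImageWalk {

    /**
     * Hypothetical stand-in mirroring the documented method signatures.
     * The documented methods throw PdfException; Exception is used here
     * so the sketch has no dependency on the library.
     */
    interface ImageSource {
        boolean openPDFFile() throws Exception;
        int getPageCount();
        int getImageCount(int page) throws Exception;
        BufferedImage getClippedImage(int page, int imageNumber) throws Exception;
        void closePDFfile();
    }

    /** Collects every image page by page; returns an empty list if the file does not open. */
    static List<BufferedImage> extractAll(ImageSource source) {
        List<BufferedImage> images = new ArrayList<>();
        try {
            // openPDFFile() reports failure through its return value, so check it.
            if (!source.openPDFFile()) {
                return images;
            }
            try {
                // Pages are numbered from 1, images on a page from 0.
                for (int page = 1; page <= source.getPageCount(); page++) {
                    int count = source.getImageCount(page);
                    for (int i = 0; i < count; i++) {
                        images.add(source.getClippedImage(page, i));
                    }
                }
            } finally {
                // Release resources once the file was opened successfully.
                source.closePDFfile();
            }
        } catch (Exception e) {
            throw new IllegalStateException("Image extraction failed", e);
        }
        return images;
    }
}
```

In real use, a small adapter implementing `ImageSource` could delegate each call to an `ExtractClippedImages` instance. Handling each page fully before moving to the next follows the recommendation given for `getClippedImage`.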
    • Constructor Detail

      • ExtractClippedImages

        public ExtractClippedImages(String fileName)
        Sets up an ExtractClippedImages instance to open a PDF file
        Parameters:
        fileName - full path to a single PDF file
      • ExtractClippedImages

        public ExtractClippedImages(byte[] byteArray)
        Sets up an ExtractClippedImages instance to open a PDF file held as a BLOB in a byte[] array
        Parameters:
        byteArray - contents of the PDF file
    • Method Detail

      • getClippedImage

        public BufferedImage getClippedImage(int page,
                                             int imageNumber)
                                      throws PdfException
        Extracts any image from any page. Processing the images on each page in turn is recommended, as it is quicker.
        Parameters:
        page - logical page number (1 is first page)
        imageNumber - image on page (0 is first image)
        Returns:
        BufferedImage
        Throws:
        PdfException - PdfException
      • writeAllClippedImagesToDirs

        public static void writeAllClippedImagesToDirs(String inputDir,
                                                       String outDir,
                                                       String imageType,
                                                       String[] subDirs)
                                                throws PdfException
        Convenience method to extract all the images from a directory of PDF files
        Parameters:
        inputDir - directory of input files
        outDir - directory of output files
        subDirs - sub-directories of files
        Throws:
        PdfException - PdfException
      • main

        public static void main(String[] args)
        Main routine, which checks for any files passed as arguments and runs the demo
        Parameters:
        args - arguments
      • getImageCount

        public int getImageCount(int page)
                          throws PdfException
        Returns the number of images on the selected page
        Parameters:
        page - logical page number (1 is first page)
        Returns:
        number of images (0 if there are none)
        Throws:
        PdfException - PdfException
      • setPassword

        public void setPassword(String password)
        Sets the owner or user password to use when opening an encrypted PDF file
        Parameters:
        password - the user or owner password for the PDF file
      • getPageCount

        public int getPageCount()
        Returns the number of pages in the PDF file (pages are numbered from 1)
        Returns:
        page count
      • openPDFFile

        public boolean openPDFFile()
                            throws PdfException
        Opens the PDF file so that it can be accessed. Check the return value: it is false if the file cannot be opened for any reason.
        Returns:
        true if successful
        Throws:
        PdfException - if there is a problem opening the file
      • closePDFfile

        public void closePDFfile()
        Closes the PDF file and releases all resources. Call it once the file is no longer needed.
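
`getClippedImage` returns each image in memory as a `BufferedImage`, so writing it to disk is left to the caller. The sketch below shows one way to do that with the standard `javax.imageio.ImageIO` class. The class name `ClippedImageSaver`, the file naming scheme, and the choice of PNG are assumptions made for illustration and are not part of the documented API.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import javax.imageio.ImageIO;

final class ClippedImageSaver {

    /** Hypothetical naming scheme: page is 1-based, imageNumber is 0-based. */
    static String fileNameFor(int page, int imageNumber) {
        return String.format(Locale.ROOT, "page%03d_image%03d.png", page, imageNumber);
    }

    /** Writes one image as PNG into outDir and returns the file that was written. */
    static File save(BufferedImage image, File outDir, int page, int imageNumber)
            throws IOException {
        // Create the output directory if it does not exist yet.
        if (!outDir.isDirectory() && !outDir.mkdirs()) {
            throw new IOException("Cannot create directory: " + outDir);
        }
        File target = new File(outDir, fileNameFor(page, imageNumber));
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(image, "png", target)) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }
}
```

Including both the page number and the image index in the file name keeps output from different pages from overwriting each other.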