Class ConvertPagesToHiResImages


  • public final class ConvertPagesToHiResImages
    extends Object

    Image Extraction from PDF files


    This class provides a simple Java API to convert the pages of a PDF file into images. It also provides a static convenience method for writing out every page of a PDF file, or of every PDF file in a directory, as an image.

    Example 1 - access API methods

    
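     import java.awt.image.BufferedImage;
     import java.util.HashMap;

     ConvertPagesToHiResImages extract=new ConvertPagesToHiResImages("C:/pdfs/mypdf.pdf");
     //extract.setPassword("password"); //only needed for encrypted files
     HashMap options=new HashMap(); //see https://javadoc.idrsolutions.com/org/jpedal/constants/JPedalSettings.html
     boolean isBackgroundTransparent=false; //set to true for a transparent page background
     if (extract.openPDFFile()) { //false if the file could not be opened
         int pageCount=extract.getPageCount();
         for (int page=1; page<=pageCount; page++) { //pages are numbered from 1
             BufferedImage image=extract.getPageAsHiResImage(page, isBackgroundTransparent, options);
             //use or save the image here
         }
     }

     extract.closePDFfile(); //release all resources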
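
Example 1 produces a BufferedImage for each page but does not show what to do with it. As a minimal sketch, one option is to write each image to disk with the standard javax.imageio API. The class PageImageWriter, its methods and the "<prefix>_page<N>.<format>" naming scheme are hypothetical helpers for illustration and are not part of JPedal.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of JPedal: saves one page image per file.
final class PageImageWriter {

    // Builds "<prefix>_page<N>.<format>" inside outputDir (naming scheme is an assumption).
    static File targetFile(File outputDir, String prefix, int page, String format) {
        return new File(outputDir, prefix + "_page" + page + "." + format);
    }

    // Writes the image with ImageIO and returns the file that was written.
    static File savePage(BufferedImage image, File outputDir, String prefix,
                         int page, String format) throws IOException {
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create directory " + outputDir);
        }
        File target = targetFile(outputDir, prefix, page, format);
        // ImageIO.write returns false when no suitable writer is available
        if (!ImageIO.write(image, format, target)) {
            throw new IOException("No ImageIO writer could handle format " + format);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image returned by getPageAsHiResImage in Example 1
        BufferedImage blank = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        File dir = new File(System.getProperty("java.io.tmpdir"), "hires-demo");
        System.out.println("Wrote " + savePage(blank, dir, "mypdf", 1, "png"));
    }
}
```

Inside the loop of Example 1, the call would look like PageImageWriter.savePage(image, new File("images"), "mypdf", page, "png"). When a transparent background is requested, a format with an alpha channel such as png is the safer choice, since some ImageIO writers may reject images that carry transparency.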
     

    Example 2 - convenience static method

    Convert all pages as images with options to create higher resolution output

    
     ConvertPagesToHiResImages.writeAllPagesAsHiResImagesToDir("pdfs", "images", "png", options);
     

    Example 3 - Access directly from the Jar

    ConvertPagesToHiResImages can be run directly from the jar with the command below. It converts all pages of a PDF file, or of every PDF file in a directory, to images in a defined output directory:

    java -cp libraries_needed org/jpedal/examples/images/ConvertPagesToHiResImages inputValues

    Where inputValues consists of three values:
    • First value: the PDF filename (including the path if needed) or a directory containing PDF files. If it contains spaces, it must be enclosed in double quotes (e.g. "C:/Path with spaces/").
    • Second value: the directory to write the page images to. If it contains spaces, it must be enclosed in double quotes (e.g. "C:/Path with spaces/").
    • Third value: the required output image type. Options are tiff, bmp, png and jpg; the default is png if nothing is specified.
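
The three values above can be sketched as a small argument parser. This is an illustration only and not the actual code of main: the class HiResArgs and its fields are hypothetical, and lower-casing the image type before checking it is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical holder for the three command-line values described above.
final class HiResArgs {

    // Image types listed in the documentation
    static final List<String> FORMATS = Arrays.asList("tiff", "bmp", "png", "jpg");

    final String input;      // PDF file or directory containing PDF files
    final String outputDir;  // directory the images are written to
    final String format;     // output image type

    private HiResArgs(String input, String outputDir, String format) {
        this.input = input;
        this.outputDir = outputDir;
        this.format = format;
    }

    static HiResArgs parse(String[] args) {
        if (args.length < 2 || args.length > 3) {
            throw new IllegalArgumentException(
                "Expected: <pdf file or directory> <output directory> [tiff|bmp|png|jpg]");
        }
        // The third value is optional; png is the documented default
        String format = args.length == 3 ? args[2].toLowerCase(Locale.ROOT) : "png";
        if (!FORMATS.contains(format)) {
            throw new IllegalArgumentException("Unsupported image type: " + args[2]);
        }
        return new HiResArgs(args[0], args[1], format);
    }

    public static void main(String[] args) {
        // The shell removes the double quotes, so a path with spaces arrives as one value
        HiResArgs parsed = parse(new String[] {"C:/Path with spaces/", "images"});
        System.out.println(parsed.input + " -> " + parsed.outputDir + " as " + parsed.format);
    }
}
```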

    There is another example (org.jpedal.examples.images.ConvertPagesToImages) for producing images of pages if the extra features are not needed.
    • Constructor Detail

      • ConvertPagesToHiResImages

        public ConvertPagesToHiResImages(String fileName)
        Sets up a ConvertPagesToHiResImages instance to open a PDF file
        Parameters:
        fileName - full path to a single PDF file
      • ConvertPagesToHiResImages

        public ConvertPagesToHiResImages(byte[] byteArray)
        Sets up a ConvertPagesToHiResImages instance to open a PDF file held in memory as a byte[]
        Parameters:
        byteArray - the contents of the PDF file as a byte array
    • Method Detail

      • main

        public static void main(String[] args)
        Command-line entry point; see Example 3 for the expected arguments
      • writeAllPagesAsHiResImagesToDir

        public static void writeAllPagesAsHiResImagesToDir(String inputDir,
                                                           String outputDir,
                                                           String format)
                                                    throws PdfException
        Static method to write out all pages of a PDF file, or of every PDF file in a directory, as images. Not for use alongside other image conversion methods in multi-threaded environments: this method uses some variables that may affect image conversion taking place on other threads.
        Parameters:
        inputDir - PDF file or directory of PDF files to convert
        outputDir - directory of output
        format - format of images
        Throws:
        PdfException - PdfException
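
Given the threading warning above, one way to keep conversions from overlapping is to route every image conversion in the application through a single shared lock. The sketch below is an assumption about how a caller might do this, not part of JPedal: ConversionGate and runExclusively are hypothetical names, and the gate only helps if all conversion code in the JVM goes through it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical guard, not part of JPedal: lets only one conversion task run at a time.
final class ConversionGate {

    private static final ReentrantLock LOCK = new ReentrantLock();

    // Runs the task while holding the shared lock; the lock is released even if the task throws.
    static <T> T runExclusively(Callable<T> task) throws Exception {
        LOCK.lock();
        try {
            return task.call();
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        // With JPedal on the classpath the task body would be something like:
        //   ConvertPagesToHiResImages.writeAllPagesAsHiResImagesToDir("pdfs", "images", "png");
        //   return null;
        String result = runExclusively(() -> "converted");
        System.out.println(result);
    }
}
```

A plain lock is used here rather than a single-threaded executor so that the calling thread still receives the result or the exception directly.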
      • setPassword

        public void setPassword(String password)
        Sets the owner or user password to use when opening an encrypted PDF file
        Parameters:
        password - the USER or OWNER password for the PDF file
      • getPageCount

        public int getPageCount()
        Returns the number of pages in the PDF file. Page numbers start at 1.
        Returns:
        page count
      • openPDFFile

        public boolean openPDFFile()
                            throws PdfException
        Opens the PDF file so that its pages can be accessed. Check the return value: it is false if the file cannot be opened for any reason.
        Returns:
        true if successful
        Throws:
        PdfException - if there is a problem opening the file
      • closePDFfile

        public void closePDFfile()
        Closes the PDF file and releases all resources. Call this once the file is no longer needed.