Class ExtractOutline

java.lang.Object
org.jpedal.examples.BaseExample
org.jpedal.examples.text.ExtractOutline

public class ExtractOutline extends org.jpedal.examples.BaseExample

Outline Object Data Extraction from PDF files


This class provides a simple Java API to extract the data held in the Outline object (if present) of a PDF file as a Document object. It also provides a static convenience method for dumping all the outlines from a single PDF file or from a directory containing PDF files.

See our Support Pages for more information on Text Extraction.
  • Constructor Summary

    Constructors
    Constructor
    Description
    ExtractOutline(byte[] byteArray)
    Sets up an ExtractOutline instance to open a PDF file contained as a BLOB within a byte[] stream
    ExtractOutline(String fileName)
    Sets up an ExtractOutline instance to open a PDF File
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    decodeFile(String file_name)
    routine to decode a file
    int
    getPageCount()
    number of pages in PDF file (starting at 1)
    Document
    getPDFTextOutline()
    gets the Document Outline object (if present) as a Document structure
    static void
    main(String[] args)
    This class will allow you to extract any Outline data via command line from a single PDF file or a directory of PDF files.
    void
    setPassword(String password)
    Sets the USER or OWNER password for the PDF file
    static void
    writeAllOutlinesToDir(String input, String outputDir)
    Convenience method to write all the Outlines in a directory of PDF files
    static void
    writeAllOutlinesToDir(String input, String password, String outputDir)
    Convenience method to write all the Outlines in a directory of PDF files

    Methods inherited from class org.jpedal.examples.BaseExample

    closePDFfile, openPDFFile

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • ExtractOutline

      public ExtractOutline(String fileName)
      Sets up an ExtractOutline instance to open a PDF File
      Parameters:
      fileName - full path to a single PDF file
    • ExtractOutline

      public ExtractOutline(byte[] byteArray)
      Sets up an ExtractOutline instance to open a PDF file contained as a BLOB within a byte[] stream
      Parameters:
      byteArray - pdf file data
  • Method Details

    • decodeFile

      public void decodeFile(String file_name) throws PdfException
      routine to decode a file
      Throws:
      PdfException
    • main

      public static void main(String[] args)
      This class will allow you to extract any Outline data via command line from a single PDF file or a directory of PDF files.
      The example expects two arguments:
      • Value 1 is the file name or directory of PDF files to process
      • Value 2 is the directory to write the outline data to
      Parameters:
      args - The expected arguments are described above.
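      The "file name or directory" contract for the first argument can be illustrated with a small standalone sketch. This is not JPedal's implementation: the class PdfInputs and its methods are hypothetical names, and the rule of matching files by a case-insensitive ".pdf" suffix is an assumption made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper: expands "a file or a directory of PDF files" into a list of PDF paths.
final class PdfInputs {

    // Returns the PDF files named by input; an empty list if input is neither a PDF nor a directory.
    static List<Path> resolve(String input) {
        List<Path> pdfs = new ArrayList<>();
        Path path = Paths.get(input);
        if (Files.isRegularFile(path)) {
            if (isPdf(path)) {
                pdfs.add(path);
            }
        } else if (Files.isDirectory(path)) {
            // Only the top level of the directory is scanned in this sketch.
            try (Stream<Path> entries = Files.list(path)) {
                entries.filter(Files::isRegularFile)
                       .filter(PdfInputs::isPdf)
                       .sorted()
                       .forEach(pdfs::add);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return pdfs;
    }

    // Assumption: a file counts as a PDF if its name ends in ".pdf", ignoring case.
    static boolean isPdf(Path path) {
        return path.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Expected two arguments: <file or directory> <output directory>");
            return;
        }
        for (Path pdf : resolve(args[0])) {
            System.out.println("Would process " + pdf + " into " + args[1]);
        }
    }
}
```

      Each resolved path could then be handed to one of the documented entry points, such as writeAllOutlinesToDir(String input, String outputDir).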
    • writeAllOutlinesToDir

      public static void writeAllOutlinesToDir(String input, String outputDir) throws PdfException
      Convenience method to write all the Outlines in a directory of PDF files
      Parameters:
      input - directory containing PDF files
      outputDir - directory for writing out data
      Throws:
      PdfException - A Pdf Exception
    • writeAllOutlinesToDir

      public static void writeAllOutlinesToDir(String input, String password, String outputDir) throws PdfException
      Convenience method to write all the Outlines in a directory of PDF files
      Parameters:
      input - file or directory containing PDF files
      password - password to be used to open files
      outputDir - directory for writing out data
      Throws:
      PdfException - A Pdf Exception
    • getPDFTextOutline

      public Document getPDFTextOutline()
      gets the Document Outline object (if present) as a Document structure
      Returns:
      Document
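      Once the outline has been returned, it can be traversed like any other tree. The sketch below assumes the returned type is org.w3c.dom.Document, which this page does not state explicitly. The element name "entry" and the attribute name "title" are hypothetical stand-ins, since the actual structure of the outline Document is not described here. A small in-memory Document is built so that the example runs without a PDF file.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper that lists the elements of an outline Document, one indented line per element.
final class OutlineWalker {

    // Returns an empty list when no outline is present (null Document or no root element).
    static List<String> describe(Document outline) {
        List<String> lines = new ArrayList<>();
        if (outline != null && outline.getDocumentElement() != null) {
            walk(outline.getDocumentElement(), 0, lines);
        }
        return lines;
    }

    // Depth-first traversal; indentation reflects nesting depth.
    private static void walk(Element element, int depth, List<String> lines) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(element.getTagName());
        // "title" is an assumed attribute name used only for this sketch.
        String title = element.getAttribute("title");
        if (!title.isEmpty()) {
            line.append(": ").append(title);
        }
        lines.add(line.toString());

        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                walk((Element) child, depth + 1, lines);
            }
        }
    }

    // Stand-in for the Document that getPDFTextOutline() would return in real use.
    static Document sampleOutline() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("outline");
            Element chapter = doc.createElement("entry");
            chapter.setAttribute("title", "Chapter 1");
            Element section = doc.createElement("entry");
            section.setAttribute("title", "Section 1.1");
            chapter.appendChild(section);
            root.appendChild(chapter);
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No XML parser available", e);
        }
    }

    public static void main(String[] args) {
        for (String line : describe(sampleOutline())) {
            System.out.println(line);
        }
    }
}
```

      Because the outline is only returned "if present", the null check in describe guards against PDF files that have no outline.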
    • setPassword

      public void setPassword(String password)
      Sets the password used to open the PDF file
      Parameters:
      password - the USER or OWNER password for the PDF file
    • getPageCount

      public int getPageCount()
      number of pages in PDF file (starting at 1)
      Returns:
      page count