Class FindTextInRectangle

java.lang.Object
org.jpedal.examples.text.FindTextInRectangle

public class FindTextInRectangle extends Object

Find text in PDF files


This class provides a simple Java API for finding text in a PDF file, plus a static convenience method for searching a PDF file or a directory containing PDF files.

See our Support Pages for more information on Text Searching.
  • Constructor Summary

    Constructors
    Constructor
    Description
    FindTextInRectangle(byte[] byteArray)
    Sets up a FindTextInRectangle instance to open a PDF file held as a BLOB in a byte[]
    FindTextInRectangle(String fileName)
    Sets up a FindTextInRectangle instance to open a PDF file
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    closePDFfile()
    Ensures the PDF file is closed once it is no longer needed and all resources are released
    static List<float[]>
    findTextOnAllPages(String inputDir, String textToFind)
    Convenience method to find text in a PDF file
    float[]
    findTextOnPage(int page, int x1, int y1, int x2, int y2, String textToFind, int searchType)
    Returns the coordinates of the matches on the specified page. The origin of the coordinates is the bottom left-hand corner (on an unrotated page)
    float[]
    findTextOnPage(int page, String textToFind, int searchType)
    Returns the coordinates of the matches on the specified page. The origin of the coordinates is the bottom left-hand corner (on an unrotated page)
    int
    getPageCount()
    Number of pages in the PDF file (page numbering starts at 1)
    boolean
    openPDFFile()
    Opens the PDF file so that its content can be accessed
    void
    setPassword(String password)
    Sets the USER or OWNER password for the PDF file

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • FindTextInRectangle

      public FindTextInRectangle(String fileName)
      Sets up a FindTextInRectangle instance to open a PDF file
      Parameters:
      fileName - full path to a single PDF file
    • FindTextInRectangle

      public FindTextInRectangle(byte[] byteArray)
      Sets up a FindTextInRectangle instance to open a PDF file held as a BLOB in a byte[]
      Parameters:
      byteArray - array holding the PDF data (BLOB)
  • Method Details

    • findTextOnPage

      public float[] findTextOnPage(int page, String textToFind, int searchType) throws PdfException
      Returns the coordinates of the matches on the specified page. The origin of the coordinates is the bottom left-hand corner (on an unrotated page)
      Parameters:
      page - page number to check for results
      textToFind - text to look for
      searchType - A static int from org.jpedal.grouping.SearchType class
      Returns:
      float[] containing all coords for the page, or an empty array if no results are found
      [0]=result x1 coord
      [1]=result y1 coord
      [2]=result x2 coord
      [3]=result y2 coord
      [4]=-101 if the next text area is the remainder of this word on another line; any other value is ignored.
      Throws:
      PdfException - PdfException
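      The return layout above can be unpacked with a small helper. The following is a minimal sketch and not part of JPedal: it assumes that several matches are packed into the array as consecutive groups of five values in the documented order. The names SearchResultDecoder, Area and decode are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JPedal: unpacks the float[] returned by findTextOnPage.
final class SearchResultDecoder {

    /** Assumption: each result occupies five values (x1, y1, x2, y2, flag). */
    static final int STRIDE = 5;

    /** Documented flag value: the next area holds the remainder of this word. */
    static final float CONTINUES = -101f;

    /** One text area from the result array. */
    static final class Area {
        final float x1, y1, x2, y2;
        final boolean continuesOnNextArea;

        Area(float x1, float y1, float x2, float y2, boolean continuesOnNextArea) {
            this.x1 = x1;
            this.y1 = y1;
            this.x2 = x2;
            this.y2 = y2;
            this.continuesOnNextArea = continuesOnNextArea;
        }
    }

    /** Converts the flat array into a list of areas; null or empty input gives an empty list. */
    static List<Area> decode(float[] coords) {
        List<Area> areas = new ArrayList<>();
        if (coords == null) {
            return areas;
        }
        // An incomplete trailing group is skipped rather than read past the end
        for (int i = 0; i + STRIDE <= coords.length; i += STRIDE) {
            boolean continues = coords[i + 4] == CONTINUES;
            areas.add(new Area(coords[i], coords[i + 1], coords[i + 2], coords[i + 3], continues));
        }
        return areas;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for a real search result
        float[] sample = {72f, 700f, 120f, 688f, -101f, 72f, 686f, 95f, 674f, 0f};
        for (Area a : decode(sample)) {
            System.out.println(a.x1 + "," + a.y1 + " to " + a.x2 + "," + a.y2
                    + " continues=" + a.continuesOnNextArea);
        }
    }
}
```

      In real use the array passed to decode would come from findTextOnPage; the sample values here are invented for illustration.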
    • findTextOnPage

      public float[] findTextOnPage(int page, int x1, int y1, int x2, int y2, String textToFind, int searchType) throws PdfException
      Returns the coordinates of the matches found within the given rectangle on the specified page. The origin of the coordinates is the bottom left-hand corner (on an unrotated page)
      Parameters:
      page - page to search
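      x1 - first x coordinate of the search rectangle
      y1 - first y coordinate of the search rectangle
      x2 - second x coordinate of the search rectangle
      y2 - second y coordinate of the search rectangle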
      textToFind - text to look for
      searchType - A static int from org.jpedal.grouping.SearchType class
      Returns:
      float[] containing all coords for the page, or an empty array if no results are found
      [0]=result x1 coord
      [1]=result y1 coord
      [2]=result x2 coord
      [3]=result y2 coord
      [4]=-101 if the next text area is the remainder of this word on another line; any other value is ignored.
      Throws:
      PdfException - pdfException
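      Because the coordinates use a bottom-left origin, drawing a result with a toolkit whose origin is the top-left corner needs a vertical flip. The following is a minimal sketch of that conversion. It assumes the caller already knows the page height in the same units, and it ignores page rotation and any crop-box offset. PageCoords and toTopLeftRect are hypothetical names, not part of JPedal.

```java
// Hypothetical helper, not part of JPedal: converts a result rectangle from a
// bottom-left origin to a top-left origin.
final class PageCoords {

    /**
     * Returns {left, top, width, height} measured from the top-left corner.
     * pageHeight is supplied by the caller (an assumption of this sketch).
     */
    static float[] toTopLeftRect(float x1, float y1, float x2, float y2, float pageHeight) {
        float left = Math.min(x1, x2);
        // The larger y value is the upper edge when y grows upwards
        float top = pageHeight - Math.max(y1, y2);
        float width = Math.abs(x2 - x1);
        float height = Math.abs(y2 - y1);
        return new float[] {left, top, width, height};
    }

    public static void main(String[] args) {
        // Made-up result rectangle on a page assumed to be 792 units high
        float[] r = toTopLeftRect(72f, 700f, 120f, 688f, 792f);
        System.out.println("left=" + r[0] + " top=" + r[1] + " width=" + r[2] + " height=" + r[3]);
    }
}
```

      Taking the minimum and maximum of each pair means the helper does not depend on which corner the library reports first.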
    • findTextOnAllPages

      public static List<float[]> findTextOnAllPages(String inputDir, String textToFind) throws PdfException
      Convenience method to find text in a PDF file
      Parameters:
      inputDir - a PDF file
      textToFind - text to look for
      Returns:
      ArrayList containing a float[] of values for each page (the list index is the actual page number minus 1). The origin of the coords is the bottom left-hand corner (on an unrotated page). The values are organised in the following order:
      [0]=result x1 coord
      [1]=result y1 coord
      [2]=result x2 coord
      [3]=result y2 coord
      [4]=-101 if the next text area is the remainder of this word on another line; any other value is ignored.
      Throws:
      PdfException - PdfException
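      The per-page list can be summarised without touching the PDF again. The following is a minimal sketch under two assumptions: list index i holds the results for page i + 1, and each text area occupies five values as described above. AllPagesSummary and areasPerPage are hypothetical names, not part of JPedal.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JPedal: summarises the list returned by findTextOnAllPages.
final class AllPagesSummary {

    /** Assumption: each text area occupies five values (x1, y1, x2, y2, flag). */
    static final int STRIDE = 5;

    /**
     * Returns the number of text areas per page.
     * Assumption: index i of the list (and of the returned array) is page i + 1.
     */
    static int[] areasPerPage(List<float[]> results) {
        int[] counts = new int[results.size()];
        for (int i = 0; i < results.size(); i++) {
            float[] coords = results.get(i);
            // Treat a missing array the same as an empty one
            counts[i] = (coords == null) ? 0 : coords.length / STRIDE;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: two areas on page 1, none on page 2
        List<float[]> sample = Arrays.asList(new float[10], new float[0]);
        int[] counts = areasPerPage(sample);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("page " + (i + 1) + ": " + counts[i] + " area(s)");
        }
    }
}
```

      The count is of text areas rather than matches, since a match split across lines is reported as more than one area (flag -101).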
    • setPassword

      public void setPassword(String password)
      Parameters:
      password - the USER or OWNER password for the PDF file
    • getPageCount

      public int getPageCount()
      Number of pages in the PDF file (page numbering starts at 1)
      Returns:
      page count
    • openPDFFile

      public boolean openPDFFile() throws PdfException
      Opens the PDF file so that its content can be accessed
      Returns:
      true if successful
      Throws:
      PdfException - if problem with opening PDF file
    • closePDFfile

      public void closePDFfile()
      Ensures the PDF file is closed once it is no longer needed and all resources are released