Foxit PDF SDK
com.foxit.sdk.addon.ocr.OCR Class Reference
Inherits from com.foxit.sdk.common.Base.

Public Member Functions

 OCR () throws com.foxit.sdk.PDFException
 Constructor.
 
 OCR (OCR other)
 Constructor, with another OCR object.

synchronized void delete ()
 Clean up related resources immediately.

OCRSuspectInfoArray getOCRSuspectsInfo (PDFDoc ocred_pdf_doc) throws com.foxit.sdk.PDFException
 Get OCR suspect information.

boolean isEmpty ()
 Check whether current object is empty or not.

void oCRConvertTo (int format, String src_pdf_path, String password, String saved_file_path, Range page_range, boolean is_retain_flowing_text) throws com.foxit.sdk.PDFException
 OCR the PDF document and convert it to a specified format document.

void oCRConvertTo (int format, String src_pdf_path, String password, String saved_file_path, Range page_range, boolean is_retain_flowing_text, OCRConfig config) throws com.foxit.sdk.PDFException
 OCR the PDF document and convert it to a specified format document.

void oCRPDFDocument (PDFDoc pdf_doc, boolean is_editable) throws com.foxit.sdk.PDFException
 OCR each page of a PDF document.

void oCRPDFDocument (PDFDoc pdf_doc, boolean is_editable, OCRConfig config) throws com.foxit.sdk.PDFException
 OCR each page of a PDF document.

void oCRPDFDocuments (OCRSettingDataArray settingdata_array) throws com.foxit.sdk.PDFException
 OCR multiple pages of multiple PDF documents.

void oCRPDFPage (PDFPage pdf_page, boolean is_editable) throws com.foxit.sdk.PDFException
 OCR a PDF page.

void oCRPDFPage (PDFPage pdf_page, boolean is_editable, OCRConfig config) throws com.foxit.sdk.PDFException
 OCR a PDF page.

Public Member Functions inherited from com.foxit.sdk.common.Base
synchronized void delete ()
 Clean up related resources immediately.
 

Static Public Attributes

static final int e_OCRConvertFormatDOC = 1
 OCR convert format: DOC.
 
static final int e_OCRConvertFormatDOCX = 0
 OCR convert format: DOCX.
 
static final int e_OCRConvertFormatHTML = 6
 OCR convert format: HTML.
 
static final int e_OCRConvertFormatPPTX = 5
 OCR convert format: PPTX.
 
static final int e_OCRConvertFormatRTF = 2
 OCR convert format: RTF.
 
static final int e_OCRConvertFormatXLS = 4
 OCR convert format: XLS.
 
static final int e_OCRConvertFormatXLSX = 3
 OCR convert format: XLSX.
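
The integer values of the format constants above are not in alphabetical or numeric order in the listing, so a small lookup can help when building an output file name. The following is a minimal sketch, not part of the SDK: the class OcrFormatExtensions, its methods, and the file extension chosen for each format are assumptions made for illustration. The constant values are mirrored locally from the list above so that the sketch compiles without the SDK.

```java
/** Illustrative helper; hypothetical, not part of the Foxit PDF SDK. */
final class OcrFormatExtensions {
    // Values mirrored from the e_OCRConvertFormat* constants listed above,
    // so that this sketch compiles without the SDK on the classpath.
    static final int FORMAT_DOCX = 0;
    static final int FORMAT_DOC = 1;
    static final int FORMAT_RTF = 2;
    static final int FORMAT_XLSX = 3;
    static final int FORMAT_XLS = 4;
    static final int FORMAT_PPTX = 5;
    static final int FORMAT_HTML = 6;

    /** Returns the assumed file extension (without a dot) for a format value. */
    static String extensionFor(int format) {
        switch (format) {
            case FORMAT_DOCX: return "docx";
            case FORMAT_DOC:  return "doc";
            case FORMAT_RTF:  return "rtf";
            case FORMAT_XLSX: return "xlsx";
            case FORMAT_XLS:  return "xls";
            case FORMAT_PPTX: return "pptx";
            case FORMAT_HTML: return "html";
            default:
                throw new IllegalArgumentException("Unknown OCR convert format: " + format);
        }
    }

    /** Builds an output file name from a base name and a format value. */
    static String outputPath(String baseName, int format) {
        return baseName + "." + extensionFor(format);
    }

    public static void main(String[] args) {
        System.out.println(outputPath("report", FORMAT_DOCX)); // prints "report.docx"
        System.out.println(outputPath("slides", FORMAT_PPTX)); // prints "slides.pptx"
    }
}
```

With the SDK available, a path built this way could be passed as saved_file_path to oCRConvertTo, together with the matching format constant.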
 

Detailed Description

This class performs OCR on a PDF page or a PDF document. The OCR engine must be initialized before this class is used.

See also
OCREngine

Constructor & Destructor Documentation

◆ OCR()

com.foxit.sdk.addon.ocr.OCR.OCR ( OCR  other)

Constructs an OCR object from another OCR object.

Parameters
[in] other: Another OCR object.

Member Function Documentation

◆ delete()

synchronized void com.foxit.sdk.addon.ocr.OCR.delete ( )

Clean up related resources immediately.

Returns
None.
Note
Once this function is called, the current object can no longer be used.

◆ getOCRSuspectsInfo()

OCRSuspectInfoArray com.foxit.sdk.addon.ocr.OCR.getOCRSuspectsInfo ( PDFDoc  ocred_pdf_doc) throws com.foxit.sdk.PDFException

Get OCR suspect information.

The parameter ocred_pdf_doc must be a valid PDF document that has already been processed by OCR.

Parameters
[in] ocred_pdf_doc: A valid PDF document object.
Returns
An array of OCRSuspectInfo objects. An empty array means that the OCR of the document produced no suspect information.

◆ isEmpty()

boolean com.foxit.sdk.addon.ocr.OCR.isEmpty ( )

Check whether current object is empty or not.

An empty object cannot be used.

Returns
true if the current object is empty, false otherwise.

◆ oCRConvertTo() [1/2]

void com.foxit.sdk.addon.ocr.OCR.oCRConvertTo ( int  format,
String  src_pdf_path,
String  password,
String  saved_file_path,
Range  page_range,
boolean  is_retain_flowing_text 
) throws com.foxit.sdk.PDFException

OCR a PDF document and convert the result to a document of the specified format.

Parameters
[in] format: The format to convert to. This should be one of the values starting from com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatDOCX.
[in] src_pdf_path: The source PDF file path. This should not be an empty string.
[in] password: The password of the source PDF file. If the PDF file is not encrypted, this should be an empty string.
[in] saved_file_path: The path of the file to save. This should not be an empty string.
[in] page_range: The range of pages to convert. An empty range means that every page of the PDF document is converted.
[in] is_retain_flowing_text: true means the generated document retains flowing text; the text may be reformatted, and page breaks are not guaranteed to be retained. false means the generated document retains the original page layout.
This parameter only takes effect for the following format types:
com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatRTF, com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatDOC, com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatDOCX.
Default value: true.
Returns
None.
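
The parameter descriptions above impose a few constraints that can be checked before the conversion is started: both paths must be non-empty, format must be one of the listed values, and is_retain_flowing_text only matters for RTF, DOC and DOCX. The following is a minimal sketch of such a pre-check. The class OcrConvertArgs and its methods are hypothetical names introduced for illustration and are not part of the SDK; the constant values are mirrored locally from the class's static attributes. Whether the SDK itself rejects such arguments, and with which error, is not stated in this reference.

```java
/** Illustrative argument pre-check; hypothetical, not part of the Foxit PDF SDK. */
final class OcrConvertArgs {
    // Values mirrored from the documented e_OCRConvertFormat* constants.
    static final int FORMAT_DOCX = 0;
    static final int FORMAT_DOC = 1;
    static final int FORMAT_RTF = 2;
    static final int FORMAT_HTML = 6; // highest documented value

    /** True if the value is one of the documented format constants (0 to 6). */
    static boolean isKnownFormat(int format) {
        return format >= FORMAT_DOCX && format <= FORMAT_HTML;
    }

    /** True if is_retain_flowing_text has an effect for this format (RTF, DOC, DOCX). */
    static boolean usesFlowingTextFlag(int format) {
        return format == FORMAT_RTF || format == FORMAT_DOC || format == FORMAT_DOCX;
    }

    /** Throws IllegalArgumentException if an argument violates a documented constraint. */
    static void check(int format, String srcPdfPath, String savedFilePath) {
        if (!isKnownFormat(format)) {
            throw new IllegalArgumentException("Unknown OCR convert format: " + format);
        }
        if (srcPdfPath == null || srcPdfPath.isEmpty()) {
            throw new IllegalArgumentException("src_pdf_path must not be empty");
        }
        if (savedFilePath == null || savedFilePath.isEmpty()) {
            throw new IllegalArgumentException("saved_file_path must not be empty");
        }
    }

    public static void main(String[] args) {
        check(FORMAT_DOCX, "in.pdf", "out.docx"); // passes silently
        System.out.println(usesFlowingTextFlag(FORMAT_RTF));  // prints "true"
        System.out.println(usesFlowingTextFlag(FORMAT_HTML)); // prints "false"
    }
}
```

A caller could run check(...) first and then invoke oCRConvertTo with the same format and paths, so that argument mistakes surface before any OCR work begins.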

◆ oCRConvertTo() [2/2]

void com.foxit.sdk.addon.ocr.OCR.oCRConvertTo ( int  format,
String  src_pdf_path,
String  password,
String  saved_file_path,
Range  page_range,
boolean  is_retain_flowing_text,
OCRConfig  config 
) throws com.foxit.sdk.PDFException

OCR a PDF document and convert the result to a document of the specified format.

Parameters
[in] format: The format to convert to. This should be one of the values starting from com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatDOCX.
[in] src_pdf_path: The source PDF file path. This should not be an empty string.
[in] password: The password of the source PDF file. If the PDF file is not encrypted, this should be an empty string.
[in] saved_file_path: The path of the file to save. This should not be an empty string.
[in] page_range: The range of pages to convert. An empty range means that every page of the PDF document is converted.
[in] is_retain_flowing_text: true means the generated document retains flowing text; the text may be reformatted, and page breaks are not guaranteed to be retained. false means the generated document retains the original page layout.
This parameter only takes effect for the following format types:
com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatRTF, com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatDOC, com.foxit.sdk.addon.ocr.OCR.e_OCRConvertFormatDOCX.
Default value: true.
[in] config: The OCRConfig object.
Returns
None.

◆ oCRPDFDocument() [1/2]

void com.foxit.sdk.addon.ocr.OCR.oCRPDFDocument ( PDFDoc  pdf_doc,
boolean  is_editable 
) throws com.foxit.sdk.PDFException

OCR each page of a PDF document.

After this function succeeds, the PDF page content may be changed. Parse or re-parse the pages of the input PDF document before using them.

Parameters
[in] pdf_doc: A valid PDF document object.
[in] is_editable: true means the OCR result is editable. false means the OCR result can be searched but not edited.
Returns
None.

◆ oCRPDFDocument() [2/2]

void com.foxit.sdk.addon.ocr.OCR.oCRPDFDocument ( PDFDoc  pdf_doc,
boolean  is_editable,
OCRConfig  config 
) throws com.foxit.sdk.PDFException

OCR each page of a PDF document.

After this function succeeds, the PDF page content may be changed. Parse or re-parse the pages of the input PDF document before using them.

Parameters
[in] pdf_doc: A valid PDF document object.
[in] is_editable: true means the OCR result is editable. false means the OCR result can be searched but not edited.
[in] config: The OCRConfig object.
Returns
None.

◆ oCRPDFDocuments()

void com.foxit.sdk.addon.ocr.OCR.oCRPDFDocuments ( OCRSettingDataArray  settingdata_array) throws com.foxit.sdk.PDFException

OCR multiple pages of multiple PDF documents.

This function batch processes multiple documents or pages. Documents and page ranges are set via OCRSettingDataArray. When dealing with a large number of documents or pages, this function performs better than calling OCR.oCRPDFDocument or OCR.oCRPDFPage multiple times. After this function succeeds, the page content may be changed, so parse or re-parse the PDF pages before using them.

Parameters
[in] settingdata_array: An array of OCRSettingData objects. If the page_range of an OCRSettingData object is empty, every page of that PDF document is OCRed.
Returns
None.

◆ oCRPDFPage() [1/2]

void com.foxit.sdk.addon.ocr.OCR.oCRPDFPage ( PDFPage  pdf_page,
boolean  is_editable 
) throws com.foxit.sdk.PDFException

OCR a PDF page.

After this function succeeds, the PDF page content may be changed, so re-parsing the input PDF page is recommended.

Parameters
[in] pdf_page: A valid PDF page object. This PDF page should have been parsed.
[in] is_editable: true means the OCR result is editable. false means the OCR result can be searched but not edited.
Returns
None.

◆ oCRPDFPage() [2/2]

void com.foxit.sdk.addon.ocr.OCR.oCRPDFPage ( PDFPage  pdf_page,
boolean  is_editable,
OCRConfig  config 
) throws com.foxit.sdk.PDFException

OCR a PDF page.

After this function succeeds, the PDF page content may be changed, so re-parsing the input PDF page is recommended.

Parameters
[in] pdf_page: A valid PDF page object. This PDF page should have been parsed.
[in] is_editable: true means the OCR result is editable. false means the OCR result can be searched but not edited.
[in] config: The OCRConfig object.
Returns
None.