Foxit PDF SDK
com.foxit.sdk.addon.ocr.OCRConfig Class Reference

Public Member Functions

 OCRConfig ()
 Constructor.
 
 OCRConfig (boolean is_detect_pictures, boolean is_remove_noise, boolean is_correct_skew, boolean is_enable_text_extraction_mode)
 Constructor, with parameters.
 
synchronized void delete ()
 Clean up related resources immediately.
 
boolean getIs_correct_skew ()
 Get the flag that decides whether to enable skew correction.
 
boolean getIs_detect_pictures ()
 Get the flag that decides whether to detect pictures.
 
boolean getIs_enable_text_extraction_mode ()
 Get the flag that decides whether to enable text extraction mode.
 
boolean getIs_remove_noise ()
 Get the flag that decides whether to remove noise from the image of the PDF.
 
void set (boolean is_detect_pictures, boolean is_remove_noise, boolean is_correct_skew, boolean is_enable_text_extraction_mode)
 Set all four values at once.
 
void setIs_correct_skew (boolean value)
 Set the flag that decides whether to enable skew correction.
 
void setIs_detect_pictures (boolean value)
 Set the flag that decides whether to detect pictures.
 
void setIs_enable_text_extraction_mode (boolean value)
 Set the flag that decides whether to enable text extraction mode.
 
void setIs_remove_noise (boolean value)
 Set the flag that decides whether to remove noise from the image of the PDF.
 

Detailed Description

This class represents the configuration used for OCR.

Constructor & Destructor Documentation

◆ OCRConfig()

com.foxit.sdk.addon.ocr.OCRConfig.OCRConfig ( boolean  is_detect_pictures,
boolean  is_remove_noise,
boolean  is_correct_skew,
boolean  is_enable_text_extraction_mode 
)

Constructor, with parameters.

Parameters
[in]  is_detect_pictures  Decide whether to detect pictures.
[in]  is_remove_noise  Decide whether to remove noise from the image of the PDF.
[in]  is_correct_skew  Decide whether to enable skew correction.
[in]  is_enable_text_extraction_mode  Decide whether to enable text extraction mode.
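
Four positional booleans are easy to transpose, so it can help to name each flag before passing it to the constructor. The sketch below is only an illustration: `StubOCRConfig` is a hypothetical stand-in that mirrors the signatures documented on this page so the example runs without the SDK, and it assumes the no-argument constructor applies the documented default values.

```java
// Hypothetical stand-in mirroring the documented OCRConfig signatures, so the
// sketch runs without the SDK. Real code would use com.foxit.sdk.addon.ocr.OCRConfig.
class StubOCRConfig {
    private final boolean detectPictures;
    private final boolean removeNoise;
    private final boolean correctSkew;
    private final boolean textExtractionMode;

    // Assumption: the no-argument constructor applies the documented defaults
    // (detect pictures: true, remove noise: true, correct skew: true,
    // text extraction mode: false).
    StubOCRConfig() {
        this(true, true, true, false);
    }

    StubOCRConfig(boolean is_detect_pictures, boolean is_remove_noise,
                  boolean is_correct_skew, boolean is_enable_text_extraction_mode) {
        this.detectPictures = is_detect_pictures;
        this.removeNoise = is_remove_noise;
        this.correctSkew = is_correct_skew;
        this.textExtractionMode = is_enable_text_extraction_mode;
    }

    boolean getIs_detect_pictures() { return detectPictures; }
    boolean getIs_remove_noise() { return removeNoise; }
    boolean getIs_correct_skew() { return correctSkew; }
    boolean getIs_enable_text_extraction_mode() { return textExtractionMode; }
}

class OcrConstructorDemo {
    // Builds a config for a page where picture detection is not wanted.
    static StubOCRConfig textOnlyConfig() {
        // Named locals document the argument order at the call site.
        boolean detectPictures = false;
        boolean removeNoise = true;
        boolean correctSkew = true;
        boolean textExtractionMode = false;
        return new StubOCRConfig(detectPictures, removeNoise, correctSkew, textExtractionMode);
    }

    public static void main(String[] args) {
        StubOCRConfig config = textOnlyConfig();
        System.out.println("detect pictures: " + config.getIs_detect_pictures());
        System.out.println("remove noise: " + config.getIs_remove_noise());
        System.out.println("correct skew: " + config.getIs_correct_skew());
        System.out.println("text extraction mode: " + config.getIs_enable_text_extraction_mode());
    }
}
```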

Member Function Documentation

◆ delete()

synchronized void com.foxit.sdk.addon.ocr.OCRConfig.delete ( )

Clean up related resources immediately.

Returns
None.
Note
Once this function is called, the current object can no longer be used.
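
Because the object is unusable after delete(), one common pattern is to release it in a finally block once the last read has finished. The following is a minimal sketch with a hypothetical stand-in class, `DisposableStubConfig`. Its `isDeleted()` helper and the exception it throws on use after delete() are properties of the stub only, added to make misuse visible. This page does not say how the real class behaves in that case.

```java
// Hypothetical stand-in for OCRConfig; only the parts needed for the
// lifecycle illustration are modelled.
class DisposableStubConfig {
    private boolean deleted = false;
    private final boolean correctSkew = true;

    synchronized boolean getIs_correct_skew() {
        ensureUsable();
        return correctSkew;
    }

    // Mirrors the documented delete(): the object must not be used afterwards.
    synchronized void delete() {
        deleted = true;
    }

    // Stub-only helper, not part of the documented API.
    synchronized boolean isDeleted() {
        return deleted;
    }

    // Stub-only behaviour: fail loudly on use after delete().
    private void ensureUsable() {
        if (deleted) {
            throw new IllegalStateException("config used after delete()");
        }
    }
}

class OcrDeleteDemo {
    // Reads one setting and always releases the config, even if reading throws.
    static boolean readSkewSettingAndRelease(DisposableStubConfig config) {
        try {
            return config.getIs_correct_skew();
        } finally {
            config.delete();
        }
    }

    public static void main(String[] args) {
        DisposableStubConfig config = new DisposableStubConfig();
        boolean skew = readSkewSettingAndRelease(config);
        System.out.println("correct skew: " + skew);
        // From here on, config must not be touched again.
    }
}
```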

◆ getIs_correct_skew()

boolean com.foxit.sdk.addon.ocr.OCRConfig.getIs_correct_skew ( )

Get the flag that decides whether to enable skew correction.

Note
Skew can be corrected only for angles not greater than 20 degrees.
Returns
The flag that decides whether to enable skew correction. true means skew correction is enabled. false means skew correction is not enabled. Default value: true.

◆ getIs_detect_pictures()

boolean com.foxit.sdk.addon.ocr.OCRConfig.getIs_detect_pictures ( )

Get the flag that decides whether to detect pictures.

Returns
The flag that decides whether to detect pictures. true means pictures will be detected during the analysis process. false means pictures will not be detected, so picture content in the image of the PDF document might be interpreted as text. If you want to extract only text from the image, this option can be set to false. Default value: true.

◆ getIs_enable_text_extraction_mode()

boolean com.foxit.sdk.addon.ocr.OCRConfig.getIs_enable_text_extraction_mode ( )

Get the flag that decides whether to enable text extraction mode.

Usually, when some parts of the text are not found as a text block, such as text on a picture or handwriting, it is recommended to set this parameter to true. It is recommended to set this parameter to false when the complete text of a picture is recognized correctly, or when the sample contains images or patterns that may be mistaken for text. In short, this parameter enables the engine to recognize everything remotely close to letters as text. true means text extraction mode is enabled, while false means it is not enabled. Default value: false.

Returns
The flag that decides whether to enable text extraction mode.

◆ getIs_remove_noise()

boolean com.foxit.sdk.addon.ocr.OCRConfig.getIs_remove_noise ( )

Get the flag that decides whether to remove noise from the image of the PDF.

Returns
The flag that decides whether to remove noise from the image of the PDF. Noise removal can be useful if the image contains noise such as random black dots or speckles. If the strokes of the letters in the image are thin, this option should be set to false, otherwise it will affect the recognition of the text. true means noise in the image is blocked and will not be recognized as text during the OCR process. false means noise is not blocked. Default value: true.

◆ set()

void com.foxit.sdk.addon.ocr.OCRConfig.set ( boolean  is_detect_pictures,
boolean  is_remove_noise,
boolean  is_correct_skew,
boolean  is_enable_text_extraction_mode 
)

Set all four values at once.

Parameters
[in]  is_detect_pictures  Decide whether to detect pictures.
[in]  is_remove_noise  Decide whether to remove noise from the image of the PDF.
[in]  is_correct_skew  Decide whether to enable skew correction.
[in]  is_enable_text_extraction_mode  Decide whether to enable text extraction mode.
Returns
None.
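
set() takes its arguments in the same order as the parameterized constructor, which makes it convenient for reusing one config object with different settings. The sketch below uses a hypothetical stand-in, `MutableStubConfig`, and assumes that one set() call has the same effect as calling the four individual setters. That equivalence is an assumption of the stub, not something this page states.

```java
// Hypothetical stand-in mirroring the documented set(), setters and getters.
class MutableStubConfig {
    // Field initialisers follow the default values documented on this page.
    private boolean detectPictures = true;
    private boolean removeNoise = true;
    private boolean correctSkew = true;
    private boolean textExtractionMode = false;

    // Same argument order as the parameterized constructor.
    void set(boolean is_detect_pictures, boolean is_remove_noise,
             boolean is_correct_skew, boolean is_enable_text_extraction_mode) {
        setIs_detect_pictures(is_detect_pictures);
        setIs_remove_noise(is_remove_noise);
        setIs_correct_skew(is_correct_skew);
        setIs_enable_text_extraction_mode(is_enable_text_extraction_mode);
    }

    void setIs_detect_pictures(boolean value) { detectPictures = value; }
    void setIs_remove_noise(boolean value) { removeNoise = value; }
    void setIs_correct_skew(boolean value) { correctSkew = value; }
    void setIs_enable_text_extraction_mode(boolean value) { textExtractionMode = value; }

    boolean getIs_detect_pictures() { return detectPictures; }
    boolean getIs_remove_noise() { return removeNoise; }
    boolean getIs_correct_skew() { return correctSkew; }
    boolean getIs_enable_text_extraction_mode() { return textExtractionMode; }
}

class OcrSetDemo {
    public static void main(String[] args) {
        MutableStubConfig config = new MutableStubConfig();

        // Replace all four flags in one call.
        config.set(false, false, true, true);
        System.out.println("text extraction mode: "
                + config.getIs_enable_text_extraction_mode());

        // Adjust a single flag afterwards without touching the others.
        config.setIs_remove_noise(true);
        System.out.println("remove noise: " + config.getIs_remove_noise());
    }
}
```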

◆ setIs_correct_skew()

void com.foxit.sdk.addon.ocr.OCRConfig.setIs_correct_skew ( boolean  value)

Set the flag that decides whether to enable skew correction.

Note
Skew can be corrected only for angles not greater than 20 degrees.
Parameters
[in]  value  Decide whether to enable skew correction. true means to enable skew correction. false means not to enable skew correction. Default value: true.
Returns
None.
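
If the skew of a scan is already known from some other source, the 20-degree limit in the note above can be checked before relying on skew correction. This is a hedged sketch only: the helper name `isSkewCorrectable` is hypothetical, the SDK is not shown to expose a skew measurement, and the sketch assumes the limit applies to the magnitude of the angle in either direction.

```java
class SkewCheck {
    // Limit taken from the note above: angles not greater than 20 degrees.
    static final double MAX_CORRECTABLE_SKEW_DEGREES = 20.0;

    // Assumption: the limit applies to the absolute value of the angle.
    // NaN compares false and is therefore reported as not correctable.
    static boolean isSkewCorrectable(double skewDegrees) {
        return Math.abs(skewDegrees) <= MAX_CORRECTABLE_SKEW_DEGREES;
    }

    public static void main(String[] args) {
        double[] samples = {0.0, -12.5, 20.0, 20.1};
        for (double angle : samples) {
            System.out.println(angle + " degrees correctable: " + isSkewCorrectable(angle));
        }
    }
}
```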

◆ setIs_detect_pictures()

void com.foxit.sdk.addon.ocr.OCRConfig.setIs_detect_pictures ( boolean  value)

Set the flag that decides whether to detect pictures.

Parameters
[in]  value  Decide whether to detect pictures. true means pictures will be detected during the analysis process. false means pictures will not be detected, so picture content in the image of the PDF document might be interpreted as text. If you want to extract only text from the image, this option can be set to false. Default value: true.
Returns
None.

◆ setIs_enable_text_extraction_mode()

void com.foxit.sdk.addon.ocr.OCRConfig.setIs_enable_text_extraction_mode ( boolean  value)

Set the flag that decides whether to enable text extraction mode.

Usually, when some parts of the text are not found as a text block, such as text on a picture or handwriting, it is recommended to set this parameter to true. It is recommended to set this parameter to false when the complete text of a picture is recognized correctly, or when the sample contains images or patterns that may be mistaken for text. In short, this parameter enables the engine to recognize everything remotely close to letters as text. true means to enable text extraction mode, while false means not to enable it. Default value: false.

Parameters
[in]  value  Decide whether to enable text extraction mode.
Returns
None.

◆ setIs_remove_noise()

void com.foxit.sdk.addon.ocr.OCRConfig.setIs_remove_noise ( boolean  value)

Set the flag that decides whether to remove noise from the image of the PDF.

Parameters
[in]  value  Decide whether to remove noise from the image of the PDF. Noise removal can be useful if the image contains noise such as random black dots or speckles. If the strokes of the letters in the image are thin, this option should be set to false, otherwise it will affect the recognition of the text. true means noise in the image is blocked and will not be recognized as text during the OCR process. false means noise is not blocked. Default value: true.
Returns
None.
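
The guidance given for the individual flags can be collected into one small decision helper. The sketch below is an illustration under stated assumptions: `OcrFlags`, `OcrFlagChooser` and the three input traits are hypothetical names, not part of the SDK, and skew correction is simply left at its documented default. The resulting four values are in the order expected by the OCRConfig constructor and set().

```java
// Hypothetical value holder for the four flags, in constructor order.
class OcrFlags {
    final boolean detectPictures;
    final boolean removeNoise;
    final boolean correctSkew;
    final boolean textExtractionMode;

    OcrFlags(boolean detectPictures, boolean removeNoise,
             boolean correctSkew, boolean textExtractionMode) {
        this.detectPictures = detectPictures;
        this.removeNoise = removeNoise;
        this.correctSkew = correctSkew;
        this.textExtractionMode = textExtractionMode;
    }
}

class OcrFlagChooser {
    /**
     * Maps document traits to flag values, following the descriptions above.
     *
     * @param textOnly    only text should be extracted from the image
     * @param thinStrokes the letters in the image have thin strokes
     * @param missedText  text on pictures or handwriting is not being found
     */
    static OcrFlags choose(boolean textOnly, boolean thinStrokes, boolean missedText) {
        // Picture detection can be turned off when only text is wanted.
        boolean detectPictures = !textOnly;
        // Noise removal can hurt recognition of thin strokes, so turn it off then.
        boolean removeNoise = !thinStrokes;
        // Left at the documented default.
        boolean correctSkew = true;
        // Text extraction mode is recommended when text blocks are being missed.
        boolean textExtractionMode = missedText;
        return new OcrFlags(detectPictures, removeNoise, correctSkew, textExtractionMode);
    }

    public static void main(String[] args) {
        OcrFlags flags = choose(false, true, true);
        System.out.println("detect pictures: " + flags.detectPictures);
        System.out.println("remove noise: " + flags.removeNoise);
        System.out.println("correct skew: " + flags.correctSkew);
        System.out.println("text extraction mode: " + flags.textExtractionMode);
    }
}
```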