Addon.OCRPro.Settings

The Settings class represents an object that holds the configuration for the OCR. The following snippet shows how to use it:

var settings = Dynamsoft.WebTwain.Addon.OCRPro.NewSettings();
settings.Languages = "eng";
DWObject.Addon.OCRPro.Settings = settings;

The class has the following properties:

Settings Class:
RecognitionModule Specifies the module to be used for the OCR.

Module Description
MOSTACCURATE The most accurate module, but also the slowest.
FASTEST The fastest module, but not very accurate.
BALANCED This module maintains a balance between accuracy and performance.
AUTO The most suitable module will be automatically used (MOSTACCURATE, BALANCED or FASTEST).
Languages Specifies the language or languages for the OCR. The default value is English ("eng"). Multiple languages can be combined in one comma-separated string, for example "eng,arabic". The supported languages, given as short code / long name, are: ara / arabic; ces / czech; dan / danish; deu / german; ell / greek; eng / english; fra / french; fin / finnish; hun / hungar; ita / italian; nld / duntch; nor / norsk; por / port; pol / polish; rus / russian; swe / swedish; spa / spanish; tur / turkish.
OutputFormat Specifies the result format of the OCR.

Format Description
TXTS Standard text file.
TXTCSV CSV text file.
TXTF Formatted text file.
XML Simple XML file. The unit in this format is TWIP.
IOTPDF Image over text PDF file. This is the default value.
IOTPDF_MRC Image over text PDF with MRC technology.
PDFVersion Specifies the version of the PDF file when OutputFormat is set to either IOTPDF or IOTPDF_MRC. The allowed versions are 1.0 to 1.7; the default is 1.5.
PDFAVersion Specifies the PDF/A version of the PDF file when OutputFormat is set to either IOTPDF or IOTPDF_MRC. The allowed values are "pdf/a-1a", "pdf/a-1b", "pdf/a-2a", "pdf/a-2b", "pdf/a-2u", "pdf/a-3a", "pdf/a-3b" and "pdf/a-3u".
Redaction See the Redaction section below.
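
The property rules above can be sketched as a small helper. This is only an illustration under stated assumptions: joinLanguages and applyOcrOptions are hypothetical names, not part of the add-on, a plain object stands in for the result of NewSettings(), and only the short language codes from the list above are checked.

```javascript
// Short language codes taken from the Languages description above.
var SUPPORTED_LANGUAGES = ["ara", "ces", "dan", "deu", "ell", "eng", "fra", "fin", "hun",
                           "ita", "nld", "nor", "por", "pol", "rus", "swe", "spa", "tur"];
// Output formats for which PDFVersion is documented to apply.
var PDF_FORMATS = ["IOTPDF", "IOTPDF_MRC"];

// Hypothetical helper: joins language codes into the comma-separated string
// used by the Languages property, rejecting codes that are not in the list.
function joinLanguages(codes) {
  codes.forEach(function (code) {
    if (SUPPORTED_LANGUAGES.indexOf(code) === -1) {
      throw new Error("Unsupported OCR language: " + code);
    }
  });
  return codes.join(",");
}

// Hypothetical helper: copies options onto a settings object and sets
// PDFVersion only when the output format is one of the PDF formats.
function applyOcrOptions(settings, options) {
  settings.Languages = joinLanguages(options.languages);
  settings.OutputFormat = options.outputFormat;
  if (options.pdfVersion !== undefined) {
    if (PDF_FORMATS.indexOf(options.outputFormat) === -1) {
      throw new Error("PDFVersion only applies to IOTPDF or IOTPDF_MRC output");
    }
    settings.PDFVersion = options.pdfVersion;
  }
  return settings;
}

// A plain object stands in for Dynamsoft.WebTwain.Addon.OCRPro.NewSettings() here.
var configured = applyOcrOptions({}, {
  languages: ["eng", "ara"],
  outputFormat: "IOTPDF",
  pdfVersion: "1.5"
});
```

In a page where Dynamic Web TWAIN is loaded, the first argument would be the object returned by NewSettings(), and the result would then be assigned to DWObject.Addon.OCRPro.Settings as in the snippet at the top.
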
Redaction

The Redaction class is a sub-class of the Settings class and is accessed through the Redaction property of a settings object. It represents an object that is used to set up the redaction to be done during the OCR. Redaction cannot be exported in plain-text output; it only works when OutputFormat is set to TXTF, IOTPDF or IOTPDF_MRC.
The following snippet shows how to use it:

var settings = Dynamsoft.WebTwain.Addon.OCRPro.NewSettings();
settings.Languages = "eng";
settings.Redaction.FindText = "findText";
DWObject.Addon.OCRPro.Settings = settings;

The class has the following properties:

Redaction Class:
FindText Specifies the text to redact.
ReplaceText Specifies the text that replaces the found text. Only valid when FindTextAction is set to "MARKFORREDACT". This property is not yet supported.
FindTextFlags Specifies how the text is searched for. The allowed values are 1 (WHOLEWORD), 2 (MATCHCASE), 4 (FUZZYMATCH) and 8 (BACKWARD, not yet supported).
FindTextAction Specifies how to treat the found text. The allowed values are 0 (HIGHLIGHT, places a yellow background behind the found text), 1 (STRIKEOUT, strikes out the found text) and 2 (MARKFORREDACT, marks the found text for later redaction). NOTE: the values 0 and 1 are only valid when PDFVersion is set to 1.2 or higher.
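
The redaction properties above could be filled in along the following lines. This is a sketch, not the add-on's own code: isActionAllowed and configureRedaction are hypothetical helpers, a plain object stands in for settings.Redaction, and combining flags with bitwise OR is an assumption based on the flag values being powers of two.

```javascript
// Numeric values from the FindTextFlags and FindTextAction descriptions above.
var FLAG_WHOLEWORD = 1, FLAG_MATCHCASE = 2, FLAG_FUZZYMATCH = 4;
var ACTION_HIGHLIGHT = 0, ACTION_STRIKEOUT = 1, ACTION_MARKFORREDACT = 2;

// Hypothetical helper: checks the documented note that HIGHLIGHT (0) and
// STRIKEOUT (1) are only valid when PDFVersion is 1.2 or higher.
function isActionAllowed(action, pdfVersion) {
  if (action === ACTION_MARKFORREDACT) {
    return true;
  }
  return parseFloat(pdfVersion) >= 1.2;
}

// Hypothetical helper: fills in a Redaction-like object after that check.
function configureRedaction(redaction, text, flags, action, pdfVersion) {
  if (!isActionAllowed(action, pdfVersion)) {
    throw new Error("FindTextAction " + action + " needs PDFVersion >= 1.2");
  }
  redaction.FindText = text;
  redaction.FindTextFlags = flags;
  redaction.FindTextAction = action;
  return redaction;
}

// Assumption: flags are combined with bitwise OR (1 | 2 gives 3).
var flags = FLAG_WHOLEWORD | FLAG_MATCHCASE;

// A plain object stands in for settings.Redaction here.
var redaction = configureRedaction({}, "confidential", flags, ACTION_HIGHLIGHT, "1.5");
```

With a real settings object, the first argument would be settings.Redaction and the last one the value assigned to settings.PDFVersion.
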

When coding, you can use the following enums:

var EnumDWT_OCRProRecognitionModule = 
{
  OCRPM_AUTO: "AUTO",
  OCRPM_MOSTACCURATE: "MOSTACCURATE",
  OCRPM_BALANCED: "BALANCED",
  OCRPM_FASTEST: "FASTEST"
}
var EnumDWT_OCRProOutputFormat = 
{
  OCRPFT_TXTS: "TXTS",
  OCRPFT_TXTCSV: "TXTCSV",
  OCRPFT_TXTF: "TXTF",
  OCRPFT_XML: "XML",
  OCRPFT_IOTPDF: "IOTPDF",
  OCRPFT_IOTPDF_MRC: "IOTPDF_MRC"
}
var EnumDWT_OCRProPDFVersion = 
{
  OCRPPDFV_0: "1.0",
  OCRPPDFV_1: "1.1",
  OCRPPDFV_2: "1.2",
  OCRPPDFV_3: "1.3",
  OCRPPDFV_4: "1.4", 
  OCRPPDFV_5: "1.5",
  OCRPPDFV_6: "1.6",
  OCRPPDFV_7: "1.7"
}
var EnumDWT_OCRProPDFAVersion = 
{
  OCRPPDFAV_1A: "pdf/a-1a",
  OCRPPDFAV_1B: "pdf/a-1b",
  OCRPPDFAV_2A: "pdf/a-2a",
  OCRPPDFAV_2B: "pdf/a-2b",
  OCRPPDFAV_2U: "pdf/a-2u", 
  OCRPPDFAV_3A: "pdf/a-3a",
  OCRPPDFAV_3B: "pdf/a-3b",
  OCRPPDFAV_3U: "pdf/a-3u"
}
var EnumDWT_OCRFindTextFlags = 
{
  OCRFT_WHOLEWORD: 1,
  OCRFT_MATCHCASE: 2,
  OCRFT_FUZZYMATCH: 4
}
var EnumDWT_OCRFindTextAction = 
{
  OCRFT_HIGHLIGHT: 0,
  OCRFT_STRIKEOUT: 1,
  OCRFT_MARKFORREDACT: 2
}
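
As an illustration of using these enums instead of string literals, the sketch below applies a few members to a settings object. It makes some assumptions: applyPdfPreset is a hypothetical helper, the enum objects are cut-down local copies of the listing above so that the snippet runs on its own, and a plain object stands in for the result of NewSettings().

```javascript
// Cut-down local copies of enum members from the listing above.
var EnumDWT_OCRProRecognitionModule = { OCRPM_BALANCED: "BALANCED" };
var EnumDWT_OCRProOutputFormat = { OCRPFT_IOTPDF: "IOTPDF" };
var EnumDWT_OCRProPDFVersion = { OCRPPDFV_5: "1.5" };

// Hypothetical helper: fills in a settings object using enum members
// rather than hand-typed strings, which avoids typos in the values.
function applyPdfPreset(settings) {
  settings.RecognitionModule = EnumDWT_OCRProRecognitionModule.OCRPM_BALANCED;
  settings.OutputFormat = EnumDWT_OCRProOutputFormat.OCRPFT_IOTPDF;
  settings.PDFVersion = EnumDWT_OCRProPDFVersion.OCRPPDFV_5;
  return settings;
}

// A plain object stands in for Dynamsoft.WebTwain.Addon.OCRPro.NewSettings() here.
var preset = applyPdfPreset({});
```

With Dynamic Web TWAIN loaded, the object passed in would come from NewSettings() and the result would be assigned to DWObject.Addon.OCRPro.Settings.
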