OCR a single file or multiple files





Request Parameters

Parameter Type Mandatory Description
method string Yes Fixed value: recognize
x-api-key string Yes DDC API Key. When making any API request, pass your key as the value of an x-api-key parameter in the Header.
file_name string No (supply the file either via file_name or in param) Filename returned from the server. If ondup is set to Newcopy, this value may differ from the original name.
language string No Refer to the language support page for details. Separate two or more languages with a comma (,). If the value is empty, the service detects the language automatically. (Automatic detection of Greek, Cyrillic, Arabic, and Thai is not supported.)
Example: eng, chs
output_format string No Default value is DOCX. Supported values: DOCX, XLSX, PPTX, ePub, HTML40, RTF2000, UCsv, XML, IOTPDF, UFormattedTxt, Excel2000
output_setting string No Setting for PDF output. Default pdf_version is 1.4, without MRC. The parameter is sent in the POST form data.

{"pdf_version":"pdf1.4 or pdf/a-1a", "pdf_with_mrc": false}
{"epub": "simple or poem"}
{"pdf_converter": "PDF, PDFEdited, PDFImageOnText or PDFImageSubst"}

Note: If the result file is too large, turn on the MRC feature, i.e., "pdf_with_mrc": true.

MRC: Mixed Raster Content compression creates compact PDF files from image-only source files that contain both text and color or grayscale pictures or backgrounds. It separates these elements and applies optimized compression to each of them.

zones string No Zonal OCR. You can specify one or more zones.
Note: This parameter does not work if the output format is PDF.
page_range string No E.g., page_range: 6-8
redaction string No The parameter is sent in the POST form data.
This feature is available in the following output formats: word, excel, ppt, html, epub, and pdf

{"text_to_find":"sample", "match_mode": ["wholeWord","matchCase","fuzzyMatch"], "action": "highlight"}
action: "highlight" or "strikeout" or "mark for redact"
recognition_mode string No Supported values include AUTO (default), FASTEST, BALANCED, and MOSTACCURATE.
delete_source_file_if_ocr_success string No Supported values: true (default) and false.
output_file string No By default, the output file uses the same name as the input file, with a different file type.
param string Yes This parameter is required when recognizing multiple files in one request. It is sent in POST form data.

{"list":[{"file_name":"1.jpg"},{"file_name":"2.jpg"}]}


{"list":[{"file_name":"1.jpg", "output_file": "1.pdf"},{"file_name":"2.jpg", "output_file": "2.pdf"}]}


{"list":[{"file_name":"1.jpg"},{"file_name":"2.jpg"}], "output_file":"result.pdf"}

If param and output_file are both sent, param takes precedence.

When recognizing multiple files in one request, you can specify settings separately for each file. Supported parameters include: language, output_format, delete_source_file_if_ocr_success, recognition_mode, zones, output_setting, redaction, page_range.
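
The JSON-valued form fields described above (param, output_setting, redaction) could be assembled along the following lines. This is a minimal Python sketch, not an official client: the helper names build_param and build_redaction are hypothetical, and only the field names and allowed values come from the table. Whether top-level settings such as redaction can be combined with param in a single request is not stated here, so the final dictionary only illustrates the shape of each value.

```python
import json

# Allowed values copied from the redaction row of the parameter table.
REDACTION_ACTIONS = {"highlight", "strikeout", "mark for redact"}
MATCH_MODES = {"wholeWord", "matchCase", "fuzzyMatch"}


def build_param(files, output_file=None):
    """Serialize the `param` form field (hypothetical helper)."""
    for entry in files:
        if "file_name" not in entry:
            raise ValueError("every list entry needs a file_name")
    payload = {"list": [dict(entry) for entry in files]}
    if output_file is not None:
        # Top-level output_file, as in the third `param` example.
        payload["output_file"] = output_file
    return json.dumps(payload)


def build_redaction(text_to_find, action, match_mode=()):
    """Serialize the `redaction` form field (hypothetical helper)."""
    if action not in REDACTION_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    unknown = set(match_mode) - MATCH_MODES
    if unknown:
        raise ValueError(f"unsupported match_mode: {sorted(unknown)}")
    return json.dumps({
        "text_to_find": text_to_find,
        "match_mode": list(match_mode),
        "action": action,
    })


# Example values for each JSON-valued form field.
form_fields = {
    "param": build_param([
        {"file_name": "1.jpg", "output_file": "1.pdf", "language": "eng"},
        {"file_name": "2.jpg", "output_file": "2.pdf", "language": "chs"},
    ]),
    "output_setting": json.dumps({"pdf_version": "pdf1.4", "pdf_with_mrc": True}),
    "redaction": build_redaction("sample", "highlight", ["wholeWord"]),
}
```

Serializing with json.dumps rather than concatenating strings avoids malformed keys (for example, a stray space inside "file_name"), which the server would presumably not recognize.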

Response Parameters

Parameter    Type    UrlEncode  Description
input        string  Yes        Filename of the input file, or a list of filenames
size         int32   No         Size of the OCR output file
ctime        int32   No         Creation time of the OCR output file
mtime        int32   No         Modification time of the OCR output file
md5          string  No         MD5 hash of the OCR output file
output       string  Yes        Filename of the OCR output file
error_code   int32   No         Error code
error_msg    string  No         Error message
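
A client might handle these response fields roughly as follows. This Python sketch rests on several assumptions that are not confirmed here: that the response body has already been decoded into a dict, that an error_code of 0 means success (as in the sample on this page), and that "UrlEncode: Yes" means the value arrives percent-encoded. The names parse_ocr_response and OcrError are hypothetical.

```python
from urllib.parse import unquote


class OcrError(Exception):
    """Raised when the response carries a non-zero error_code."""


def parse_ocr_response(body):
    """Validate a decoded response dict and URL-decode its filenames."""
    code = body.get("error_code")
    if code != 0:
        # Assumption: 0 is the only success code; a missing code is an error.
        raise OcrError(f"{code}: {body.get('error_msg', '')}")

    raw_input = body["input"]
    # `input` may be a single filename or a list of filenames.
    names = raw_input if isinstance(raw_input, list) else [raw_input]

    return {
        "input": [unquote(name) for name in names],
        "output": unquote(body["output"]),
        "size": body.get("size"),
        "ctime": body.get("ctime"),
        "mtime": body.get("mtime"),
        "md5": body.get("md5"),
    }


result = parse_ocr_response({
    "input": ["my%20scan.jpg"],
    "output": "my%20scan.pdf",
    "size": 372121,
    "error_code": 0,
    "error_msg": "Success.",
})
```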

Sample Code
Request:

POST https://cloud.dynamsoft.com/rest/ocr/v1.1/file?method=recognize
x-api-key: z81QwIA3kg0tUXkueP+QN4qXFCP0MIQIqDn94uxoVPdcnnGArr366w==


Response:

{
    'input': ['f1.jpg'],
    'output': 'f1.pdf',
    'size': 372121,
    'ctime': 1234567890,
    'mtime': 1234567890,
    'md5': 'cb123afcc12453543ef',
    'error_code': 0,
    'error_msg': 'Success.'
}
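
The sample request above could be built in Python with the standard library along these lines. This is a sketch under stated assumptions rather than an official client: it assumes the form data is sent as application/x-www-form-urlencoded and that file_name travels in the form body, neither of which is confirmed on this page. The helper build_recognize_request is hypothetical, and "YOUR_API_KEY" is a placeholder. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the sample request on this page.
ENDPOINT = "https://cloud.dynamsoft.com/rest/ocr/v1.1/file"


def build_recognize_request(api_key, file_name, **options):
    """Build (but do not send) a recognize request (hypothetical helper)."""
    # `method` is the fixed-value parameter from the request table.
    query = urlencode({"method": "recognize"})
    # Assumption: optional parameters are URL-encoded form fields.
    form = {"file_name": file_name, **options}
    return Request(
        f"{ENDPOINT}?{query}",
        data=urlencode(form).encode("ascii"),
        headers={"x-api-key": api_key},  # API key goes in the header
        method="POST",
    )


request = build_recognize_request(
    "YOUR_API_KEY", "f1.jpg", output_format="DOCX", page_range="6-8"
)
```

Passing the resulting object to urllib.request.urlopen would send it; the response body could then be decoded and checked for a non-zero error_code.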