OCR a single file or multiple files

HTTP REQUEST

POST

URL

https://cloud.dynamsoft.com/rest/ocr/v1/file

Request Parameters

Parameters Type If Mandatory Description
method string Yes Fixed value: recognize
x-api-key string Yes DDC API key. Pass your key in the x-api-key request header with every API request.
file_name string No (supply either file_name or param) Filename returned from the server. If ondup is set to Newcopy, this value might differ from the original filename.
language string No Refer to the language support page for details. Join two or more languages with a comma (,). If the value is empty, the service detects the language automatically. (Automatic detection is not recommended for Greek, Cyrillic, Arabic, and Thai.)
Example: eng,chs
output_format string No Default value is DOCX. Supported values: DOCX, XLSX, PPTX, ePub, HTML40, RTF2000, UCsv, XML, IOTPDF, UFormattedTxt
output_setting string No Settings for PDF and ePub output. The default pdf_version is 1.4, with MRC enabled. The parameter is sent in the POST form data.

{"pdf_version":"pdf1.4 or pdf/a-1a", "pdf_with_mrc": true}

{"epub": "simple or poem"}

Note: If conversion is too slow, turn off the MRC feature, i.e., set "pdf_with_mrc": false.

MRC: Mixed Raster Content compression creates compact PDF files from image-only source files that contain both text and color or grayscale pictures or backgrounds. It separates these elements and applies optimized compression to each of them.

redaction string No The parameter is sent in the POST form data.
This feature is available in the following output formats: word, excel, ppt, html, epub, and pdf

{"text_to_find":"sample", "match_mode": ["wholeWord","matchCase","fuzzyMatch"], "action": "highlight"}
action: "highlight", "strikeout", or "mark for redact"
output_file string No By default, the output file uses the same name as the input file, with a different file type.
param string Yes (for multiple files) Required when recognizing multiple files in one request. It is sent in the POST form data.

{"list":[{"file_name":"1.jpg"},{"file_name":"2.jpg"}]}

Or

{"list":[{"file_name":"1.jpg", "output_file": "1.pdf"},{"file_name":"2.jpg", "output_file": "2.pdf"}]}

Or

{"list":[{"file_name":"1.jpg"},{"file_name":"2.jpg"}], "output_file":"result.pdf"}

Note:
If param and output_file are both sent, param takes precedence.
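
The request parameters above can be assembled as in the following Python sketch, which builds the request without sending it. Several details are assumptions rather than documented behavior: the helper name build_recognize_request is hypothetical, language and output_format are placed in the query string alongside method, and the form data is sent URL-encoded (the service may expect multipart form data instead).

```python
import json
import urllib.parse
import urllib.request

OCR_URL = "https://cloud.dynamsoft.com/rest/ocr/v1/file"


def build_recognize_request(api_key, file_names, language=None,
                            output_format=None, output_file=None,
                            pdf_with_mrc=None, redaction=None):
    """Build (but do not send) a POST request for method=recognize.

    Hypothetical helper: parameter placement is an assumption.
    """
    # Assumption: scalar parameters travel in the query string,
    # as method does in the sample request.
    query = {"method": "recognize"}
    if language:
        query["language"] = language
    if output_format:
        query["output_format"] = output_format

    form = {}
    if len(file_names) == 1:
        # Single file: identify it with file_name.
        query["file_name"] = file_names[0]
        if output_file:
            query["output_file"] = output_file
    else:
        # Multiple files: describe them in the param form field.
        param = {"list": [{"file_name": name} for name in file_names]}
        if output_file:
            # One output_file next to the list, as in the third param example.
            param["output_file"] = output_file
        form["param"] = json.dumps(param)

    # output_setting and redaction are sent in the POST form data.
    if pdf_with_mrc is not None:
        form["output_setting"] = json.dumps({"pdf_with_mrc": pdf_with_mrc})
    if redaction is not None:
        form["redaction"] = json.dumps(redaction)

    url = OCR_URL + "?" + urllib.parse.urlencode(query)
    # Assumption: URL-encoded form body.
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. For example, build_recognize_request(key, ["1.jpg", "2.jpg"], output_file="result.pdf", pdf_with_mrc=False) produces a param field matching the third example above, with MRC turned off.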

Response Parameters

Parameters Type UrlEncode Description
input string Yes The filename of input file, or a list of filenames
size int32 No The size of the OCR output file
ctime int32 No The creation time of the OCR output file
mtime int32 No The modification time of the OCR output file
md5 string No The MD5 hash of the OCR output file
output string Yes The filename of the OCR output file
error_code int32 No Error code
error_msg string No Error message
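
A result entry with the fields above could be handled as in this Python sketch. It rests on two assumptions: an error_code of 0 means success, and "UrlEncode: Yes" means the input and output values arrive percent-encoded. The names OcrError and decode_result are hypothetical and not part of the service.

```python
import urllib.parse


class OcrError(Exception):
    """Raised when a result entry reports a non-zero error_code."""


def decode_result(entry):
    """Return a copy of one result entry with URL-encoded fields decoded."""
    # Assumption: error_code 0 indicates success.
    if entry.get("error_code", 0) != 0:
        raise OcrError(f"{entry.get('error_code')}: {entry.get('error_msg')}")

    decoded = dict(entry)
    # input may be a single filename or a list of filenames.
    raw_input = entry.get("input", [])
    names = [raw_input] if isinstance(raw_input, str) else raw_input
    # Assumption: input and output are percent-encoded.
    decoded["input"] = [urllib.parse.unquote(name) for name in names]
    decoded["output"] = urllib.parse.unquote(entry.get("output", ""))
    return decoded
```

Checking error_code on each entry before reading size or md5 avoids treating a failed conversion as a finished file.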

Sample Code
Request:

x-api-key: z81QwIA3kg0tUXkueP+QN4qXFCP0MIQIqDn94uxoVPdcnnGArr366w== 
POST https://cloud.dynamsoft.com/rest/ocr/v1/file?method=recognize&file_name=f1.JPG

Response:

{
	[
		{
			"input": ["f1.jpg"],
			"output": "f1.pdf",
			"size": 372121,
			"ctime": 1234567890,
			"mtime": 1234567890,
			"md5": "cb123afcc12453543ef",
			"error_code": 0,
			"error_msg": "Success."
		}
	],
	"request_id": 3043312669,
	"error_code": 0,
	"error_msg": "Success."
}