OCR
Last updated: 2024-02-18 15:49:31

Feature Description

Based on Tencent Cloud's deep learning technology, the optical character recognition (OCR) feature recognizes text in images and converts it into editable text. It can be used for photo scanning, paper document digitization, ecommerce ad moderation, and similar scenarios to make information processing more efficient.
This API uses a synchronous GET request.

Billing Description

Calling the API successfully will incur OCR fees and COS read request fees as described in Content Recognition Fees and Request Fees respectively.
If the image files are stored in COS STANDARD_IA storage class, calling the API successfully will incur STANDARD_IA data retrieval fees.
Image processing is not supported for objects stored in the ARCHIVE or DEEP ARCHIVE storage classes. To process these objects, you need to restore them first as instructed in POST Object restore.

Restrictions

Supported image formats: PNG, JPG, JPEG, BMP, PDF.
Image size: The downloaded file cannot exceed 7 MB in size after being Base64-encoded.
Image resolution: A resolution above 600*800 px is recommended; otherwise, the recognition effect may be affected.
Calling the API requires a signature. For more information, see Request Signature.
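
The format and size restrictions above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names `base64_length` and `check_ocr_input` are hypothetical, and it assumes that 7 MB means 7 × 1024 × 1024 bytes and that the limit applies to the padded Base64 length of the file.

```python
import os
from typing import List

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".pdf"}
# Assumption: "7 MB" is interpreted as 7 * 1024 * 1024 bytes.
MAX_BASE64_BYTES = 7 * 1024 * 1024


def base64_length(raw_size: int) -> int:
    """Length in bytes of the padded Base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def check_ocr_input(object_key: str, raw_size: int) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    ext = os.path.splitext(object_key)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if base64_length(raw_size) > MAX_BASE64_BYTES:
        problems.append("file exceeds 7 MB after Base64 encoding")
    return problems
```

Because Base64 expands data by about one third, a raw file of roughly 5.25 MB already reaches the limit under these assumptions.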

Request

Sample request

GET /<ObjectKey>?ci-process=OCR&type=general&language-type=zh&ispdf=true&pdf-pagenumber=1&isword=false&enable-word-polygon=false HTTP/1.1
Host: <BucketName-APPID>.cos.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
Note:
Authorization: Auth String (for more information, see Request Signature).
When this feature is used by a sub-account, relevant permissions must be granted as instructed in Authorization Granularity Details.
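
The request line in the sample can be assembled from an object key and a set of options. The sketch below only builds the path and query string; it does not compute the signature, which is described in Request Signature. The function name `build_ocr_request_target` and its underscore-to-hyphen keyword convention are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode


def build_ocr_request_target(object_key: str, **options) -> str:
    """Build the request target (path plus query string) for an OCR GET request.

    Keyword names use underscores (language_type) and are converted to the
    hyphenated parameter names (language-type); booleans become true/false.
    """
    params = {"ci-process": "OCR"}
    for name, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[name.replace("_", "-")] = str(value)
    # Keep "/" unescaped so that folder/document.jpg stays a readable path.
    return "/" + quote(object_key, safe="/") + "?" + urlencode(params)


target = build_ocr_request_target(
    "folder/document.jpg", type="general", language_type="zh", ispdf=True, pdf_pagenumber=1
)
print(target)
# /folder/document.jpg?ci-process=OCR&type=general&language-type=zh&ispdf=true&pdf-pagenumber=1
```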

Request parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| ObjectKey | Object name, such as folder/document.jpg. | String | Yes |
| ci-process | CI processing capability. For OCR, the value is fixed at OCR. | String | Yes |
| type | OCR recognition type. Valid values: general (general print recognition), accurate (high-precision print recognition), efficient (simplified print recognition), fast (high-speed print recognition), handwriting (handwriting recognition). Default value: general. | String | No |
| language-type | Language type. This parameter takes effect only if type is general. The language can be automatically recognized or manually specified. Chinese-English mix (zh) is selected by default. Text that mixes English with any one supported language can be recognized together. For valid values, see Recognizable language types. | String | No |
| ispdf | Whether to enable PDF recognition. This parameter takes effect only if type is general or fast. Valid values: true, false. Default value: false. When enabled, both images and PDF files can be recognized. | Boolean | No |
| pdf-pagenumber | Number of the PDF page to be recognized. This parameter takes effect only if type is general or fast, the uploaded file is a PDF, and ispdf is true. Only a single PDF page can be recognized per request. Default value: 1. | Integer | No |
| isword | Whether to return word information after recognition. This parameter takes effect only if type is general or accurate. Valid values: true, false. Default value: false. | Boolean | No |
| enable-word-polygon | Whether to output four-point word coordinates. This parameter takes effect only if type is handwriting. Valid values: true, false. Default value: false. | Boolean | No |
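
Several of the optional parameters take effect only for certain values of type. The following sketch shows one way to flag parameters that would be ignored before sending a request. The name `ignored_parameters` is hypothetical, and the rules are transcribed from the parameter descriptions above rather than taken from any SDK.

```python
# Recognition types for which each optional parameter takes effect,
# transcribed from the parameter descriptions above.
EFFECTIVE_TYPES = {
    "language-type": {"general"},
    "ispdf": {"general", "fast"},
    "pdf-pagenumber": {"general", "fast"},
    "isword": {"general", "accurate"},
    "enable-word-polygon": {"handwriting"},
}
VALID_TYPES = {"general", "accurate", "efficient", "fast", "handwriting"}


def ignored_parameters(params: dict) -> list:
    """Return the sorted names of parameters that would not take effect."""
    ocr_type = params.get("type", "general")  # general is the documented default
    if ocr_type not in VALID_TYPES:
        raise ValueError(f"unknown OCR type: {ocr_type}")
    ignored = {
        name for name in params
        if name in EFFECTIVE_TYPES and ocr_type not in EFFECTIVE_TYPES[name]
    }
    # pdf-pagenumber additionally requires ispdf to be true.
    if "pdf-pagenumber" in params and str(params.get("ispdf", "false")).lower() != "true":
        ignored.add("pdf-pagenumber")
    return sorted(ignored)


print(ignored_parameters({"type": "handwriting", "isword": "true", "enable-word-polygon": "true"}))
# ['isword']
```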

Recognizable language types

| Value | Language | Value | Language |
| --- | --- | --- | --- |
| zh | Chinese-English mix | rus | Russian |
| zh_rare | English, digits, rare Chinese characters, traditional Chinese characters, special symbols | ita | Italian |
| auto | Automatic | hol | Dutch |
| mix | Multi-language | swe | Swedish |
| jap | Japanese | fin | Finnish |
| kor | Korean | dan | Danish |
| spa | Spanish | nor | Norwegian |
| fre | French | hun | Hungarian |
| ger | German | tha | Thai |
| por | Portuguese | hi | Hindi |
| vie | Vietnamese | ara | Arabic |
| may | Malay | | |



Request headers

This API only uses common request headers. For more information, see Common Request Headers.

Request body

The request body of this request is empty.

Response

Response headers

This API only returns common response headers. For more information, see Common Response Headers.

Response body

The response body returns application/xml data. The following contains all the nodes:
<Response>
  <TextDetections>
    <DetectedText></DetectedText>
    <Confidence></Confidence>
    <Polygon>
      <X></X>
      <Y></Y>
    </Polygon>
    <ItemPolygon>
      <X></X>
      <Y></Y>
      <Width></Width>
      <Height></Height>
    </ItemPolygon>
    <Words>
      <Confidence></Confidence>
      <Character></Character>
      <WordCoordPoint>
        <WordCoordinate>
          <X></X>
          <Y></Y>
        </WordCoordinate>
      </WordCoordPoint>
    </Words>
  </TextDetections>
  <Language></Language>
  <Angel></Angel>
  <PdfPageSize></PdfPageSize>
  <RequestId></RequestId>
</Response>
The nodes are as described below:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Response | None | Response container | Container |
Response has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| TextDetections | Response | Information of the recognized text, including the text line content, confidence, text line coordinates, and text line coordinates after rotation correction. | Container |
| Language | Response | Detected language. For more information on the supported languages, see the description of the language-type input parameter. | String |
| Angel | Response | Image rotation angle in degrees. 0° indicates horizontal text, a positive value indicates clockwise rotation, and a negative value indicates anticlockwise rotation. | Float |
| PdfPageSize | Response | Total number of pages in the PDF, returned if the input is a PDF. Default value: 0. | Integer |
| RequestId | Response | Unique ID of the request. Each request returns a unique ID, which is required for troubleshooting. | String |
TextDetections has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| DetectedText | TextDetections | Recognized text line content. | String |
| Confidence | TextDetections | Confidence between 0 and 100. | Integer |
| Polygon | TextDetections | Text line coordinates, which are represented by the coordinates of vertices. Note: This field may return null, indicating that no valid values can be obtained. | Container |
| ItemPolygon | TextDetections | Pixel coordinates of the text line in the image after rotation correction, in the format (abscissa of the top-left corner, ordinate of the top-left corner, width, height). | Container |
| Words | TextDetections | Information of the recognized words (including characters and confidence). Supported recognition types are general and accurate. | Container |
| WordPolygon | TextDetections | Array of word coordinates, which are represented by the coordinates of four vertices. Note: This field may return null, indicating that no valid values can be obtained. The supported recognition type is handwriting. | Container |
Polygon has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| X | Polygon | Abscissa. | Integer |
| Y | Polygon | Ordinate. | Integer |
ItemPolygon has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| X | ItemPolygon | Abscissa of the top-left corner. | Integer |
| Y | ItemPolygon | Ordinate of the top-left corner. | Integer |
| Width | ItemPolygon | Width. | Integer |
| Height | ItemPolygon | Height. | Integer |
Words has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| Confidence | Words | Confidence between 0 and 100. | Integer |
| Character | Words | Recognized word information. | String |
| WordCoordPoint | Words | Four-point coordinates of the word in the input image. Supported recognition types are general and accurate. | Container |
WordCoordPoint has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| WordCoordinate | WordCoordPoint | Coordinates of the word in the input image, which are represented by the coordinates of four vertices and returned clockwise starting from the top-left vertex. | Container |
WordCoordinate has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| X | WordCoordinate | Abscissa. | Integer |
| Y | WordCoordinate | Ordinate. | Integer |
WordPolygon has the following sub-nodes:
| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| LeftTop | WordPolygon | Coordinates of the top-left vertex. | Container |
| RightTop | WordPolygon | Coordinates of the top-right vertex. | Container |
| RightBottom | WordPolygon | Coordinates of the bottom-right vertex. | Container |
| LeftBottom | WordPolygon | Coordinates of the bottom-left vertex. | Container |
LeftTop, RightTop, RightBottom, and LeftBottom have the following sub-nodes:
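| Node Name (Keyword) | Parent Node | Description | Type |
| --- | --- | --- | --- |
| X | LeftTop, RightTop, RightBottom, or LeftBottom | Abscissa. | Integer |
| Y | LeftTop, RightTop, RightBottom, or LeftBottom | Ordinate. | Integer |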
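
The node layout described above maps directly onto a small parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The `TextLine` class and the `parse_ocr_response` function are hypothetical names, and the embedded sample is a shortened response written for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class TextLine:
    text: str
    confidence: int
    words: List[str]


def parse_ocr_response(xml_text: str) -> List[TextLine]:
    """Extract text lines, confidences, and words from an OCR response body."""
    root = ET.fromstring(xml_text)
    lines = []
    for detection in root.findall("TextDetections"):
        lines.append(TextLine(
            text=detection.findtext("DetectedText", default=""),
            # Matches only the direct child, not the Confidence inside Words.
            confidence=int(detection.findtext("Confidence", default="0")),
            # Words elements may be absent; the list is then empty.
            words=[w.findtext("Character", default="") for w in detection.findall("Words")],
        ))
    return lines


sample = """<Response>
  <TextDetections>
    <DetectedText>Hey you</DetectedText>
    <Confidence>99</Confidence>
    <Words><Character>Hey</Character><Confidence>99</Confidence></Words>
    <Words><Character>You</Character><Confidence>99</Confidence></Words>
  </TextDetections>
  <Language>mix</Language>
</Response>"""

for line in parse_ocr_response(sample):
    print(line.text, line.confidence, line.words)
# Hey you 99 ['Hey', 'You']
```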

Error codes

There are no special error messages for this request. For common error messages, see Error Codes.

Samples

Request

GET /<ObjectKey>?ci-process=OCR&type=general&language-type=zh&ispdf=true&isword=true HTTP/1.1
Authorization: q-sign-algorithm=sha1&q-ak=AKIDZfbOAo7cllgPvF9cXFrJD0a1ICvR****&q-sign-time=1497530202;1497610202&q-key-time=1497530202;1497610202&q-header-list=&q-url-param-list=&q-signature=28e9a4986df11bed0255e97ff90500557e0e****
Host: examplebucket-1250000000.cos.ap-beijing.myqcloud.com

Response

HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 414641
Date: Thu, 15 Jun 2017 12:37:29 GMT
Server: tencent-image
x-cos-request-id: NWFjMzQ0MDZfOTBmYTUwXzZkZV8z****

<Response>
  <Angel>359.99</Angel>
  <Language>mix</Language>
  <PdfPageSize>0</PdfPageSize>
  <RequestId>NTk0MjdmODlfMjQ4OGY3XzYzYzhf****</RequestId>
  <TextDetections>
    <Confidence>99</Confidence>
    <DetectedText>Hey you</DetectedText>
    <ItemPolygon>
      <Height>64</Height>
      <Width>123</Width>
      <X>140</X>
      <Y>167</Y>
    </ItemPolygon>
    <Polygon>
      <X>140</X>
      <Y>167</Y>
    </Polygon>
    <Polygon>
      <X>263</X>
      <Y>167</Y>
    </Polygon>
    <Polygon>
      <X>263</X>
      <Y>231</Y>
    </Polygon>
    <Polygon>
      <X>140</X>
      <Y>231</Y>
    </Polygon>
    <Words>
      <Character>Hey</Character>
      <Confidence>99</Confidence>
      <WordCoordPoint>
        <WordCoordinate>
          <X>212</X>
          <Y>167</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>341</X>
          <Y>167</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>341</X>
          <Y>231</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>212</X>
          <Y>231</Y>
        </WordCoordinate>
      </WordCoordPoint>
    </Words>
    <Words>
      <Character>You</Character>
      <Confidence>99</Confidence>
      <WordCoordPoint>
        <WordCoordinate>
          <X>341</X>
          <Y>167</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>263</X>
          <Y>167</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>263</X>
          <Y>231</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>341</X>
          <Y>230</Y>
        </WordCoordinate>
      </WordCoordPoint>
    </Words>
  </TextDetections>
  <TextDetections>
    <Confidence>99</Confidence>
    <DetectedText>Bye bye</DetectedText>
    <ItemPolygon>
      <Height>43</Height>
      <Width>245</Width>
      <X>526</X>
      <Y>1444</Y>
    </ItemPolygon>
    <Polygon>
      <X>526</X>
      <Y>1444</Y>
    </Polygon>
    <Polygon>
      <X>771</X>
      <Y>1444</Y>
    </Polygon>
    <Polygon>
      <X>771</X>
      <Y>1487</Y>
    </Polygon>
    <Polygon>
      <X>526</X>
      <Y>1487</Y>
    </Polygon>
    <Words>
      <Character>Bye</Character>
      <Confidence>99</Confidence>
      <WordCoordPoint>
        <WordCoordinate>
          <X>564</X>
          <Y>1444</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>608</X>
          <Y>1444</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>608</X>
          <Y>1487</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>564</X>
          <Y>1487</Y>
        </WordCoordinate>
      </WordCoordPoint>
    </Words>
    <Words>
      <Character>Bye</Character>
      <Confidence>99</Confidence>
      <WordCoordPoint>
        <WordCoordinate>
          <X>608</X>
          <Y>1444</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>641</X>
          <Y>1444</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>641</X>
          <Y>1487</Y>
        </WordCoordinate>
        <WordCoordinate>
          <X>608</X>
          <Y>1487</Y>
        </WordCoordinate>
      </WordCoordPoint>
    </Words>
  </TextDetections>
</Response>
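
In this sample, the four Polygon vertices of each text line coincide with the rectangle given by ItemPolygon, because the reported rotation is close to a full turn. The sketch below converts an ItemPolygon (top-left corner, width, height) into four vertices listed clockwise from the top-left. The function name is hypothetical, and for noticeably rotated images the two representations should not be expected to match, since ItemPolygon is expressed in rotation-corrected coordinates.

```python
from typing import List, Tuple


def item_polygon_to_vertices(x: int, y: int, width: int, height: int) -> List[Tuple[int, int]]:
    """Return the four corners clockwise, starting from the top-left corner."""
    return [(x, y), (x + width, y), (x + width, y + height), (x, y + height)]


# ItemPolygon of the "Hey you" line in the sample response.
print(item_polygon_to_vertices(140, 167, 123, 64))
# [(140, 167), (263, 167), (263, 231), (140, 231)]

# ItemPolygon of the "Bye bye" line in the sample response.
print(item_polygon_to_vertices(526, 1444, 245, 43))
# [(526, 1444), (771, 1444), (771, 1487), (526, 1487)]
```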
