Select Object Content
) supports only virtual-hosted-style requests, but not path-style requests.cos:GetObject
permission.POST /<ObjectKey>?select&select-type=2 HTTP/1.1Host: <BucketName-APPID>.cos.<Region>.myqcloud.comDate: dateAuthorization: Auth StringRequest body
Host: <BucketName-APPID>.cos.<Region>.myqcloud.com
,<BucketName-APPID> is the bucket name followed by the APPID, such as examplebucket-1250000000
(see Bucket Overview > Basic Information and Bucket Overview > Bucket Naming Conventions), and <Region> is a COS region (see Regions and Access Endpoints).select
and select-type=2
are required, where the former represents a select request and the latter represents the version of this API.<?xml version="1.0" encoding="UTF-8"?><SelectRequest><Expression>Select * from COSObject</Expression><ExpressionType>SQL</ExpressionType><InputSerialization><CompressionType>GZIP</CompressionType><CSV><FileHeaderInfo>IGNORE</FileHeaderInfo><RecordDelimiter>\\n</RecordDelimiter><FieldDelimiter>,</FieldDelimiter><QuoteCharacter>"</QuoteCharacter><QuoteEscapeCharacter>"</QuoteEscapeCharacter><Comments>#</Comments><AllowQuotedRecordDelimiter>FALSE</AllowQuotedRecordDelimiter></CSV></InputSerialization><OutputSerialization><CSV><QuoteFields>ASNEEDED</QuoteFields><RecordDelimiter>\\n</RecordDelimiter><FieldDelimiter>,</FieldDelimiter><QuoteCharacter>"</QuoteCharacter><QuoteEscapeCharacter>"</QuoteEscapeCharacter></CSV></OutputSerialization><RequestProgress><Enabled>FALSE</Enabled></RequestProgress></SelectRequest>
<?xml version="1.0" encoding="UTF-8"?><SelectRequest><Expression>Select * from COSObject</Expression><ExpressionType>SQL</ExpressionType><InputSerialization><CompressionType>GZIP</CompressionType><JSON><Type>DOCUMENT</Type></JSON></InputSerialization><OutputSerialization><JSON><RecordDelimiter>\\n</RecordDelimiter></JSON></OutputSerialization><RequestProgress><Enabled>FALSE</Enabled></RequestProgress></SelectRequest>
<?xml version="1.0" encoding="UTF-8"?><SelectRequest><Expression>Select * from COSObject</Expression><ExpressionType>SQL</ExpressionType><InputSerialization><CompressionType>GZIP</CompressionType><Parquet></Parquet></InputSerialization><OutputSerialization><JSON><RecordDelimiter>\\n</RecordDelimiter></JSON></OutputSerialization><RequestProgress><Enabled>FALSE</Enabled></RequestProgress></SelectRequest>
InputSerialization
(required) specifies the format of the object to extract. It can be set to CSV
, JSON
, or Parquet
.OutputSerialization
specifies the format to save the extraction result. It can be set to CSV
or JSON
.Node Name | Parent Node | Description | Type | Required |
Expression | SelectRequest | An SQL expression for the extraction. For example, SELECT s._1 FROM COSObject s extracts the first column of data from a CSV object. For more information about SQL expressions, please see SELECT Command. | String | Yes |
ExpressionType | SelectRequest | Expression type. This parameter is an extension. Currently, only SQL expressions and parameters are supported. | String | Yes |
InputSerialization | SelectRequest | Format of the object to extract | Container | Yes |
OutputSerialization | SelectRequest | Format to save the extraction result | Container | Yes |
RequestProgress | SelectRequest | Whether to return the query progress ( QueryProgress ). If this feature is enabled, COS Select will return the query progress periodically. | Container | No |
InputSerialization
Node Name | Parent Node | Description | Type | Required |
CompressionType | InputSerialization | Compression type of the object to extract. If the object is not compressed, set this parameter to NONE (default). Otherwise, set it to GZIP or BZIP2 , which are the only two compression types supported by COS Select. | String | No |
CSV/JSON/PARQUET | InputSerialization | Parameters required for each object format. For example, if the object format is CSV, the delimiter needs to be specified. | Container | Yes |
CSV
(a subnode of InputSerialization
)Node Name | Parent Node | Description | Type | Required |
RecordDelimiter | CSV | A delimiter that separates data records of the CSV object into multiple rows. You can set this parameter to any octal character such as a comma (,), semicolon (:), or tab. The delimiter can be up to 2 bytes, which means that delimiters such as \\r\\n are supported. Default value: \\n | String | No |
FieldDelimiter | CSV | A delimiter that separates data records into multiple columns for each row. You can set this parameter to any octal character. This parameter can be up to 1 byte. Default value: , | String | No |
QuoteCharacter | CSV | If a string in the CSV object to extract contains delimiters, you can use this parameter as the escape so that the string will not be interpreted as several parts. The default value is " . For example, if the object contains the string "a, b" , the double quotation marks can prevent the string from being interpreted as two separate characters a and b . | String | No |
QuoteEscapeCharacter | CSV | If the string to extract contains " , use another " as the escape so that the string can be interpreted. For example, the string """ a , b """ will be parsed as " a , b " . Default value: " | String | No |
AllowQuotedRecordDelimiter | CSV | Whether the object contains characters that are identical with the delimiter and need to be escaped with "TRUE : yes (compromises extraction performance) FALSE (default): no | Boolean | No |
FileHeaderInfo | CSV | Whether the object to extract contains a column header. Valid values: NONE : no USE : yes, and you want to use it for extraction (example: SELECT "name" FROM COSObject ) IGNORE : yes, but you don’t want to use it for extraction (You can still use the column index for extraction, for example, SELECT s._1 FROM COSObject s ). | Enum | No |
Comments | CSV | Sets a comment column. This character will be added to the first character of the column. If a column is specified as the comment column, COS Select will not analyze it. Default value: # | String | No |
JSON
(a subnode of InputSerialization
)Node Name | Parent Node | Description | Type | Required |
Type | JSON | JSON type. Valid values: DOCUMENT : The JSON file contains only an independent JSON object that is divided into multiple rowsLINES : Each row in the JSON file contains an independent JSON object. | Enum | Yes |
OutputSerialization
Node Name | Parent Node | Description | Type | Required |
CSV /JSON | OutputSerialization | Format of the extraction result. Valid values: CSV , JSON | Container | Yes (either CSV or JSON) |
CSV
(a subnode of OutputSerialization
)Node Name | Parent Node | Description | Type | Required |
QuoteFields | CSV | Whether to use " as the escape for the extraction result. Valid values: ALWAYS : yes ASNEEDED (default): as needed | String | Yes |
RecordDelimiter | CSV | A delimiter that separates data records of the extraction result into multiple rows. You can set this parameter to any octal character such as a comma (,), semicolon (:), or tab. The delimiter can be up to 2 bytes, which means that delimiters such as \\r\\n are supported. Default value: \\n | String | No |
FieldDelimiter | CSV | A delimiter that separates data records of the extraction result into multiple columns for each row. You can set this parameter to any octal character. This parameter can be up to 1 byte. Default value: , | String | No |
QuoteCharacter | CSV | If a string in the extraction result contains delimiters, you can use this parameter as the escape so that the string will not be interpreted as several parts in subsequent analysis. The default value is " . For example, if the extraction result contains the string a, b , the double quotation marks can prevent the string from being interpreted as two separate characters a and b . Instead, COS Select will save the string as "a, b" . | String | No |
QuoteEscapeCharacter | CSV | If a string in the extraction result contains " , use another " as the escape so that the string can be interpreted. For example, the string " a , b" will be saved as """ a , b """ . Default value: " | String | No |
JSON
(a subnode of OutputSerialization
)Node Name | Parent Node | Description | Type | Required |
RecordDelimiter | JSON | A delimiter that separates data records of the extraction result into multiple rows. You can set this parameter to any octal character such as a comma (,), semicolon (:), or tab. The delimiter can be up to 2 bytes, which means that delimiters such as \\r\\n are supported. Default value: \\n | String | No |
RequestProgress
Node Name | Parent Node | Description | Type | Required |
Enabled | RequestProgress | Whether COS Select should return the query progress periodically. Default value: FALSE | Boolean | No |
<Message 1><Message 2><Message 3>......<Message n>
Total length of a response body = Length of the prelude + length of the prelude CRC code + length of the payload + length of the header(s) + length of the message CRC code
Total length of a response body = Length of the payload + length of the header(s) + 16
Component | Description |
prelude | Records the total length of each message and the total length of all headers separately. Each record is of 4 bytes, and the total length is 8 bytes: total byte-length : the total length of the message, which is encoded in Big Endian and of 4 bytes in total with the capacity of the record itself included. headers byte-length : the total length of all headers, which is encoded in Big Endian and of 4 bytes in total with the capacity occupied by the record excluded. |
prelude CRC | A prelude CRC is encoded in Big Endian and contains a total of 4 bytes. It helps the program quickly determine whether the prelude information is correct so as to reduce blocking during buffering. |
header | Metadata of the extraction result recorded by the message, such as data type and body format. The length of this part in bytes varies by data type. A header is stored as a key-value (KV) pair and encoded in UTF-8. The metadata recorded in a header can be displayed in any order, but each metadata entry is recorded only once. Depending on the data type, the following headers may appear in the result returned by COS Select: MessageType Header : This header represents the response type, where the key is ":message-type" and the value can be "error" or "event". "error" indicates that this record is an error message, and "event" indicates that this record is a specific event. EventType Header : This header records the event type, where the key is ":event-type" and the value can be "Records", "Cont", "Progress", "Stats", or "End". "Records" indicates that the event is the returned extraction record, "Cont" the TCP connection hold, "Progress" the periodically returned extraction result, "Stats" the statistics of the query, and "End" the end of the query. ErrorCode Header : This header records the error code, where the key is ":error-code" and the value can be an error code listed in Special Error Codes. ErrorMessage Header : This header records the error message, where the key is ":error-message" and the value can be an error message returned by the server, which can be used to locate the error. |
Payload | Records the extraction result or official information related to the request. |
Message CRC | CRC code encoded in Big Endian containing a total of 4 bytes. |
Component | Description |
Header Name Byte-Length | Records the length of the header name in bytes |
Header Name | Header type. Value range: ":message-type", ":event-type", ":error-code", ":error-message" ":message-type" indicates that the header records the response type ":event-type" indicates that the header records the event type ":error-code" indicates that the header records the error code " :error-message" indicates that the header records the error message. |
Header Value Type | Type of the header value, which is always 7 for COS Select, indicating that the type is String |
Value String Byte-Length | Length of the header value in bytes, which is always 2 bytes |
Header Value String | Body of the header, i.e., the metadata of the payload, where the length of the header value in bytes depends on the response type |
Response Type | Description |
Records message | An extraction record message, which can contain a single record, a partial record, or multiple records, depending on the number of extraction results. A response body may contain multiple records messages |
Continuation message | A connection continuation message that is sent by COS Select periodically to maintain the TCP connection and appears randomly in the response body. It is recommended to make your client able to automatically identify this type of messages and filter them out so as to avoid smudging the extraction results |
Progress message | A progress message returned by COS Select periodically to indicate the current query progress |
Stats message | A statistics message about the query returned by COS Select after the query ends |
End message | An end message indicating that the query has ended and there is no subsequent response data. The query can be considered to have ended only when a message of this type is received |
RequestLevelError message | Error message including the error causes. This parameter is returned when a COS Select error occurs during the query. If COS Select returns this message, it will not return an end message. |
<?xml version="1.0" encoding="UTF-8"?><Progress><BytesScanned>512</BytesScanned><BytesProcessed>1024</BytesProcessed><BytesReturned>1024</BytesReturned></Progress>
<?xml version="1.0" encoding="UTF-8"?><Stats><BytesScanned>512</BytesScanned><BytesProcessed>1024</BytesProcessed><BytesReturned>1024</BytesReturned></Stats>
Error Code | Error Message | HTTP Status Code |
InvalidXML | The XML is invalid | 400 Bad Request |
MissingRequiredParameter | The SelectRequest entity is missing a required parameter | 400 Bad Request |
MissingExpectedExpression | The SQL expression is missing | 400 Bad Request |
MissingInputSerialization | The input serialization is missing | 400 Bad Request |
InvalidCompressionFormat | The file is not in a supported compression format. Only GZIP and BZIP2 are supported | 400 Bad Request |
MissingInputFormat | The input format is missing | 400 Bad Request |
InvalidFileHeaderInfo | The input FileHeaderInfo is invalid. Only NONE, USE, and IGNORE are supported | 400 Bad Request |
InvalidRequestParameter | The input RecordDelimiter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The input FieldDelimiter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The input QuoteCharacter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The input AllowQuoteRecordDelimiter of CSV is invalid. Only TRUE and FALSE are supported | 400 Bad Request |
InvalidJsonType | The JsonType is invalid. Only DOCUMENT and LINES are supported | 400 Bad Request |
MissingOutputSerialization | The output serialization is missing | 400 Bad Request |
MissingOutputFormat | The output format is missing | 400 Bad Request |
InvalidQuoteFields | The QuoteFields is invalid. Only ALWAYS and ASNEEDED are supported | 400 Bad Request |
InvalidRequestParameter | The output RecordDelimiter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The output FieldDelimiter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The output QuoteCharacter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The output QuoteEscapeCharacter of CSV is invalid | 400 Bad Request |
InvalidRequestParameter | The output RecordDelimiter of JSON is invalid | 400 Bad Request |
SQLParsingError | Encountered an error parsing the SQL expression | 400 Bad Request |
SQLParsingError | Other expressions are not allowed in the SELECT list when '*' is used without dot notation. | 400 Bad Request |
SQLParsingError | The SQL expression contains an empty SELECT | 400 Bad Request |
SQLParsingError | GROUP is not supported in the SQL expression | 400 Bad Request |
SQLParsingError | UNION is not supported in the SQL expression | 400 Bad Request |
SQLParsingError | FROM is missing in the SQL expression | 400 Bad Request |
SQLParsingError | ORDER is not supported in the SQL expression | 400 Bad Request |
SQLParsingError | The column index is invalid in the SQL expression | 400 Bad Request |
SQLParsingError | The table alias is invalid in WHERE | 400 Bad Request |
Bzip2DecompressError | Encountered an error decompressing the bzip2 file | 400 Bad Request |
Bzip2DecompressError | bzip2 is not applicable to the queried object | 400 Bad Request |
GzipDecompressError | Encountered an error decompressing the GZIP file | 400 Bad Request |
GzipDecompressError | GZIP is not applicable to the queried object | 400 Bad Request |
Busy | The service is busy. Please retry later | 400 Bad Request |
Overload | The service is overload. Please retry later | 400 Bad Request |
AmbiguousFieldName | Field name matches to multiple fields in the file | 400 Bad Request |
ComparisonFailed | Attempt to compare failed | 400 Bad Request |
CastFailed | Attempt to convert from one data type to another using CAST failed in the SQL expression. | 400 Bad Request |
OverMaxRecordSize | The length of a record in the input or result is greater than maxCharsPerRecord of 1 MB | 400 Bad Request |
LastRecordParseFail | Please check the last record in the input | 400 Bad Request |
CSVParsingError | Encountered an error parsing the CSV file | 400 Bad Request |
JSONParsingError | Encountered an error parsing the JSON file | 400 Bad Request |
ErrorWritingRow | Encountered an error parsing the SELECT result. Please try again | 400 Bad Request |
InvalidRequestParameter | The input Comment of CSV is invalid | 400 Bad Request |
InvalidTextEncoding | UTF-8 encoding is required. Please check the file and try again. | 400 Bad Request |
NoSuchKey | The specified key does not exist | 404 Not Found |
AccessDenied | Access Denied | 403 Forbidden |
MethodNotAllowed | The specified method is not allowed against this resource | 405 Method Not Allowed |
InternalError | We encountered an internal error. Please try again | 500 Internal Server |
exampleobject.csv
and stored in the examplebucket-1250000000
bucket that resides in the Beijing (ap-beijing) region.POST /exampleobject.csv?select&select-type=2 HTTP/1.1Host: examplebucket-1250000000.cos.ap-beijing.myqcloud.comDate: Tue, 12 Jan 2019 11:49:52 GMTAuthorization: authorization stringContent-Length: content length<?xml version="1.0" encoding="UTF-8"?><SelectRequest><Expression>Select * from COSObject</Expression><ExpressionType>SQL</ExpressionType><InputSerialization><CompressionType>None</CompressionType><CSV><FileHeaderInfo>IGNORE</FileHeaderInfo><RecordDelimiter>\\n</RecordDelimiter><FieldDelimiter>,</FieldDelimiter><QuoteCharacter>"</QuoteCharacter><QuoteEscapeCharacter>"</QuoteEscapeCharacter><Comments>#</Comments></CSV></InputSerialization><OutputSerialization><CSV><QuoteFields>ASNEEDED</QuoteFields><RecordDelimiter>\\n</RecordDelimiter><FieldDelimiter>,</FieldDelimiter><QuoteCharacter>"</QuoteCharacter><QuoteEscapeCharacter>"</QuoteEscapeCharacter></CSV></OutputSerialization></SelectRequest>
Expression
node. For more information about commands, please see SELECT Command. Some common extraction scenarios are described below.s._n
to filter data records in the n
column (the minimum value of n
is 1). The following command filters data records that are greater than 100 in column 3 and returns columns 1 and 2 of these matching records: SELECT s._1, s._2 FROM COSObject s WHERE s._3 > 100
FileHeaderInfo
to Use
), you can use s.name
for indexing. The following command filters records using a header named Id
and FirstName
:SELECT s.Id, s.FirstName FROM COSObject s
SELECT count(*) FROM COSObject s WHERE s._1 < 1
HTTP/1.1 200 OKx-cos-id-2: cos_id_demox-cos-request-id: cos_request_id_demoDate: Tue, 12 Jan 2019 11:50:29 GMTA series of messages
exampleobject.json
and stored in the examplebucket-1250000000
bucket that resides in the Beijing (ap-beijing) region.POST /exampleobject.json?select&select-type=2 HTTP/1.1Host: examplebucket-1250000000.cos.ap-beijing.myqcloud.comDate: Tue, 12 Jan 2019 11:52:29 GMTAuthorization: authorization stringContent-Length: content length<?xml version="1.0" encoding="UTF-8"?><SelectRequest><Expression>Select * from COSObject</Expression><ExpressionType>SQL</ExpressionType><InputSerialization><CompressionType>NONE</CompressionType><JSON><Type>DOCUMENT</Type></JSON></InputSerialization><OutputSerialization><CSV><QuoteFields>ASNEEDED</QuoteFields><RecordDelimiter>\\n</RecordDelimiter><FieldDelimiter>,</FieldDelimiter><QuoteCharacter>"</QuoteCharacter><QuoteEscapeCharacter>"</QuoteEscapeCharacter></CSV></OutputSerialization></SelectRequest>
Expression
node. For more information about commands, please see SELECT Command. Some common extraction scenarios are described below.city
value is Seattle from the object and returns the country
and city
information of these records:SELECT s.country, s.city from COSObject s where s.city = 'Seattle'
SELECT count(*) FROM COSObject s
Range
to specify a part of an object to return.
Was this page helpful?