Overview
Data can be corrupted in transit between the client and the server. COS guarantees the integrity of uploaded data through MD5 verification: an upload succeeds only when the MD5 checksum of the data received by the COS server matches the checksum you set.
Each object in COS has a corresponding ETag, which identifies the object content at the time the object was created. However, the ETag is not necessarily the MD5 checksum of the object content, so it cannot be used to verify whether a downloaded object is the same as the original object. Instead, you can use custom object metadata (x-cos-meta-*) to verify object consistency.
Data Verification Methods
Verify an uploaded object
If you need to verify whether the object uploaded to COS is the same as the local object, you can set the Content-MD5 field in the HTTP upload request to the Base64-encoded MD5 checksum of the object content. After that, the COS server will verify the uploaded object. Only when the MD5 checksum received by the COS server is the same as the Content-MD5 value you set can the object be successfully uploaded.
Verify a downloaded object
If you need to verify whether the downloaded object is the same as the original object, you can use a verification algorithm to calculate the checksum of the object when it is uploaded, set the checksum of the object through custom metadata, recalculate the checksum of the object after downloading the object, and then verify it against the custom metadata. In this mode, you can choose the verification algorithm as you wish, but for the same object, the algorithm used during upload should be the same as that used during download.
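As a minimal sketch of this flow, the Python snippet below computes a checksum with a caller-chosen algorithm and later compares it with the stored value. The helper names compute_checksum and matches_stored_checksum, the choice of SHA-256, and the metadata name x-cos-meta-sha256 are illustrative assumptions and not part of the COS API. The only requirement taken from the text is that upload and download use the same algorithm.

```python
import hashlib


def compute_checksum(data, algorithm="sha256"):
    """Return the hex digest of `data` (bytes) using the named hashlib algorithm."""
    digest = hashlib.new(algorithm)
    digest.update(data)
    return digest.hexdigest()


def matches_stored_checksum(data, stored_checksum, algorithm="sha256"):
    """Recompute the checksum of downloaded bytes and compare it with the stored value."""
    return compute_checksum(data, algorithm) == stored_checksum.lower()


# At upload time: compute the value to store in custom metadata
# (for example under a hypothetical name such as x-cos-meta-sha256).
original = b"example object content"
stored = compute_checksum(original)

# At download time: recompute with the same algorithm and compare.
intact = matches_stored_checksum(original, stored)
corrupted = matches_stored_checksum(b"example object c0ntent", stored)
```

Here intact is True and corrupted is False, because a single changed byte produces a different digest.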
API Samples
Simple Upload Request
Below is a sample request for object upload. When uploading the object, set the Content-MD5 to the Base64-encoded MD5 checksum of the object content and set the custom metadata "x-cos-meta-md5" to the checksum of the object. Only when the MD5 checksum received by the COS server is the same as the Content-MD5 value you set can the object be successfully uploaded.
Note:
In the sample, the checksum of the object is obtained through the MD5 checksum algorithm, and you can choose other algorithms as you wish.
PUT /exampleobject HTTP/1.1
Host: examplebucket-1250000000.cos.ap-beijing.myqcloud.com
Date: Fri, 21 Jun 2019 09:24:28 GMT
Content-Type: image/jpeg
Content-Length: 13
Content-MD5: ti4QvKtVqIJAvZxDbP/c+Q==
Authorization: q-sign-algorithm=sha1&q-ak=AKID8A0fBVtYFrNm02oY1g1JQQF0c3JO****&q-sign-time=1561109068;1561116268&q-key-time=1561109068;1561116268&q-header-list=content-length;content-md5;content-type;date;host&q-url-param-list=&q-signature=998bfc8836fc205d09e455c14e3d7e623bd2****
x-cos-meta-md5: b62e10bcab55a88240bd9c436cffdcf9
Connection: close
[Object Content]
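The two checksum headers in this sample carry the same MD5 digest in different encodings: Content-MD5 is the Base64 encoding of the raw 16-byte digest, while x-cos-meta-md5 holds its hexadecimal form. The following sketch, which uses only the Python standard library, converts the hexadecimal value from the sample into the Base64 value. The helper name content_md5_header is hypothetical, and the payload passed to it is a made-up stand-in, not the object from the sample.

```python
import base64
import binascii
import hashlib


def content_md5_header(data):
    """Return the Base64-encoded MD5 digest of `data` (bytes), as used for Content-MD5."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


# Hexadecimal digest taken from the x-cos-meta-md5 header in the sample request.
hex_digest = "b62e10bcab55a88240bd9c436cffdcf9"

# Decode the hex string to the raw 16-byte digest, then Base64-encode it.
raw_digest = binascii.unhexlify(hex_digest)
content_md5 = base64.b64encode(raw_digest).decode("ascii")

print(content_md5)  # ti4QvKtVqIJAvZxDbP/c+Q==

# Computing the header directly from a payload (stand-in content).
header_for_payload = content_md5_header(b"hello cos")
```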
Multipart Upload Request
Below is a sample request to initialize a multipart upload. When uploading object parts, you can set the custom metadata of the object by initializing the multipart upload. Here, set the custom metadata "x-cos-meta-md5" as the checksum of the object.
POST /exampleobject?uploads HTTP/1.1
Host: examplebucket-1250000000.cos.ap-beijing.myqcloud.com
Date: Fri, 21 Jun 2019 09:45:12 GMT
Authorization: q-sign-algorithm=sha1&q-ak=AKID8A0fBVtYFrNm02oY1g1JQQF0c3JO****&q-sign-time=1561109068;1561116268&q-key-time=1561109068;1561116268&q-header-list=content-length;content-md5;content-type;date;host&q-url-param-list=&q-signature=998bfc8836fc205d09e455c14e3d7e623bd2****
x-cos-meta-md5: b62e10bcab55a88240bd9c436cffdcf9
Note:
For files uploaded using multipart upload, COS will verify the MD5 checksum of each part instead of the MD5 checksum of the merged file.
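To make the distinction concrete, the sketch below splits a byte string into parts and computes the Base64-encoded MD5 of each part alongside the MD5 of the whole object. It uses only hashlib and base64. The tiny part size and the variable names are illustrative assumptions chosen to keep the example fast; real parts are far larger.

```python
import base64
import hashlib

PART_SIZE = 4  # illustrative only; real parts are much larger
object_body = b"0123456789"

# Split the object into consecutive parts; the last part may be shorter.
parts = [object_body[i:i + PART_SIZE] for i in range(0, len(object_body), PART_SIZE)]

# One Content-MD5 value per part: this is what is checked for each part upload.
part_content_md5 = [
    base64.b64encode(hashlib.md5(part).digest()).decode("ascii") for part in parts
]

# The MD5 of the whole object is a separate value that the per-part checks do not cover.
whole_object_md5 = hashlib.md5(object_body).hexdigest()
```

Because the per-part checks do not cover the merged object, storing a value such as whole_object_md5 in x-cos-meta-md5 is what lets the client check the complete object after download.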
Object Download Response
Below is a sample response obtained after you send an object download request. You can get the custom metadata "x-cos-meta-md5" of the object from the response and then check it against the recalculated checksum of the object to verify whether the downloaded object is the same as the original object.
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 13
Connection: close
Accept-Ranges: bytes
Cache-Control: max-age=86400
Content-Disposition: attachment; filename=example.jpg
Date: Thu, 04 Jul 2019 11:33:00 GMT
ETag: "b62e10bcab55a88240bd9c436cffdcf9"
Last-Modified: Thu, 04 Jul 2019 11:32:55 GMT
Server: tencent-cos
x-cos-request-id: NWQxZGUzZWNfNjI4NWQ2NF9lMWYyXzk1NjFj****
x-cos-meta-md5: b62e10bcab55a88240bd9c436cffdcf9
[Object Content]
SDK Samples
The following example uses the Python SDK to verify object integrity. The complete sample code is as follows.
Note:
The code is based on Python 2.7. For more information on how to use the Python SDK, see Object Operations.
1. Initialization configuration
Configure user attributes, including SecretId, SecretKey, and region, and create a client object.
from qcloud_cos import CosConfig
from qcloud_cos import CosS3Client
from qcloud_cos import CosServiceError
from qcloud_cos import CosClientError
import sys
import os
import logging
import hashlib
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
secret_id = os.environ['COS_SECRET_ID']
secret_key = os.environ['COS_SECRET_KEY']
region = 'ap-beijing'
token = None
config = CosConfig(Region=region, SecretId=secret_id, SecretKey=secret_key, Token=token)
client = CosS3Client(config)
2. Verify an object uploaded using simple upload
(1) Calculate the checksum of the object
Get the checksum of the object through the MD5 checksum algorithm (you can choose other algorithms as you wish).
object_body = 'hello cos'
md5 = hashlib.md5()
md5.update(object_body)
md5_str = md5.hexdigest()
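The snippet above relies on Python 2.7, where a str is a byte string. Under Python 3, hashlib accepts only bytes, so text must be encoded before hashing. A sketch of the equivalent calculation, assuming UTF-8 is the encoding used for the object content:

```python
import hashlib

object_body = 'hello cos'

# hashlib requires bytes under Python 3; encode text explicitly before hashing.
object_bytes = object_body.encode('utf-8')
md5_str = hashlib.md5(object_bytes).hexdigest()
```

The same bytes should be what is uploaded as the object body, so that the stored checksum describes exactly the data COS receives.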
(2) Upload the object using simple upload
EnableMD5=True enables MD5 verification for the upload. The Python SDK calculates the Content-MD5 for you, which increases the time it takes to upload the object. The object is uploaded successfully only when the MD5 checksum of the object received by the COS server matches the Content-MD5.
x-cos-meta-md5 is a custom parameter (named in the format x-cos-meta-*) that holds the MD5 checksum of the object.
response = client.put_object(
    Bucket='examplebucket-1250000000',
    Body='hello cos',
    Key='example-object-1',
    EnableMD5=True,
    Metadata={
        'x-cos-meta-md5' : md5_str
    }
)
print 'ETag: ' + response['ETag']
(3) Download the object
Download the object and get the custom parameter.
response = client.get_object(
    Bucket='examplebucket-1250000000',
    Key='example-object-1'
)
fp = response['Body'].get_raw_stream()
download_object = fp.read()
print "get object body: " + download_object
print 'ETag: ' + response['ETag']
print 'x-cos-meta-md5: ' + response['x-cos-meta-md5']
(4) Verify the object
After successfully downloading the object, you can recalculate the checksum of the object (the verification algorithm should be the same as that used for upload) and check it against the custom parameter "x-cos-meta-md5" to verify whether the downloaded object is the same as the uploaded object.
md5 = hashlib.md5()
md5.update(download_object)
md5_str = md5.hexdigest()
print 'download object md5: ' + md5_str
if md5_str == response['x-cos-meta-md5']:
    print 'MD5 check OK'
else:
    print 'MD5 check FAIL'
3. Verify an object uploaded in parts
(1) Calculate the checksum of the object
Simulate object parts and calculate the checksum of the entire object. The MD5 checksum algorithm is used to obtain the checksum of the object in the sample below, and you can choose other algorithms as you wish.
OBJECT_PART_SIZE = 1024 * 1024
OBJECT_TOTAL_SIZE = OBJECT_PART_SIZE * 1 + 123
object_body = '1' * OBJECT_TOTAL_SIZE
md5 = hashlib.md5()
md5.update(object_body)
md5_str = md5.hexdigest()
(2) Initialize the multipart upload
When initializing the multipart upload, set the custom parameter "x-cos-meta-md5" and use the MD5 checksum of the entire object as the parameter value.
response = client.create_multipart_upload(
    Bucket='examplebucket-1250000000',
    Key='exampleobject-2',
    StorageClass='STANDARD',
    Metadata={
        'x-cos-meta-md5' : md5_str
    }
)
upload_id = response['UploadId']
(3) Upload the object in parts
During a multipart upload, an object is divided into multiple parts (up to 10,000). Each part can be 1 MB to 5 GB in size, and the last part can be smaller than 1 MB. When uploading the parts, you need to set the PartNumber of each part. EnableMD5=True enables the part check, which increases the time it takes to upload the object; the Python SDK calculates the Content-MD5 of each part. A part is uploaded successfully only when the MD5 checksum of the part received by the COS server matches its Content-MD5. After a part is uploaded successfully, its ETag is returned.
part_list = list()
position = 0
left_size = OBJECT_TOTAL_SIZE
part_number = 0
while left_size > 0:
    part_number += 1
    if left_size >= OBJECT_PART_SIZE:
        body = object_body[position:position+OBJECT_PART_SIZE]
    else:
        body = object_body[position:]
    position += OBJECT_PART_SIZE
    left_size -= OBJECT_PART_SIZE
    response = client.upload_part(
        Bucket='examplebucket-1250000000',
        Key='exampleobject-2',
        Body=body,
        PartNumber=part_number,
        UploadId=upload_id,
        EnableMD5=True
    )
    etag = response['ETag']
    part_list.append({'ETag' : etag, 'PartNumber' : part_number})
    print etag + ', ' + str(part_number)
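The slicing arithmetic in the loop above can be factored into a small helper that also checks the documented limit of 10,000 parts. This is a sketch only: split_into_parts and MAX_PARTS are hypothetical names, and the helper just computes offsets without calling COS.

```python
MAX_PARTS = 10000  # maximum number of parts stated above


def split_into_parts(total_size, part_size):
    """Return a list of (part_number, offset, length) tuples covering total_size bytes."""
    if total_size <= 0 or part_size <= 0:
        raise ValueError("total_size and part_size must be positive")
    part_count = (total_size + part_size - 1) // part_size  # ceiling division
    if part_count > MAX_PARTS:
        raise ValueError("part_size is too small: %d parts exceed the limit" % part_count)
    parts = []
    for index in range(part_count):
        offset = index * part_size
        length = min(part_size, total_size - offset)  # the last part may be shorter
        parts.append((index + 1, offset, length))  # part numbers start at 1
    return parts


# Same sizes as the sample: one full 1 MB part plus a 123-byte final part.
layout = split_into_parts(1024 * 1024 + 123, 1024 * 1024)
print(layout)  # [(1, 0, 1048576), (2, 1048576, 123)]
```

Computing the layout up front makes it possible to reject an unsuitable part size before any part is uploaded.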
(4) Complete the multipart upload
After all parts are uploaded, you need to complete the multipart upload. The ETag and PartNumber of each part must correspond one to one; the COS server uses them to verify the parts. After the multipart upload completes, the returned ETag is the unique tag value of the merged object, not the MD5 checksum of the entire object content. Therefore, use the custom parameter to verify the object when downloading it.
response = client.complete_multipart_upload(
    Bucket='examplebucket-1250000000',
    Key='exampleobject-2',
    UploadId=upload_id,
    MultipartUpload={
        'Part' : part_list
    },
)
print "ETag: " + response['ETag']
print "Location: " + response['Location']
print "Key: " + response['Key']
(5) Download the object
Download the object and get the custom parameter.
response = client.get_object(
    Bucket='examplebucket-1250000000',
    Key='exampleobject-2'
)
print 'ETag: ' + response['ETag']
print 'x-cos-meta-md5: ' + response['x-cos-meta-md5']
(6) Verify the object
After successfully downloading the object, you can recalculate the MD5 checksum of the object and check it against the custom parameter "x-cos-meta-md5" to verify whether the downloaded object is the same as the uploaded object.
fp = response['Body'].get_raw_stream()
DEFAULT_CHUNK_SIZE = 1024*1024
md5 = hashlib.md5()
chunk = fp.read(DEFAULT_CHUNK_SIZE)
while chunk:
    md5.update(chunk)
    chunk = fp.read(DEFAULT_CHUNK_SIZE)
md5_str = md5.hexdigest()
print 'download object md5: ' + md5_str
if md5_str == response['x-cos-meta-md5']:
    print 'MD5 check OK'
else:
    print 'MD5 check FAIL'
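The chunked read loop above can be wrapped in a reusable function that works with any file-like object exposing read(size). This is a sketch, assuming the stream yields bytes; md5_of_stream is a hypothetical helper, and io.BytesIO stands in for the stream returned by get_raw_stream().

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Return the hex MD5 of a file-like object, reading chunk_size bytes at a time."""
    md5 = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        md5.update(chunk)
    return md5.hexdigest()


# Stand-in for a downloaded object: same size as the multipart sample above.
data = b'1' * (1024 * 1024 + 123)
streamed_md5 = md5_of_stream(io.BytesIO(data))
same_as_one_shot = streamed_md5 == hashlib.md5(data).hexdigest()
```

Reading in chunks keeps memory use bounded by chunk_size, which matters for large objects uploaded in parts.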