Uploading Large Objects In OpenStack
When using cloud storage such as OpenStack Swift, you may at some point need to upload relatively large files (a couple of gigabytes or more). This creates difficulties for both client and server. From the client's point of view, uploading such a large file is cumbersome, especially over a poor connection. From the storage point of view, handling very large objects is not trivial either.
For that reason, Object Storage (swift) uses segmentation to support the upload of large objects. The object is split into segments, the segments are uploaded, and a special manifest file is created that, when downloaded, sends all the segments concatenated as a single object.
By default, individual objects up to 5 GB in size can be uploaded to OpenStack Object Storage. By splitting a larger object into segments, the size of a single downloadable object becomes virtually unlimited. Smaller segments can also be uploaded in parallel, which can improve upload speed.
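The idea of splitting and reassembling can be tried locally before involving Object Storage at all. The following is a minimal sketch that uses only standard tools (seq, split, cat, and cmp); the scratch directory, file names, and the 200000-byte segment size are assumptions made for illustration and have nothing to do with how swift names or sizes its segments.

```shell
#!/bin/sh
# Local illustration only: no Swift server or swift client is involved.
set -eu

workdir=$(mktemp -d)                 # hypothetical scratch directory
original="$workdir/original.txt"

# Create a small stand-in for the "large" object with non-repeating content.
seq 1 100000 > "$original"

# Split it into segments of at most 200000 bytes each.
# split names the pieces segment_aa, segment_ab, ... so they sort in order.
split -b 200000 "$original" "$workdir/segment_"

# Count the segments that were produced.
set -- "$workdir"/segment_*
segment_count=$#

# Concatenate the segments in name order, much as a manifest download would.
cat "$workdir"/segment_* > "$workdir/rebuilt.txt"

# Compare the rebuilt file with the original, byte for byte.
if cmp -s "$original" "$workdir/rebuilt.txt"; then
    result="identical"
else
    result="different"
fi

echo "$segment_count segments, rebuilt file is $result"
rm -rf "$workdir"
```

If the rebuilt file is reported as identical, the segments carry the complete content of the original, which is the property that segmented uploads rely on.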
Log in to a computer or server that has the swift client package installed.
How to do it…
Carry out the following steps to upload large objects, split into smaller segments:
- Create a 1 GB file under /tmp as an example file to upload. We do this as follows:
dd if=/dev/zero of=/tmp/example-1Gb bs=1M count=1024
- Rather than uploading this file as a single object, we will use segmenting to split it into smaller chunks, in this case segments of roughly 100 MB (102400000 bytes). To do this, we specify the segment size in bytes with the -S option as follows:
swift -V 2.0 -A http://172.16.0.200:5000/v2.0/ \
    -U cookbook:demo -K openstack upload test \
    -S 102400000 /tmp/example-1Gb
The command reports the status of each segment as it is uploaded.
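To know how many segments to expect, we can work out the arithmetic ourselves. The following sketch uses the sizes from the steps above (1024 blocks of 1 MiB from dd, and the 102400000-byte value passed to -S); the variable names are our own and are not part of the swift client.

```shell
#!/bin/sh
# Sizes taken from the steps above: dd writes 1024 blocks of 1 MiB,
# and the segment size passed to -S is 102400000 bytes.
file_bytes=$((1024 * 1024 * 1024))
segment_bytes=102400000

# Ceiling division: number of segments needed to cover the whole file.
segment_count=$(( (file_bytes + segment_bytes - 1) / segment_bytes ))

# Size of the final segment (equal to segment_bytes if the split is exact).
last_segment_bytes=$(( file_bytes - (segment_count - 1) * segment_bytes ))

echo "segments: $segment_count"
echo "last segment: $last_segment_bytes bytes"
```

With these values, the file needs 11 segments: ten full segments of 102400000 bytes and a final, shorter segment of 49741824 bytes.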
How it works…
OpenStack Object Storage is very good at storing and retrieving large objects. To do this efficiently, it lets us split a large object into smaller segment objects while maintaining the relationship between the segments and the object that appears as a single file. This allows us to upload the segments in parallel, rather than stream a single large file. To achieve this, we use the following syntax:
swift -V 2.0 -A http://keystone_server:5000/v2.0 \
    -U tenant:user -K password upload container \
    -S bytes_to_split large_file
Now, when we list the containers under our account, we see an extra container named test_segments, which holds the actual segmented data for our file. Our test container presents the large object as a single object. Behind the scenes, the metadata of this single object refers to the individual segment objects in test_segments, which are pulled back and concatenated to reconstruct the large object.
swift -V 2.0 -A http://172.16.0.200:5000/v2.0/ \
    -U cookbook:demo -K openstack list
When the preceding command is executed, the output lists both the test and test_segments containers.
Now, execute the following command:
swift -V 2.0 -A http://172.16.0.200:5000/v2.0/ \
    -U cookbook:demo -K openstack list test
The output shows our large file as a single object in the test container.
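The relationship between the two containers can be pictured with a small local analogy. In the sketch below, directories stand in for the test and test_segments containers, and a one-line text file stands in for the manifest. We assume a manifest that simply names a container and prefix, with segments joined in name order; the paths, the example prefix, and the zero-padded segment names are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Local analogy only: directories stand in for containers,
# and nothing here talks to Object Storage.
set -eu

root=$(mktemp -d)                    # hypothetical scratch directory
mkdir -p "$root/test" "$root/test_segments/example"

# Three hypothetical segments, with zero-padded names so they sort correctly.
printf 'first-'  > "$root/test_segments/example/00000000"
printf 'second-' > "$root/test_segments/example/00000001"
printf 'third'   > "$root/test_segments/example/00000002"

# The "manifest" holds no data, only the container and prefix of its segments.
printf '%s\n' 'test_segments/example/' > "$root/test/example.manifest"

# "Downloading" the object: read the prefix from the manifest, then
# concatenate every matching segment in name order.
prefix=$(cat "$root/test/example.manifest")
rebuilt=$(cat "$root/$prefix"*)

echo "$rebuilt"
rm -rf "$root"
```

The stand-in for the test container holds only the small manifest, while all of the data lives under the stand-in for test_segments, which mirrors what the two listings above show.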