Amazon S3 Integration
Amazon provides one of the best cloud storage systems on the planet through its popular S3 service.
S3 is an unlimited cloud-based file storage system that lets you upload and download files securely, while also allowing public access to files via HTTP and BitTorrent.
You can easily read and write files on the Amazon S3 system from within CFML using the existing FileXXX() functions, or alternatively you can use the optimized direct access functions listed at the bottom of this page.
When you sign up for Amazon Web Services, you are given two pieces of information that let you interact with all the web services Amazon provides: the AmazonID and the AmazonSecretKey.
When working with Amazon S3 within OpenBD, there are two ways you can address your files on S3. You can specify the accesskey and secretkey in the full URL of the S3 object, or you can register an Amazon datasource and use that symbolic name.
The format of an S3 URL is:
s3://<amazonkey>:<amazonsecretkey>@<s3 bucket>/<file path uri>
s3://@<amazondatasource>/<s3 bucket>/<file path uri>
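For example, a quick read of the same object using each addressing style (a minimal sketch; the credentials are placeholders and the "amz" datasource is assumed to have been registered as described below):

<!--- Credentials embedded directly in the URL --->
<cfset imgFile = FileReadBinary( "s3://--amazonkey--:--amazonsecretkey--@mybucket/dir1/args.jpg" )>

<!--- The same object addressed via a registered datasource named "amz" --->
<cfset imgFile = FileReadBinary( "s3://@amz/mybucket/dir1/args.jpg" )>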
To register an Amazon datasource you simply make a call to the function AmazonRegisterDataSource(). You only need to register an Amazon datasource once for the lifetime of the server. If you wish to remove a previously registered Amazon datasource, use AmazonRemoveDataSource().
<cfset AmazonRegisterDataSource( "myamz", "--amazonkey--", "--amazonsecretkey ----" )>
Uploading a file to Amazon S3
Uploading a file to Amazon S3 is the same as if you were copying it from one location to another, but this time you use the S3 URL as the destination.
<cfscript> AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey ----" ); imgFile = FileReadBinary("e:\\tmp\\args.jpg"); FileWrite( "s3://@amz/mybucket/dir1/args.jpg", imgFile ); </cfscript>
Alternatively, you may wish to use AmazonS3Write for sending files, as this is a more efficient mechanism, particularly for files that are very large.
Using AmazonS3Write you get the ability to add custom metadata and specify the storage class for the object. Amazon offers a cheaper alternative to its standard S3 storage if you feel your object doesn't need the full redundancy S3 has to offer.
<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );
metadata = { userid : 42, homedir : "my/data/" };

// Standard storage class
AmazonS3Write( "amz", "mybucket", "e:\tmp\args.jpg", "/dir1/args.jpg", metadata );

// Reduced cost storage class
AmazonS3Write( "amz", "mybucket", "e:\tmp\args.jpg", "/dir1/args.jpg", metadata, "REDUCED_REDUNDANCY" );
</cfscript>
You can retrieve your metadata from a given object using AmazonS3GetInfo().
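For example, a minimal sketch of reading back the metadata written above (assuming AmazonS3GetInfo() takes the datasource, bucket and key; the exact headers returned depend on the object):

<cfscript>
// Returns a structure of the object's headers, including the custom metadata
info = AmazonS3GetInfo( "amz", "mybucket", "/dir1/args.jpg" );
</cfscript>
<cfdump var="#info#">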
Uploading a file to Amazon S3 in the background
In addition, AmazonS3Write will manage retries when sending the file to Amazon, and can also run the operation in the background: the function returns quickly, and once the file is uploaded (or has failed after the given number of retries) a CFC is called with the details.
The following example will upload the file 'largeFileToUpload.txt' in the background, attempting up to 3 times, with 10 seconds between each retry. If the upload succeeds, the file will be deleted from the file system; if it fails, the file will remain on the file system. The CFC 'callbackcfc.cfc' will be loaded and its method 'onAmazonS3Write()' will be called. The CFC stub can be seen below.
<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );
AmazonS3Write(
	datasource = "amz",
	bucket = "mybucket",
	file = "/tmp/largeFileToUpload.txt",
	key = "/largeFileToUpload.txt",
	background = true,
	retry = 3,
	retrywaitseconds = 10,
	deletefile = true,
	callback = "callbackcfc",
	callbackdata = "ExtraDataToPassToCallbackCFC"
);
</cfscript>
The CFC callback stub looks like:
<cfcomponent>
	<cffunction name="onAmazonS3Write">
		<cfargument name="file" type="string">
		<cfargument name="success" type="boolean">
		<cfargument name="callbackdata" type="string">
		<cfargument name="error" type="string">
		<!--- do something --->
	</cffunction>
</cfcomponent>
A new instance of the CFC will be created for each callback, with the application scope available for the same application that originated the AmazonS3Write() function call.
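A hypothetical implementation of the stub, logging the outcome of each upload (the log file name here is an assumption):

<cfcomponent>
	<cffunction name="onAmazonS3Write">
		<cfargument name="file" type="string">
		<cfargument name="success" type="boolean">
		<cfargument name="callbackdata" type="string">
		<cfargument name="error" type="string">
		<cfif arguments.success>
			<cflog file="s3uploads" text="Uploaded #arguments.file# (#arguments.callbackdata#)">
		<cfelse>
			<cflog file="s3uploads" text="Failed to upload #arguments.file#: #arguments.error#">
		</cfif>
	</cffunction>
</cfcomponent>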
Downloading a file from Amazon S3
Downloading a file from Amazon S3 is just the same as copying it; you simply switch the parameters around.
<cfset AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey ----" )> <cfset imgFile = FileReadBinary("s3://@amz/mybucket/dir1/args.jpg")> <cfset FileWrite( "e:\\tmp\\args.jpg", imgFile )>
Alternatively, you may wish to use AmazonS3Read for receiving files, as this is a more efficient mechanism, particularly for files that are very large.
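A minimal sketch, assuming parameter names mirroring AmazonS3Write():

<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );
// Copy the remote object straight down to the local file system
AmazonS3Read( datasource="amz", bucket="mybucket", key="/dir1/args.jpg", file="e:\tmp\args.jpg" );
</cfscript>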
Eucalyptus Walrus Operations
Eucalyptus is an open source cloud platform that supports, amongst other things, a full Amazon S3 clone.
OpenBD can operate with a Eucalyptus installation by specifying the local endpoint when registering your Amazon datasource. After that, all the AmazonS3 functions operate as normal.
<cfset AmazonRegisterDataSource( "mywalrus", "--walruskey--", "--walrusecretkey ----", "---walrus server--" )>
For more information on setting up your own Amazon S3 installation see Eucalyptus Storage.
Amazon S3 Specific functions
The following functions let you operate with all of the services provided by Amazon S3.
Function Name | Description |
---|---|
AmazonS3Delete | Deletes the remote file |
AmazonS3GetUrl | Returns a signed URL that gives public access to a given file, with an optional expiration date |
AmazonS3GetInfo | Returns a structure detailing all the headers of a given remote object |
AmazonS3List | Returns all the keys for the given bucket |
AmazonS3ListBuckets | Returns all the buckets for this account |
AmazonS3Read | Copies the remote file from Amazon S3 to the local file system |
AmazonS3Rename | Renames the remote file |
AmazonS3SetAcl | Sets the ACL on the given object |
AmazonS3Write | Copies the local file up to Amazon S3 |
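For example, AmazonS3GetUrl() can be used to hand out temporary public links to otherwise private objects (a sketch; the parameter order and the expiration argument are assumptions):

<cfscript>
// Signed URL that remains valid for one hour (hypothetical parameter order)
publicUrl = AmazonS3GetUrl( "amz", "mybucket", "/dir1/args.jpg", DateAdd( "h", 1, Now() ) );
WriteOutput( publicUrl );
</cfscript>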
Operational notes on Amazon S3
Working with Amazon S3 is fairly straightforward, but there are some operational constraints you must be aware of.
- S3 has no real concept of directories; it stores files against a single key. Therefore operations that work on a directory do not work, with the exception of DirectoryList()
- Using AmazonS3List(), any returned key that ends with a slash (/) is considered a common prefix, or sub-directory (see the sketch after this list)
- Maximum size of a given file is 5GB
- By default, files uploaded to S3 are marked private and are not publicly accessible. Use AmazonS3SetAcl() to change this
- Buckets are limited to 100 per account; Create them using your Amazon AWS console
- S3 access is not available using CFFILE and CFDIRECTORY
- You can easily work with multiple Amazon S3 accounts, as well as Eucalyptus Walrus accounts, with OpenBD
- When working with objects, you do not need to start your key with "/"; doing so only tells Amazon that you want // as a key marker
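A minimal sketch of walking the keys of a bucket and spotting common prefixes, assuming AmazonS3List() takes the datasource and bucket and returns an array of key names (the exact return shape is an assumption):

<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );
keys = AmazonS3List( "amz", "mybucket" );
for ( i = 1; i lte ArrayLen( keys ); i = i + 1 ) {
	if ( Right( keys[i], 1 ) eq "/" ) {
		// Trailing slash: treat this key as a common prefix (sub-directory)
		WriteOutput( "dir: " & keys[i] & "<br>" );
	} else {
		WriteOutput( "file: " & keys[i] & "<br>" );
	}
}
</cfscript>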