Amazon S3 Integration

Amazon provides one of the most widely used cloud storage services available through its S3 offering. S3 is a virtually unlimited cloud-based file storage system that lets you upload and download files securely, while also allowing public access to files via HTTP and BitTorrent.

You can easily read and write files on Amazon S3 from within CFML using the existing FileXXX() functions, or you can use the optimized direct-access functions listed at the bottom of this page.

When you sign up for Amazon Web Services, you are given two pieces of information that let you interact with all the web services Amazon provides: the AmazonID and the AmazonSecretKey.

When working with Amazon S3 within OpenBD, there are two ways to address your files on S3: you can specify the access key and secret key in the full URL of the S3 object, or you can register an Amazon datasource and refer to it by its symbolic name.

The format of an S3 URL is:

s3://<amazonkey>@<secretkey>/<s3 bucket>/<file path uri>
s3://@<amazondatasource>/<s3 bucket>/<file path uri>

To register an Amazon datasource you simply make a call to the function AmazonRegisterDataSource(). You only need to register an Amazon datasource once for the lifetime of the server. If you wish to remove a previously registered Amazon datasource, use AmazonRemoveDataSource().

<cfset AmazonRegisterDataSource( "myamz", "--amazonkey--", "--amazonsecretkey ----" )>

Uploading a file to Amazon S3

Uploading a file to Amazon S3 is the same as copying it from one location to another, except that you use the S3 URL as the destination.

<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );

// Read the local file and write it out to the S3 bucket via the datasource URL
imgFile = FileReadBinary( "e:\tmp\args.jpg" );
FileWrite( "s3://@amz/mybucket/dir1/args.jpg", imgFile );
</cfscript>

Alternatively, you may wish to use AmazonS3Write to send files, as this is a more efficient mechanism, particularly for very large files.

AmazonS3Write also lets you attach custom metadata and specify the storage class for the object. Amazon offers a cheaper storage class on S3 for objects that do not need the full redundancy S3 normally provides.

<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );

metadata = {
  userid : 42,
  homedir : "my/data/"
  };

// Standard storage class
AmazonS3Write( "amz", "mybucket", "e:\tmp\args.jpg", "/dir1/args.jpg", metadata );

// Reduced-redundancy storage class (cheaper, less durable)
AmazonS3Write( "amz", "mybucket", "e:\tmp\args.jpg", "/dir1/args.jpg", metadata, "REDUCED_REDUNDANCY" );
</cfscript>

You can retrieve the metadata back from a given object using AmazonS3GetInfo().
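
A minimal sketch, assuming AmazonS3GetInfo() takes the same datasource, bucket and key parameters as AmazonS3Write(); it would inspect the object written above like so:

<cfset info = AmazonS3GetInfo( "amz", "mybucket", "/dir1/args.jpg" )>
<cfdump var="#info#">

The returned structure contains the headers of the remote object along with any custom metadata you attached at upload time.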

Uploading a file to Amazon S3 in the background

In addition, AmazonS3Write will manage the retries of sending the file to Amazon, and it can also run the operation in the background: the function returns immediately, and once the file is uploaded (or has failed after the given number of retries) a CFC is called with the details.

The following example uploads the file 'largeFileToUpload.txt' in the background, attempting the upload up to 3 times, with 10 seconds between each retry. If the upload succeeds, the file is deleted from the file system; if it does not, the file remains on the file system. The CFC 'callbackcfc.cfc' is then loaded and its method 'onAmazonS3Write()' is called. The CFC stub can be seen below.

<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );

AmazonS3Write(
		datasource="amz",
		bucket="mybucket",
		file="/tmp/largeFileToUpload.txt",
		key="/largeFileToUpload.txt",
		background=true,
		retry=3,
		retrywaitseconds=10,
		deletefile=true,
		callback="callbackcfc",
		callbackdata="ExtraDataToPassToCallbackCFC"
		);
</cfscript>

The CFC callback stub looks like:

<cfcomponent>

	<cffunction name="onAmazonS3Write">
		<cfargument name="file" type="string">
		<cfargument name="success" type="boolean">
		<cfargument name="callbackdata" type="string">
		<cfargument name="error" type="string">
		
		<!--- do something --->
	</cffunction>
	
</cfcomponent>

A new instance of the CFC is created for each callback, and it has access to the application scope of the application that originated the AmazonS3Write() function call.

Downloading a file from Amazon S3

Downloading a file from Amazon S3 is just the same as copying it; you simply switch the parameters around.

<cfset AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey ----" )>

<cfset imgFile = FileReadBinary("s3://@amz/mybucket/dir1/args.jpg")>
<cfset FileWrite( "e:\\tmp\\args.jpg", imgFile )>

Alternatively, you may wish to use AmazonS3Read to retrieve files, as this is a more efficient mechanism, particularly for very large files.
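
A minimal sketch, assuming AmazonS3Read() mirrors the datasource, bucket, key and local-file parameters that AmazonS3Write() uses:

<cfscript>
AmazonRegisterDataSource( "amz", "--amazonkey--", "--amazonsecretkey--" );

// Copy the remote object straight down to the local file system;
// the parameter names here are an assumption modelled on AmazonS3Write()
AmazonS3Read(
		datasource="amz",
		bucket="mybucket",
		key="/dir1/args.jpg",
		file="e:\tmp\args.jpg"
		);
</cfscript>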

Eucalyptus Walrus Operations

Eucalyptus is an open source cloud platform that supports, amongst other things, a full Amazon S3 clone called Walrus.

OpenBD can operate with a Eucalyptus installation by specifying the local endpoint when registering your Amazon datasource. After that, all the AmazonS3 functions operate as normal.

<cfset AmazonRegisterDataSource( "mywalrus", "--walruskey--", "--walrusecretkey ----", "---walrus server--" )>

For more information on setting up your own Amazon S3 installation see Eucalyptus Storage.

Amazon S3 Specific functions

The following functions let you work with all of the services provided by Amazon S3.

Function Name        Description
AmazonS3Delete       Deletes the remote file
AmazonS3GetUrl       Returns a signed URL that gives public access to a given file, with an optional expiration date
AmazonS3GetInfo      Returns a structure detailing all the headers of a given remote object
AmazonS3List         Returns all the keys for the given bucket
AmazonS3ListBuckets  Returns all the buckets for the account
AmazonS3Read         Copies the remote file from Amazon S3 to the local file system
AmazonS3Rename       Renames the remote file
AmazonS3SetAcl       Sets the ACL on the given object
AmazonS3Write        Copies the local file up to Amazon S3
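
For example, AmazonS3GetUrl() can hand out a time-limited link to an otherwise private object. A minimal sketch, assuming it takes the datasource, bucket and key in the same order as the other functions, followed by the optional expiration date:

<cfscript>
// Signed URL that expires one day from now; the parameter order is an assumption
signedUrl = AmazonS3GetUrl( "amz", "mybucket", "/dir1/args.jpg", DateAdd( "d", 1, Now() ) );
WriteOutput( signedUrl );
</cfscript>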

Operational notes on Amazon S3

Working with Amazon S3 is fairly straightforward, but there are some operational constraints you must be aware of.

  • S3 has no real concept of directories; it stores files against a single key. Therefore directory-based operations do not work, with the exception of DirectoryList()
  • With AmazonS3List(), any returned key ending in a slash (/) is considered a common prefix, or sub-directory
  • The maximum size of a single file is 5GB
  • By default, files uploaded to S3 are marked private and are not publicly accessible; use AmazonS3SetAcl() to change this (see the sketch after this list)
  • Buckets are limited to 100 per account; create them using your Amazon AWS console
  • S3 access is not available using CFFILE and CFDIRECTORY
  • You can easily work with multiple Amazon S3 accounts, as well as Eucalyptus Walrus accounts, within OpenBD
  • When working with objects, you do not need to start your key with "/"; doing so only tells Amazon that you want // as a key marker
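
As noted in the list above, uploads default to private. A minimal sketch of opening one up for anonymous reads, assuming AmazonS3SetAcl() takes the datasource, bucket and key in the usual order, followed by one of Amazon's canned ACL names:

<cfscript>
// Grant anonymous read access; "public-read" is Amazon's canned ACL name,
// and the parameter order here is an assumption
AmazonS3SetAcl( "amz", "mybucket", "/dir1/args.jpg", "public-read" );
</cfscript>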