Amazon S3 deployment w/ JetS3t and Maven
Although I spent some time last year playing around with AWS and running a number of instances in the cloud, I didn’t pay much attention to Amazon S3 (Simple Storage Service), their offering for storing and distributing files in the cloud. All that changed recently when I came across an issue with Google App Engine and a third party library I was using.
The issue was that GAE limits applications deployed to the platform to 3,000 files. The third party library in question was Smart GWT, and after the GWT compilation step there were many more than 3,000 files in the web application, causing the application deployment to fail. Unfortunately there’s no way around this on GAE except to host the files elsewhere, with Amazon S3 being the obvious candidate.
I had been using Maven for build and deployment and needed to integrate deployment to S3 into the same build process; this post shares how that was done. The S3 service provides a straightforward REST-based API to manage deployments, and there are a growing number of tools available providing a layer of functionality on top of the service. One need only search for “amazon s3 deployment tools” on Google to see the vast number available as browser plugins, Windows Explorer extensions and command line utilities.
I came across an excellent suite of utilities called JetS3t that can be used to manage your S3 deployments. The reasons I chose JetS3t over other comparable toolsets are:
- Good documentation and elegant design
- Active community and support for the tools and ongoing development
- Java based
- Libraries already available in a Maven repository for dependency resolution
The following snippets of code present the different areas in the pom.xml that need to be modified to support the integration of JetS3t into the build and deployment process:
Configuration Properties
These properties are not required to be here, but are placed here for convenience and to separate the necessary configuration from the other parts of the pom.xml.
<properties>
  <aws.s3.bucket.name>bucket-name</aws.s3.bucket.name>
  <aws.access-key>access-key</aws.access-key>
  <aws.secret-key>secret-key</aws.secret-key>
</properties>
The bucket name can be retrieved from the CloudFront area of the AWS management console, and the access key and secret key from the Account > Security Credentials area.
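Since the secret key is sensitive, one option is to keep the two key properties out of the pom.xml (and out of source control) and define them in the per-user Maven settings file instead. The following is only a sketch of that idea: the profile id aws-credentials is a made-up name, and the values are the same placeholders as above. Properties from an active settings.xml profile can be referenced in the pom.xml in the same way as properties declared in the pom.xml itself.

```xml
<!-- ~/.m2/settings.xml: per-user file, not checked in with the project -->
<settings>
  <profiles>
    <profile>
      <!-- "aws-credentials" is a hypothetical profile id, pick any name -->
      <id>aws-credentials</id>
      <properties>
        <!-- placeholders, replace with the real keys -->
        <aws.access-key>access-key</aws.access-key>
        <aws.secret-key>secret-key</aws.secret-key>
      </properties>
    </profile>
  </profiles>
  <!-- activate the profile for every build run by this user -->
  <activeProfiles>
    <activeProfile>aws-credentials</activeProfile>
  </activeProfiles>
</settings>
```

With this in place, only aws.s3.bucket.name would need to remain in the properties section of the pom.xml.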
JetS3t Dependencies
The following snippets need to be added to the repositories and dependencies sections in your pom.xml respectively.
<repositories>
  <repository>
    <id>repo1</id>
    <name>Maven Repo1</name>
    <url>http://repo1.maven.org/maven2/</url>
  </repository>
  <!-- the following is only required if the jets3t dependencies cannot be found in the central -->
  <repository>
    <name>jets3t</name>
    <id>jets3t</id>
    <url>http://jets3t.s3.amazonaws.com/maven2</url>
  </repository>
</repositories>
<dependencies>
  <!-- aws s3 mgmt client -->
  <dependency>
    <groupId>net.java.dev.jets3t</groupId>
    <artifactId>jets3t</artifactId>
    <version>0.7.2</version>
  </dependency>
  <!-- synchronize for s3 client -->
  <dependency>
    <groupId>net.java.dev.jets3t</groupId>
    <artifactId>synchronize</artifactId>
    <version>0.7.2</version>
  </dependency>
</dependencies>
Executable Invocation
The Maven exec plugin, configured in the plugins section, is used to execute the main Java class, Synchronize, which performs the actual synchronization.
<!-- plugin for amazon s3 synchronization -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.1</version>
  <executions>
    <execution>
      <phase>deploy</phase>
      <goals>
        <goal>java</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <includeProjectDependencies>true</includeProjectDependencies>
    <includePluginDependencies>true</includePluginDependencies>
    <classpathScope>runtime</classpathScope>
    <executableDependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>jets3t</artifactId>
    </executableDependency>
    <!-- main class to invoke synchronization -->
    <mainClass>org.jets3t.apps.synchronize.Synchronize</mainClass>
    <arguments>
      <argument>--properties</argument>
      <argument>${project.build.directory}/${build.finalName}/WEB-INF/classes/aws.s3.synchronize.properties</argument>
      <argument>UP</argument>
      <argument>${aws.s3.bucket.name}</argument>
      <argument>${project.build.directory}/${build.finalName}/path-to-synchronize/</argument>
    </arguments>
  </configuration>
  <dependencies>
    <dependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>jets3t</artifactId>
      <version>0.7.2</version>
      <type>jar</type>
    </dependency>
  </dependencies>
</plugin>
The areas of specific interest in the XML above are the phase, the goal, the main class and the arguments. Specifically, the invocation of the plugin is bound to the deploy phase with the java goal. Alternatively, the synchronization can be invoked manually using the command “mvn exec:java”. The main class to invoke is org.jets3t.apps.synchronize.Synchronize and is available in the jets3t dependency declared in the dependencies section. The parameters passed to it include the properties file dictating the properties to be used during synchronization, the direction of the synchronization (UP), the name of the bucket to upload to and the local path to synchronize.
The properties file mentioned above, aws.s3.synchronize.properties, is a copy of the property file available with the JetS3t distribution and specifically contains the following properties:
# AWS Access Key (if commented-out, Synchronize will ask at the prompt)
accesskey=${aws.access-key}
# AWS Secret Key (if commented-out, Synchronize will ask at the prompt)
secretkey=${aws.secret-key}
Note that these properties were previously defined in the properties section and will be replaced with the actual values during the Maven process-resources phase.
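The replacement of ${aws.access-key} and ${aws.secret-key} relies on Maven resource filtering, which is switched off by default. The following is a minimal sketch of how it could be enabled, assuming the properties file is kept in src/main/resources (the post does not say where the file lives, so treat that location as an assumption). Filtering is limited to the one properties file so that other resources are copied untouched.

```xml
<build>
  <resources>
    <!-- filter only the synchronize properties file so the ${...} placeholders are replaced -->
    <resource>
      <directory>src/main/resources</directory>
      <filtering>true</filtering>
      <includes>
        <include>aws.s3.synchronize.properties</include>
      </includes>
    </resource>
    <!-- copy every other resource as-is, without filtering -->
    <resource>
      <directory>src/main/resources</directory>
      <filtering>false</filtering>
      <excludes>
        <exclude>aws.s3.synchronize.properties</exclude>
      </excludes>
    </resource>
  </resources>
</build>
```

If the project already declares a resources section, the filtering element would need to be merged into it rather than added as a second build block.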
Loose Ends
Ignoring Files
As in my situation, you may be storing the files to be synchronized under source control, since they form part of your code base and you want a revision history of them. If you are using a source control system such as SVN or CVS, where hidden files and directories are stored in the same location as the files you want to synchronize, you need to tell JetS3t to ignore these files. This is possible using the “.jets3t-ignore” file, which can be added to the root of the directory being synchronized to specify the files to ignore during the synchronization process. In the case of SVN the file might contain the following entries:
.svn
*.svn
**.svn
**/.svn
Mime Types File
If you’ve gotten this far, you will notice that running the synchronize command (mvn exec:java) produces an error referring to a missing mime.types file on the classpath. The error appears to be harmless for the synchronization described here, but it may become a longer term issue if you are using other functionality. To resolve the issue, the mime.types file from the JetS3t distribution can be copied to the src/main/resources folder so that it is also copied to the correct classpath location during the Maven process-resources phase.
CloudFront Delay
If, in addition to the S3 bucket, you are using a CloudFront distribution on the Amazon CDN (Content Delivery Network), there will be a delay between synchronizing your files and when they become available in the actual CloudFront distribution. This is worth noting in case you spend a lot of time debugging issues with things not being synchronized properly. To get around this in a DEV or QA environment, it should be sufficient to use the S3 origin bucket domain name rather than the CloudFront domain name.
That didn’t work for us, working inside a profile. Inlining the configuration into the <execution> did the job. On the plus side, there’s no need for project-level dependencies. (Sidenote: as a slight functional difference, the snippet below does the uploading task during the install phase, not deploy.)
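<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>install2s3</id>
      <phase>install</phase>
      <goals>
        <goal>java</goal>
      </goals>
      <configuration>
        <mainClass>org.jets3t.apps.synchronize.Synchronize</mainClass>
        <includeProjectDependencies>false</includeProjectDependencies>
        <includePluginDependencies>true</includePluginDependencies>
        <executableDependency>
          <groupId>net.java.dev.jets3t</groupId>
          <artifactId>synchronize</artifactId>
        </executableDependency>
        <arguments>
          <argument>--properties</argument>
          <argument>${path.to.aws.properties}</argument>
          <argument>UP</argument>
          <argument>${aws.bucket.and.path}</argument>
          <argument>${whatever.to.upload}</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>jets3t</artifactId>
      <version>0.8.0</version>
    </dependency>
    <dependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>synchronize</artifactId>
      <version>0.8.0</version>
    </dependency>
  </dependencies>
</plugin>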
This can be used from inside a profile, “as is”
Darn, wordpress messed the xml pretty good. 😦
thanks for the update when using profiles