the bloggard

Amazon S3 deployment w/ JetS3t and Maven

Posted in Technology by conorpower on March 16, 2010

While playing around with Amazon AWS last year and running a number of instances on the cloud, I didn’t pay much attention to its file storage offering, Amazon S3 (Simple Storage Service), for storing and distributing files on the cloud. That changed recently when I ran into an issue with Google App Engine and a third-party library I was using.

The issue was that GAE limits applications deployed to the platform to 3,000 files. The third-party library was Smart GWT, and after the GWT compilation step the web application contained far more than 3,000 files, causing the deployment to fail. Unfortunately there’s no way around this on GAE except to host the files elsewhere, with Amazon S3 being the obvious candidate.

I had been using Maven for build and deployment and needed to integrate deployment to S3 into the same build process; this post shares how that was done. The S3 service provides a straightforward REST-based API to manage deployments, and a growing number of tools provide a layer of functionality on top of the service. One need only search for “amazon s3 deployment tools” on Google to see the vast number available as browser plugins, Windows Explorer extensions and command-line utilities.

I came across an excellent suite of utilities called JetS3t that can be used to manage your S3 deployments …  My reasons for choosing JetS3t over other comparable toolsets are:

  1. Good documentation and elegant design
  2. Active community and support for the tools and ongoing development
  3. Java based
  4. Libraries already available in a Maven repository for dependency resolution

The following snippets of code present the different areas in the pom.xml that need to be modified to support the integration of JetS3t into the build and deployment process:

Configuration Properties

These properties do not have to be defined here; they are placed here for convenience and to separate the necessary configuration from the other parts of the pom.xml.

<properties>
  <aws.s3.bucket.name>bucket-name</aws.s3.bucket.name>
  <aws.access-key>access-key</aws.access-key>
  <aws.secret-key>secret-key</aws.secret-key>
</properties>

The bucket name, access key and secret key can all be retrieved from the AWS Management Console, under the CloudFront area and the Account > Security Credentials area respectively.
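
Since these properties do not have to live in the pom.xml, one option is to keep the two keys out of source control by defining them in the user-level Maven settings file instead. The following is only a sketch under that assumption: the profile id aws-credentials is a hypothetical name, and the values are the same placeholders used above.

```xml
<!-- ~/.m2/settings.xml (sketch; the profile id "aws-credentials" is hypothetical) -->
<settings>
  <profiles>
    <profile>
      <id>aws-credentials</id>
      <properties>
        <!-- same property names as in the pom.xml, so nothing else changes -->
        <aws.access-key>access-key</aws.access-key>
        <aws.secret-key>secret-key</aws.secret-key>
      </properties>
    </profile>
  </profiles>
  <!-- activate the profile for every build run by this user -->
  <activeProfiles>
    <activeProfile>aws-credentials</activeProfile>
  </activeProfiles>
</settings>
```

With this in place the two key properties could be dropped from the pom.xml, leaving only the bucket name there.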

JetS3t Dependencies

The following snippets need to be added to the repositories and dependencies sections in your pom.xml respectively.

<repositories>
  <repository>
    <id>repo1</id>
    <name>Maven Repo1</name>
    <url>http://repo1.maven.org/maven2/</url>
  </repository>
  <!-- the following is only required if the jets3t dependencies cannot be found in Maven Central -->
  <repository>
    <name>jets3t</name>
    <id>jets3t</id>
    <url>http://jets3t.s3.amazonaws.com/maven2</url>
  </repository>
</repositories>

<dependencies>
  <!-- aws s3 mgmt client -->
  <dependency>
    <groupId>net.java.dev.jets3t</groupId>
    <artifactId>jets3t</artifactId>
    <version>0.7.2</version>
  </dependency>
  <!-- synchronize for s3 client -->
  <dependency>
    <groupId>net.java.dev.jets3t</groupId>
    <artifactId>synchronize</artifactId>
    <version>0.7.2</version>
  </dependency>
</dependencies>

Executable Invocation

The Maven exec plugin, configured in the plugins section, executes the Synchronize main class, which performs the actual synchronization.

<!-- plugin for amazon s3 synchronization -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.1</version>
  <executions>
    <execution>
      <phase>deploy</phase>
      <goals>
        <goal>java</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <includeProjectDependencies>true</includeProjectDependencies>
    <includePluginDependencies>true</includePluginDependencies>
    <classpathScope>runtime</classpathScope>
    <executableDependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>jets3t</artifactId>
    </executableDependency>
    <!-- main class to invoke synchronization -->
    <mainClass>org.jets3t.apps.synchronize.Synchronize</mainClass>
    <arguments>
      <argument>--properties</argument>
      <argument>${project.build.directory}/${build.finalName}/WEB-INF/classes/aws.s3.synchronize.properties</argument>
      <argument>UP</argument>
      <argument>${aws.s3.bucket.name}</argument>
      <argument>${project.build.directory}/${build.finalName}/path-to-synchronize/</argument>
    </arguments>
  </configuration>
  <dependencies>
    <dependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>jets3t</artifactId>
      <version>0.7.2</version>
      <type>jar</type>
    </dependency>
  </dependencies>
</plugin>

The areas of specific interest in the XML above are the phase, the goal, the main class and the arguments. The plugin invocation is bound to the deploy phase with the java goal; alternatively, the synchronization can be invoked manually with the command “mvn exec:java”. The main class to invoke is org.jets3t.apps.synchronize.Synchronize, which is available in the jets3t dependency declared in the dependencies section. The arguments passed to it are the properties file that configures the synchronization, the UP action, the name of the bucket to upload to and the local path to synchronize.

The properties file mentioned above, aws.s3.synchronize.properties, is a copy of the properties file available with the JetS3t distribution and specifically contains the following properties:

# AWS Access Key (if commented-out, Synchronize will ask at the prompt)
accesskey=${aws.access-key}
# AWS Secret Key (if commented-out, Synchronize will ask at the prompt)
secretkey=${aws.secret-key}

Note that these properties were previously defined in the properties section and will be replaced with the actual values during the Maven process-resources phase, provided resource filtering is enabled for the directory that holds the file.
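
Maven only substitutes ${...} placeholders in resources when filtering is switched on for the resource directory. A minimal sketch of that setting for the build section of the pom.xml, assuming the properties file sits in the default src/main/resources folder:

```xml
<build>
  <resources>
    <resource>
      <!-- assumed location of aws.s3.synchronize.properties -->
      <directory>src/main/resources</directory>
      <!-- replace ${aws.access-key} and ${aws.secret-key} with their values -->
      <filtering>true</filtering>
    </resource>
  </resources>
</build>
```

Filtering applies to every file in the directory, so any other resource that contains a literal ${...} sequence would be rewritten as well.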

Loose Ends

Ignoring Files

As in my situation, you may be storing the files to be synchronized under source control because they form part of your code base and you want a revision history of them. If you are using a source control system such as SVN or CVS, where hidden files and directories are stored in the same location as the files you want to synchronize, you need to tell JetS3t to ignore these files. This is possible using the “.jets3t-ignore” file, which can be added to the root of the directory being synchronized to specify the files to ignore during the synchronization process. In the case of SVN the file might contain the following entries:

.svn
*.svn
**.svn
**/.svn

Mime Types File

If you’ve gotten this far, you will notice that running the synchronize command (mvn exec:java) produces an error referring to a missing mime.types file on the classpath. The error seems to be innocuous for the synchronization described here, but it may be a longer-term issue if you are using other functionality. To resolve it, copy the mime.types file from the JetS3t distribution to the src/main/resources folder so that it is also copied to the correct classpath location during the Maven process-resources phase.

CloudFront Delay

If, in addition to the S3 bucket, you are using a CloudFront distribution on the Amazon CDN (Content Delivery Network), there will be a delay between synchronizing your files and when they become available in the actual CloudFront distribution. This is worth noting in case you spend a lot of time debugging issues with things not being synchronized properly. To get around this in a DEV or QA environment, it should be sufficient to use the S3 origin bucket domain name rather than the CloudFront domain name.
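
One way to switch between the two domain names per environment is a pair of Maven profiles that set a single property, which the application can then pick up through resource filtering. This is only a sketch: the property name static.content.host, the profile ids and the CloudFront domain are hypothetical placeholders, and the bucket domain assumes the usual bucket-name.s3.amazonaws.com form.

```xml
<profiles>
  <!-- DEV/QA: serve static files straight from the S3 origin bucket -->
  <profile>
    <id>dev</id>
    <properties>
      <!-- static.content.host is a hypothetical property name -->
      <static.content.host>${aws.s3.bucket.name}.s3.amazonaws.com</static.content.host>
    </properties>
  </profile>
  <!-- production: serve through the CloudFront distribution -->
  <profile>
    <id>prod</id>
    <properties>
      <!-- placeholder; use the domain name of your own distribution -->
      <static.content.host>your-distribution.cloudfront.net</static.content.host>
    </properties>
  </profile>
</profiles>
```

A build would then select the environment with, for example, “mvn -Pdev package”.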


3 Responses


  1. Joni Niemi said, on March 1, 2011 at 8:07 am

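    That didn’t work for us, working inside a profile. Inlining the configuration into the <execution> did the job. On the plus side, there’s no need for project-level dependencies. (Sidenote: as a slight functional difference, the snippet below does the uploading task during the install phase, not deploy.)

    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <executions>
        <execution>
          <id>install2s3</id>
          <phase>install</phase>
          <goals>
            <goal>java</goal>
          </goals>
          <configuration>
            <mainClass>org.jets3t.apps.synchronize.Synchronize</mainClass>
            <includeProjectDependencies>false</includeProjectDependencies>
            <includePluginDependencies>true</includePluginDependencies>
            <executableDependency>
              <groupId>net.java.dev.jets3t</groupId>
              <artifactId>synchronize</artifactId>
            </executableDependency>
            <arguments>
              <argument>--properties</argument>
              <argument>${path.to.aws.properties}</argument>
              <argument>UP</argument>
              <argument>${aws.bucket.and.path}</argument>
              <argument>${whatever.to.upload}</argument>
            </arguments>
          </configuration>
        </execution>
      </executions>
      <dependencies>
        <dependency>
          <groupId>net.java.dev.jets3t</groupId>
          <artifactId>jets3t</artifactId>
          <version>0.8.0</version>
        </dependency>
        <dependency>
          <groupId>net.java.dev.jets3t</groupId>
          <artifactId>synchronize</artifactId>
          <version>0.8.0</version>
        </dependency>
      </dependencies>
    </plugin>

    This can be used from inside a profile, “as is”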

  2. Joni Niemi said, on March 1, 2011 at 8:08 am

    Darn, wordpress messed the xml pretty good. 😦

  3. conorpower said, on March 2, 2011 at 2:53 am

    thanks for the update when using profiles

