Faster CI builds by caching the installed bundle to Amazon S3

If you're doing CI (Continuous Integration), you've probably noticed that a big chunk of build time goes to installing dependencies with Bundler. If not, just go check, it's horrible. I didn't really pay attention to this issue for a long time, but recently it became unbearable. After creating a new Rails app (Rails 4 + Ruby 2.0.0) I noticed that everything is faster. Except one thing. Yep, it's Bundler again. Nothing wrong with it, really, it does an awesome job, but the time it takes to download all the required gems is just insane, as is the size of the gem bundle. For a clean Rails app it comes to around 300 MB. And those 300 MB will be downloaded from the internet every time you push your code.

So how can you speed up bundle install? Easy: by caching the installed bundle to external storage, like Amazon S3. Which is exactly what I did. There are already a few solutions out there, written in Ruby.

Cool, that works. You can use them with Travis CI or whatever CI service you're running. Since I'm learning Go, I figured why not just create a utility that you can simply drop onto the server and execute. Go applications are compiled, so there is no need to install anything additional on the server. I decided to write a dirty-but-working solution (which is not that bad). You can check it out here: https://github.com/sosedoff/bundle_cache

bundle_cache

Check the source on GitHub

Here is a quick overview of what it is:

When running a CI build, you have a clean system. That means you have
to install dependencies with Bundler, which is very slow. It hurts the most
when your test suite runs for 20 seconds but bundle install runs for
more than 2 minutes. Every single time.

bundle_cache is here to help. It uploads a bundle tarball to Amazon S3, so the next
installation will be faster and consume less traffic. Double kill.

Example: 204.17 seconds bundle install reduced to 15.96 seconds

How it works:

  • Checks if Gemfile.lock exists
  • Creates a SHA1 checksum of the lock file
  • Creates a tar.gz archive of the .bundle directory
  • Uploads archive to Amazon S3 as https://s3.amazonaws.com/bucket/prefix_sha1.tar.gz
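
To make these steps a bit more concrete, here is a rough Go sketch of the local part of the upload: hash Gemfile.lock, derive the archive name, and pack .bundle into a tar.gz. This is not the actual bundle_cache source. The function names (lockChecksum, archiveName, archiveDir) and the "myapp" prefix are made up for illustration, symlinks are skipped to keep it short, and the S3 upload itself is left out because it needs credentials and a client library.

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// lockChecksum returns the hex-encoded SHA1 digest of the file at path.
func lockChecksum(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha1.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// archiveName builds the object name in the prefix_sha1.tar.gz form.
func archiveName(prefix, checksum string) string {
	return fmt.Sprintf("%s_%s.tar.gz", prefix, checksum)
}

// archiveDir writes a gzip-compressed tarball of dir to dest.
// Assumption: only directories and regular files are packed.
func archiveDir(dir, dest string) error {
	out, err := os.Create(dest)
	if err != nil {
		return err
	}
	defer out.Close()

	gz := gzip.NewWriter(out)
	tw := tar.NewWriter(gz)

	err = filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() && !info.Mode().IsRegular() {
			return nil // skip symlinks and other special files
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(path)
		if info.IsDir() {
			hdr.Name += "/"
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if info.IsDir() {
			return nil
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
	if err != nil {
		return err
	}
	if err := tw.Close(); err != nil {
		return err
	}
	return gz.Close()
}

func main() {
	sum, err := lockChecksum("Gemfile.lock")
	if err != nil {
		fmt.Println("no Gemfile.lock, nothing to cache:", err)
		return
	}
	name := archiveName("myapp", sum) // "myapp" is a hypothetical prefix
	if err := archiveDir(".bundle", name); err != nil {
		fmt.Println("archive failed:", err)
		return
	}
	fmt.Println("archive ready for upload:", name)
}
```

Because the name is derived from the lock file, any change to Gemfile.lock produces a different archive name, so a stale bundle is never picked up for a changed set of gems.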

When downloading, it performs much the same steps in the opposite direction.
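
The unpacking half of the download could look roughly like the Go sketch below. Again, this is an illustration rather than the real bundle_cache code: safeTarget, extractArchive and the bundle.tar.gz file name are hypothetical, and it assumes the archive has already been fetched from S3 to local disk. Only directories and regular files are restored.

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// safeTarget joins an archive entry name onto root and rejects entries
// that would end up outside root (for example "../evil").
func safeTarget(root, name string) (string, error) {
	target := filepath.Join(root, name)
	rel, err := filepath.Rel(root, target)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("entry %q escapes %q", name, root)
	}
	return target, nil
}

// extractArchive unpacks the tar.gz file at src into the root directory.
func extractArchive(src, root string) error {
	f, err := os.Open(src)
	if err != nil {
		return err
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		return err
	}
	defer gz.Close()

	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return nil // end of archive
		}
		if err != nil {
			return err
		}
		target, err := safeTarget(root, hdr.Name)
		if err != nil {
			return err
		}
		switch hdr.Typeflag {
		case tar.TypeDir:
			if err := os.MkdirAll(target, 0o755); err != nil {
				return err
			}
		case tar.TypeReg:
			if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
				return err
			}
			mode := os.FileMode(hdr.Mode) & 0o777
			out, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, mode)
			if err != nil {
				return err
			}
			_, copyErr := io.Copy(out, tr)
			closeErr := out.Close()
			if copyErr != nil {
				return copyErr
			}
			if closeErr != nil {
				return closeErr
			}
		}
	}
}

func main() {
	// "bundle.tar.gz" is a hypothetical local file name for the fetched archive.
	if err := extractArchive("bundle.tar.gz", "."); err != nil {
		fmt.Println("restore failed, falling back to a full bundle install:", err)
		return
	}
	fmt.Println("bundle restored")
}
```

If no archive exists for the current checksum, nothing is restored and bundle install simply does the full download, as it would without a cache.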

You can either build bundle_cache yourself or download a prebuilt binary (Linux/Mac). Check the README on GitHub for details.

Using with Travis CI

Travis CI is an awesome CI service, which I use for almost all of my open source projects. Since bundle_cache requires an S3 account, you can encrypt your S3 credentials and add them to your config.

Encrypt S3 credentials:

travis encrypt --add env.global S3_ACCESS_KEY=MYKEY
travis encrypt --add env.global S3_SECRET_KEY=MYSECRET
travis encrypt --add env.global S3_BUCKET=MYBUCKET

Then edit .travis.yml config:

before_install:
  - wget https://s3.amazonaws.com/bundle-cache-builds/bundle_cache
  - chmod +x ./bundle_cache
  - ./bundle_cache download

install:
  - bundle install --deployment --path .bundle

after_script:
  - ./bundle_cache upload

Or, if you drop it under the bin directory of your Rails app, it's going to be even shorter:

before_install:
  - ./bin/bundle_cache download

install:
  - bundle install --deployment --path .bundle

after_script:
  - ./bin/bundle_cache upload

You can see the difference (not a big one, but still):

[Image: travis build]

On bigger apps, where you have a fat Gemfile, bundle install could take a few minutes, so using bundle_cache can simply save you a lot of time (and traffic).