OpenRoss - fast, scalable, on-demand image resizer

by Carl Ellis on Monday, 23 Jun 2014

The OpenRoss image proxy service provides a way of serving dynamically resized images from Amazon S3 in a way that is fast, efficient, and auto-scales with traffic.

We have hosted the source for this service on GitHub.

Motivation

At Lyst we scrape, and have scraped, millions of products that all have at least one image. In our infancy, we saved every product image in 10 preset sizes, and then rendered whichever one was nearest in size to what we required. As we grew, this solution became unwieldy for the levels of traffic we were experiencing, and it was not appropriate for our mobile app either.

To address this, we created an imaging service that generates a new resized image on the fly when we need it, and called it BobRoss after the visionary painter. Images are then cached in CloudFront, which effectively means that BobRoss only paints his subjects once, with new images added for different dimensions and, in the future, effects.

Since we rolled BobRoss out into production on an auto-scaling Amazon cluster, we have decreased page load times and lowered our bandwidth usage by ensuring we only serve images of the exact size required.

As this was so useful for us, we have decided to open source our solution as OpenRoss.

Service architecture

OpenRoss is a Twisted plugin which uses a pipeline design to handle its requests. OpenRoss is combined with Amazon CloudFront, S3, nginx and local storage to provide very fast caching abilities. OpenRoss fits between CloudFront's cache and S3's content in order to provide on-demand resizing for your services. This means that you only need to store full-size images in S3, rather than every size permutation your site requires. When a new size is added to your site, it is only requested once per CloudFront region, and then stored in CloudFront.

Figure 1. OpenRoss Service Architecture

OpenRoss also uses an on-disk cache, which stores the full-size images it has just downloaded from S3 along with their resized counterparts. This means that if a new CloudFront region requests a file that has already been resized, it is served immediately, and if a new size is requested for a cached image, S3 can be avoided completely.

Pipeline design

OpenRoss follows a pipeline of operations for each request, which is described in the following figure.

Figure 2. OpenRoss Pipeline Design

The pipeline is a simple check, download, operate, and serve process. There are some fun additions though.
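
As a rough illustration of the check, download, operate, and serve flow, here is a minimal Python sketch. It is not the OpenRoss implementation: `serve_image`, the plain dict standing in for the on-disk cache, and the `download` and `operate` callables are all hypothetical names chosen for this example.

```python
def serve_image(key, mode, width, height, cache, download, operate):
    """Return resized image bytes, touching S3 only when nothing is cached.

    cache    -- dict standing in for the on-disk cache (assumption)
    download -- callable fetching the full-size image for `key` (assumption)
    operate  -- callable producing the resized image (assumption)
    """
    resized_key = (key, mode, width, height)

    # Check: has this exact size already been produced?
    if resized_key in cache:
        return cache[resized_key]

    # Check: is the full-size original cached? If not, download it.
    original = cache.get(key)
    if original is None:
        original = download(key)
        cache[key] = original

    # Operate: resize, then keep the result for later requests.
    resized = operate(original, mode, width, height)
    cache[resized_key] = resized

    # Serve.
    return resized
```

With this shape, a request for a new size of an already cached original skips the download step entirely, which matches the cache behaviour described earlier.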

Scheduled S3 retries

Downloading media from an external service never goes to plan 100% of the time, which is why for every request to S3 we have a number of prescheduled retries and a scheduled failure if the requests all take too long. We do this using task.deferLater to queue S3 GET requests at predefined intervals. By default, we make 3 attempts at getting media from S3, each started 200ms after the previous one, with a timeout of 200ms for the last request. This means that the first request has up to 600ms to finish and return data, the second 400ms, and the third 200ms. As soon as any request returns data, the rest of the tasks are cancelled and the returned data is pushed through the rest of the pipeline. If all attempts fail, the service returns a failure and the request can be retried from the client side.
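
To make the timing concrete, here is a small sketch of the same staggered-retry idea. It uses Python's asyncio as a stand-in for Twisted's task.deferLater so that it runs with the standard library alone, and it is not the OpenRoss code: `attempt_schedule`, `fetch_with_scheduled_retries`, `S3FetchError`, and the `fetch` coroutine are hypothetical names for this example.

```python
import asyncio


class S3FetchError(Exception):
    """Raised when no attempt returns data before the overall deadline."""


def attempt_schedule(attempts=3, timeout_ms=200):
    """Return (start offset, time budget) in milliseconds for each attempt."""
    deadline = attempts * timeout_ms
    return [(n * timeout_ms, deadline - n * timeout_ms) for n in range(attempts)]


async def fetch_with_scheduled_retries(fetch, attempts=3, timeout_ms=200):
    """Start staggered attempts and return the first successful result.

    `fetch` is an assumed coroutine function taking the attempt number.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + attempts * timeout_ms / 1000

    async def delayed(n, start_ms):
        # Wait for this attempt's scheduled start, then issue the request.
        await asyncio.sleep(start_ms / 1000)
        return await fetch(n)

    schedule = attempt_schedule(attempts, timeout_ms)
    pending = {
        asyncio.ensure_future(delayed(n, start_ms))
        for n, (start_ms, _budget) in enumerate(schedule)
    }
    try:
        while pending:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            done, pending = await asyncio.wait(
                pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
        raise S3FetchError("no attempt returned data before the deadline")
    finally:
        # Cancel whatever is still queued or in flight.
        for task in pending:
            task.cancel()
```

With the defaults, `attempt_schedule()` gives start offsets of 0, 200 and 400ms with budgets of 600, 400 and 200ms, matching the figures above.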

Image modes

OpenRoss supports different image modes for resizing and has 3 modes bundled with the package. These are: resize, resize-composite, and crop.

  • Resize just does a simple box resize, and returns images that are at most the size you requested.
  • Resize-composite does the same as resize, but composites the image onto a white background that is the size you have requested.
  • Crop resizes the image so that one dimension matches the desired size, and crops the rest of the image.
Figure 3. Different Image Modes
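
The geometry behind the three modes can be sketched with plain arithmetic. The following Python is an illustration only, not the OpenRoss code: the function names are hypothetical, and centring the image on the canvas and centring the crop window are assumptions, as is the choice not to special-case images smaller than the requested size.

```python
def fit_size(src_w, src_h, box_w, box_h):
    """Resize: largest size that fits inside the box, keeping aspect ratio."""
    scale = min(box_w / src_w, box_h / src_h)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def composite_layout(src_w, src_h, box_w, box_h):
    """Resize-composite: canvas size, image size, and paste offset.

    Centring the image on the white canvas is an assumption.
    """
    w, h = fit_size(src_w, src_h, box_w, box_h)
    offset = ((box_w - w) // 2, (box_h - h) // 2)
    return (box_w, box_h), (w, h), offset


def crop_layout(src_w, src_h, box_w, box_h):
    """Crop: scaled size covering the box, plus the crop window.

    The window is (left, top, right, bottom); centring it is an assumption.
    """
    scale = max(box_w / src_w, box_h / src_h)
    w, h = max(1, round(src_w * scale)), max(1, round(src_h * scale))
    left, top = (w - box_w) // 2, (h - box_h) // 2
    return (w, h), (left, top, left + box_w, top + box_h)
```

For a 1000x500 source and a 200x200 request, resize yields a 200x100 image, resize-composite places that image on a 200x200 canvas, and crop scales to 400x200 before cutting out a 200x200 window.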

Performance

When running on EC2 instances, we get some pretty nice performance metrics for OpenRoss. We can run 8 instances of OpenRoss on the c3.2xlarge EC2 machines that we use for image delivery. Each one of these instances can handle up to 150 requests a second (see the following graph for the values across 24 hours).

Figure 4. Requests per second

Internally, BobRoss has very consistent timings for each pipeline process, with the S3 downloader being the process that takes the longest. See the final graph for a 24-hour trace.

Figure 5. Pipeline Timings

Summary

We made OpenRoss for two reasons: first, to make our image resizing architecture scalable and more future-proof, and second, to experiment with micro-services. We've found that OpenRoss has made life easier for our designers and mobile developers, and has been kinder to our S3 bill. If you would like to contribute to the project, feel free to open issues or pull requests on the GitHub project page.

Discuss this post on Hacker News.
