Slimming Down Your AWS Lambda Deployment Package

We use AWS Lambda for a variety of purposes, but one of our most important use cases is rendering customer artwork into PDFs. Around a year ago, we began hitting some limits with AWS Lambda, specifically with the increasing size of the function’s deployment package. What follows is an account of the problem we encountered and the solution we ultimately implemented.

As a quick primer, we refer to “rendering” as the stage in our print and mail workflow that normalizes customer artwork for downstream print production and creates proofs and thumbnails for customers to view. These services rely heavily on open source and custom in-house image processing libraries and executable binaries. When we deploy our rendering services, we typically include these dependencies in the deployment package and reference them at runtime. Over time, as we added more functionality to our rendering services, we needed to add more dependencies, increasing the size of the package.

As documented at the time of writing, Lambda functions have a size limit of approximately 250 MB for the uncompressed deployment package when deployed from S3. We recognized the risk that this limit posed: it could restrict us from introducing new or upgraded dependencies and could be a significant blocker to implementing time-critical features or bug fixes. We needed to come up with a solution that would give us more breathing room in the near future.

Experimentation

To start, we examined the contents of the deployment package and categorized them by size. We found that 2.86% of the package was source code, 39.47% was language-specific dependencies (node_modules), and 57.67% was binary executables and system libraries. Using this data, we brainstormed and experimented with several solutions.
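A breakdown like this can be produced with a short script. The following is a minimal sketch, not the tooling we actually used: it assumes an unpacked package whose top-level directories map cleanly onto categories, and the directory names in CATEGORY_BY_TOP_DIR are hypothetical.

```python
import os
from collections import defaultdict

# Hypothetical mapping from top-level directory to category.
# Adjust it to match the layout of your own deployment package.
CATEGORY_BY_TOP_DIR = {
    "src": "source code",
    "node_modules": "language-specific dependencies",
    "bin": "binaries and system libraries",
    "lib": "binaries and system libraries",
}


def size_breakdown(package_root):
    """Return {category: percent of total bytes} for an unpacked package."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(package_root):
        # The first path component below the root decides the category.
        rel = os.path.relpath(dirpath, package_root)
        top_dir = rel.split(os.sep)[0]
        category = CATEGORY_BY_TOP_DIR.get(top_dir, "other")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                totals[category] += os.path.getsize(path)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cat: round(100 * size / grand_total, 2) for cat, size in totals.items()}
```

Calling size_breakdown on the directory that gets zipped for deployment returns a dictionary of percentages per category, which is enough to see which category dominates.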

The first was to use an executable packer (UPX) to reduce the footprint of our binary executable dependencies, since they made up the majority of our deployment package. This approach decreased the size of that category by around 30% (16% overall), but added latency at execution time. Our tests showed that, for one process-critical executable, the execution time increased from 34 ms to 277 ms.

Alternatively, we considered compressing the deployed Node dependencies using a module bundler to minify the deployed JavaScript code. In practice, this would require significant changes to our application code and would not prevent non-Node dependencies from exceeding our limit.

Store Dependencies in S3, Download on Start-Up

The final approach that we considered, and the one that we ultimately decided to implement, was to store dependencies in S3 and download them on function start-up. The Lambda execution environment provides access to 512 MB of disk space in the /tmp directory (docs) so we have plenty of scaling room for additional dependencies.

This approach has one key drawback: it takes time (~3.5s) to download dependencies, even when they are archived and compressed. Adding that time to every Lambda execution would be unacceptable, since it would increase our billing significantly.

However, this drawback is mitigated by an interesting feature of Lambda: execution context reuse. Lambda functions essentially execute in sandboxed containers. When a function executes for the first time after being created or having its code or configuration updated, a new container (execution context) is created, and the code is loaded and run in the container. After the function is executed, Lambda will keep the execution context alive. Under certain conditions, a subsequent execution may re-use the existing execution context.

Re-using an existing context has several key advantages. First, there is overhead associated with starting a new context (language runtimes, DB connections). Depending on the application and the language used, it could take an additional 3 to 4 seconds for a “cold start”. Reusing an existing context obviates the need to set up the language runtimes and other dependencies. Another advantage, and one that we value for our problem, is that the disk storage referenced earlier is also persisted when contexts are re-used.

With this knowledge, we formed a basic solution. First, we would exclude the binary executables and system libraries from the deployment package, reducing its size by ~58%. Then, we would add a pre-deploy script that would:

  1. Generate a dependency version for the deploy (we used the current git commit) and include it in a file in the deployment package. Dependencies are versioned to prevent resource contention during deploys.
  2. Archive and compress the executables and libraries.
  3. Upload the compressed archive to S3, prefixed by the dependency version.
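The three pre-deploy steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than our actual script: the file name DEPENDENCY_VERSION, the archive name dependencies.tar.gz, the bucket name, and all function names are hypothetical, and the upload step assumes the third-party boto3 library is installed and AWS credentials are configured.

```python
import os
import subprocess
import tarfile


def current_git_commit():
    """Step 1: use the current git commit hash as the dependency version."""
    return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()


def write_version_file(version, path):
    """Step 1 (continued): record the version in a file shipped with the package."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(version + "\n")


def build_archive(deps_dir, archive_path):
    """Step 2: archive and gzip-compress the executables and libraries."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # Add each top-level entry so paths in the archive are relative to deps_dir.
        for entry in sorted(os.listdir(deps_dir)):
            tar.add(os.path.join(deps_dir, entry), arcname=entry)
    return archive_path


def s3_key_for(version, archive_name="dependencies.tar.gz"):
    """Step 3: prefix the object key with the dependency version."""
    return f"{version}/{archive_name}"


def upload(archive_path, bucket, key):
    """Step 3 (continued): upload to S3. Assumes boto3 (third-party) is available."""
    import boto3  # imported lazily so the other helpers work without it

    boto3.client("s3").upload_file(archive_path, bucket, key)


def predeploy(deps_dir, version_file, archive_path, bucket):
    """Run all three steps in order and return the version that was published."""
    version = current_git_commit()
    write_version_file(version, version_file)
    build_archive(deps_dir, archive_path)
    upload(archive_path, bucket, s3_key_for(version))
    return version
```

Because each deploy writes to its own key prefix, a deploy in progress never overwrites the archive that currently running functions may still be downloading.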

At the start of the Lambda function handler code, we:

  1. Check the dependency version from the file included in the deployment package.
  2. Check whether the dependencies directory exists in the current context’s /tmp directory.
    1. If it exists, continue with the handler’s execution.
    2. Otherwise, download and save assets in the /tmp directory from S3, using the dependency version.
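The start-up check can be sketched like this. It is a minimal illustration, not our production handler: the directory layout under /tmp, the file name DEPENDENCY_VERSION, the object key format, and the function names are all assumptions, and the default downloader assumes the third-party boto3 library is available. Keying the directory by version is a choice made for this sketch.

```python
import os
import tarfile

DEPS_ROOT = "/tmp/dependencies"      # assumed location inside the context's /tmp
VERSION_FILE = "DEPENDENCY_VERSION"  # hypothetical file shipped in the package


def read_dependency_version(version_file=VERSION_FILE):
    """Read the dependency version written by the pre-deploy script."""
    with open(version_file, encoding="utf-8") as f:
        return f.read().strip()


def download_from_s3(bucket, key, dest_path):
    """Default downloader. Assumes boto3 (third-party) and AWS credentials."""
    import boto3  # imported lazily so the rest of the module works without it

    boto3.client("s3").download_file(bucket, key, dest_path)


def ensure_dependencies(version, bucket, deps_root=DEPS_ROOT, download=download_from_s3):
    """Return the dependency directory, downloading it only when it is missing."""
    deps_dir = os.path.join(deps_root, version)
    if os.path.isdir(deps_dir):
        return deps_dir  # reused context: /tmp still holds the dependencies

    os.makedirs(deps_root, exist_ok=True)
    archive_path = os.path.join(deps_root, f"{version}.tar.gz")
    download(bucket, f"{version}/dependencies.tar.gz", archive_path)

    # Extract into a staging directory, then rename, so a failed extraction
    # is never mistaken for a complete dependency directory on the next run.
    staging_dir = deps_dir + ".partial"
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(staging_dir)  # the archive is one we built ourselves
    os.rename(staging_dir, deps_dir)
    os.remove(archive_path)  # free /tmp space once extracted
    return deps_dir
```

A handler would call ensure_dependencies(read_dependency_version(), bucket) before doing any work, then reference executables by their path under the returned directory. The download parameter exists so the logic can be exercised locally with a stand-in downloader.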

Caveats

When working with Lambda execution context reuse, one of the most important questions is “When are contexts not re-used?” Although context reuse is referenced in the documentation and there is an old blog post on the topic, we found few AWS-provided details. Fortunately, the developer community has explored this topic in depth (1) (2) (3).

To summarize, subsequent invocations of a Lambda function may re-use the existing context, if:

  • The function code and configuration have not changed.
  • Not too much time has passed since the last invocation. Community tests have measured this period at between 45 and 60 minutes, though it should be noted that this is always subject to change.
  • The execution does not exceed the concurrency of previous function invocations. For example, if one function invocation is followed by ten concurrent function invocations, one context may be reused but at least nine new contexts will be created for the functions to execute concurrently.
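To get a rough feel for how the last two conditions interact with an invocation pattern, they can be turned into a small simulation. This is a deliberately simplified model and not a description of how Lambda actually schedules contexts: it assumes a fixed idle timeout (45 minutes here, the lower end of the range above), ignores code and configuration changes, and assumes an idle context is always reused when one is available. The function name and parameters are hypothetical.

```python
def count_cold_starts(invocations, idle_timeout_s=45 * 60):
    """Estimate cold starts for a list of (start_s, duration_s) invocations.

    Simplified model: an invocation reuses a context if one is idle and has
    been idle for no longer than idle_timeout_s; otherwise it is a cold start.
    """
    free_at = []  # for each context, the time at which it last became idle
    cold_starts = 0
    for start, duration in sorted(invocations):
        reusable = [
            i for i, t in enumerate(free_at)
            if t <= start and start - t <= idle_timeout_s
        ]
        if reusable:
            # Reuse the most recently freed context.
            idx = max(reusable, key=lambda i: free_at[i])
            free_at[idx] = start + duration
        else:
            cold_starts += 1
            free_at.append(start + duration)
    return cold_starts
```

For the example in the list above, count_cold_starts([(0, 1)] + [(10, 1)] * 10) returns 10: one cold start for the first invocation, one reuse, and nine new contexts for the concurrent burst.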

The first condition is self-explanatory, but the latter two are highly dependent on the use case (i.e. invocation frequency) of the function. We decided to run some experiments to determine how often we would encounter Lambda cold starts, based on the production invocation frequency of our rendering service.

Overall, our experiment showed that for the majority of executions our rendering function would not require a cold start.

Conclusion

We considered a variety of solutions for reducing our deployment package size, but the most effective and scalable solution we found took advantage of AWS Lambda execution context reuse. This system has been running in production for over a year and is still performing as expected. We do, however, recognize that the performance of this solution is highly dependent on execution patterns and use case.

We hope that this knowledge is helpful! If you’re interested in problems like this, check out our jobs page; we’re hiring!

References to AWS documentation reflect the documentation as it stood at the time of writing.