AWS Lambda & Serverless

Chaining Together Lambdas: Exploring All the Different Ways to Link Serverless Functions Together

One fairly common thing people want to do with Lambdas is chain them together to build microservices and workflows. This sounds easy enough in theory, but in practice tends to be much more complex (as is the case with most things in AWS). This post will walk through a few different methods to chain Lambdas together. We'll cover how you can chain together Lambdas using only vanilla Lambda functions, using AWS Step Functions, and using our platform, Refinery.

Chaining Together Vanilla AWS Lambdas

Let’s say you just want to call one Lambda from another Lambda without using any other AWS services.

This can be accomplished by calling the invoke() function of the AWS Lambda API (note that your Lambda’s execution role must allow invoking the target Lambda). The following is an example in Python:

Code example: https://gist.github.com/mandatoryprogrammer/8b25e63d70a394f7943e79ec9dd7b7a6.js

However, there’s a big gotcha here: the InvocationType you choose. You can pick either Event or RequestResponse, and each comes with its own limitations and problems.

What in the world are Event and RequestResponse?!

All Lambdas in AWS can be invoked in one of these two ways. If you’ve used AWS Lambda with another AWS service like API Gateway or SNS, you might have used these invocation types without even realizing it.

The Event (Asynchronous) Invocation Type

The Event invocation type is also known as the asynchronous invocation type. It’s for invocations where you don’t need the return value of the invoked Lambda. Once you call invoke() with this type, the event is queued for execution (often executing immediately) and the call returns right away. Since this doesn’t give you the result of the invoked Lambda, if you need its return value you should either use the RequestResponse type or have the invoked Lambda pass its results along to the next step in the chain.
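As a rough sketch of what this looks like with boto3 (the function name is a placeholder, and your execution role must allow lambda:InvokeFunction on the target):

```python
import json


def invoke_async(function_name, payload):
    """Invoke another Lambda with the Event (asynchronous) type.

    The call queues the event and returns immediately; the callee's
    return value is discarded.
    """
    import boto3  # AWS SDK for Python, available by default inside Lambda

    lambda_client = boto3.client("lambda")
    response = lambda_client.invoke(
        FunctionName=function_name,   # name or full ARN of the target Lambda
        InvocationType="Event",       # asynchronous, fire-and-forget
        Payload=json.dumps(payload),  # JSON-serializable, up to 256KB
    )
    # For Event invocations, a 202 status code means "accepted for execution".
    return response["StatusCode"]


# Usage from inside a handler:
#   invoke_async("target-function", {"message": "hello from the caller"})
```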

There are some important gotchas with this invocation type that are not immediately obvious:

  • You can pass a maximum payload (input) of 256KB. If you have input that is larger, you’ll need to write it somewhere like S3 and read it back out in the code of the next Lambda.
  • If the invoked Lambda results in an uncaught exception, error, or timeout your Lambda will automatically be re-executed up to two more times for a total of three invocations. This can be extremely confusing to experience if you’re not familiar with this behavior, especially if your code mutates some external state. When I was first using AWS Lambda this drove me pretty crazy.

Some AWS services that use this invocation type under the hood include S3 (event notifications), SNS, and CloudWatch Events/EventBridge.

The RequestResponse (Synchronous) Invocation Type

The RequestResponse invocation is also known as the synchronous invocation type. This invocation type is useful for situations where you need the result/return value of the Lambda. For example, if you have Lambda A which needs to call Lambda B and get its return value before Lambda A completes its execution.
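A minimal sketch of a synchronous invocation with boto3, again with a placeholder function name:

```python
import json


def invoke_sync(function_name, payload):
    """Invoke another Lambda with the RequestResponse (synchronous) type.

    Blocks until the callee finishes, then deserializes its return value.
    """
    import boto3

    response = boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the callee to return
        Payload=json.dumps(payload),
    )
    result = json.loads(response["Payload"].read())
    if response.get("FunctionError"):
        # The callee raised an uncaught exception; result holds the details.
        raise RuntimeError(f"target Lambda failed: {result}")
    return result
```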

Some specific gotchas of this invocation type are:

  • The Lambda invoking another Lambda via this invocation type has to wait for the invoked Lambda to finish before it can finish its own execution. This can lead to some really sticky situations: for example, if Lambda A invokes Lambda B and Lambda B runs longer than Lambda A’s maximum execution time, Lambda A will time out and Lambda B will also encounter an error because its invoker died in the middle of the invocation. If you chain more Lambdas together this way you get a “waterfall” effect, where the first Lambda has to keep running until all of the invoked child Lambdas have finished.
  • The maximum input/payload size for this invocation type is 6MB. If your input is larger, you’ll need to write it somewhere like S3 and read it back out in the code of the next Lambda.

Generally speaking, for chaining Lambdas the RequestResponse type can get very hairy very quickly due to having to wait for the child Lambdas to finish executing. You’re likely better off using the Event type if you don’t need the result of the Lambda.

Some AWS services that use the RequestResponse type under the hood include API Gateway, Application Load Balancers, and Cognito.

Having to deal with these subtleties was one of the major reasons we fixed this in our own serverless platform Refinery with transitions. In Refinery, you can just add a transition from one Lambda (Code Block) to another without having to worry about max input size or re-executions. Our abstraction automatically handles it for you; you can read more about how easy it is to do this on our platform below.

Using AWS Step Functions to Chain Lambdas

AWS recognizes that many of its customers want to chain Lambdas together. To fill this need, they created AWS Step Functions.

Step Functions allow you to create workflows of chained-together AWS Lambdas by writing AWS’s proprietary states language (Amazon States Language). The following is an example of a Step Function definition:

Code example: https://gist.github.com/mandatoryprogrammer/bb2df69326459d4830515eb0d8b156c2.js
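For illustration, a minimal two-state chain in this format might look like the following (the Lambda ARNs and account ID are placeholders):

```json
{
  "Comment": "Minimal two-step Lambda chain",
  "StartAt": "FirstStep",
  "States": {
    "FirstStep": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:first-step",
      "Next": "SecondStep"
    },
    "SecondStep": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:second-step",
      "End": true
    }
  }
}
```

Each Task state invokes its Lambda and passes the state’s output as input to the next state in the chain.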

The AWS console provides a visualizer and editor for this JSON, which you can use to check that your definition is correct:



This is only the first step. After creating your Step Function in the JSON editor, you have to define an IAM role that Step Functions can assume and that is allowed to invoke the Lambda functions. You can make this easy by just letting the console do it for you by selecting “Create new role” on the “Specify details” console page:



You can also click “Review permissions” to get more information about what privileges will be added to the newly-created role.

Once you’ve created your Step Function, you can execute it by going to the State machines page in the AWS console, selecting your newly-created Step Function, and clicking the “Start Execution” button.
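You can also start an execution programmatically with boto3 rather than through the console (the state machine ARN is a placeholder):

```python
import json


def start_step_function(state_machine_arn, payload):
    """Start a Step Function execution; returns the execution ARN."""
    import boto3

    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),  # becomes the input to the StartAt state
    )
    return response["executionArn"]


# Usage:
#   start_step_function(
#       "arn:aws:states:us-east-1:123456789012:stateMachine:my-workflow",
#       {"message": "hello"},
#   )
```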

After your execution has completed, you can see a visual representation of your Step Function’s execution steps; this helps you understand how the execution went and whether any problems occurred:


There are some big gotchas you should be aware of when working with Step Functions:

  • The maximum input or return size for a Lambda in the workflow is 32,768 characters. This is less than the vanilla Lambda case for both the Event and RequestResponse invocation types. To pass data larger than that, it’s generally recommended to write the data to S3 temporarily and have the next Lambda read it out of S3. However, by doing this you lose the ability to do conditional transitions (e.g. a “choice state”) that examine the return data to decide what to do next.
  • If you later edit your Step Function in the console to include new Lambdas or resources, the execution IAM role for the Step Function will not be updated. This means the workflow is likely to fail unless you update the permissions yourself. This leaves you either manually updating the IAM role’s policy permissions every time, or making the execution role overly broad.
  • Building Step Function JSON is extremely complex and not really meant for humans to write. This becomes especially true when you’re writing very complex workflows.
  • You can only perform a maximum of 25,000 state transitions in a Step Function. AWS’s recommendation for getting around this is to trigger another Step Function once you’ve hit the limit - which isn’t very helpful.
  • Step Functions can get expensive at scale. After exceeding the 4,000 free state transitions per month, you are charged $0.025 per 1,000 state transitions.

Having to deal with these subtleties was one of the major reasons we fixed this in our own serverless platform Refinery with transitions. In Refinery, you can just add a transition from one Lambda (Code Block) to another without having to worry about max input size, max state transitions, IAM permissions, or writing any complicated JSON. Our abstraction automatically handles it for you; you can read more about how easy it is to do this on our platform below.

Chaining Together Serverless Functions in Refinery

On our platform Refinery, linking together Lambdas (a.k.a. Code Blocks) is a first-class feature. This is implemented as transitions, which allow chaining Code Blocks together without having to think about maximum input sizes, IAM permissions, max state transitions, or re-executions. You can even use transitions to link to other resource types like a queue (SQS), topic (SNS), or API Endpoint (API Gateway):

Example of different blocks connected together via transitions


Our platform automatically handles all of the underlying data transfer between different Code Blocks and allows you to link together complicated serverless workflows without having to write any additional code. You can even link together blocks in entirely different languages and use Code Blocks published by other Refinery users. The only restriction is that the data being returned should be JSON-serializable.

In addition to the most basic chaining (linking one block to another via a then transition), you can also link blocks together using more complex transitions like the following:

  • If transition: Execute the linked Code Block only if the conditional statement evaluates to true (conditionals are expressed in Python).
  • Else transition: Used with the if transition; only executes the linked Code Block if the if transition’s conditional fails to evaluate to true.
  • Exception transition: Execute the linked Code Block if the preceding Code Block encounters an uncaught exception. Useful for failure cases in serverless workflows.
  • Fan-Out transition: Executes the linked Code Block once for every item in a returned array of items. For example, if Code Block A returns an array of 100 items, then the Code Block linked to it via a fan-out transition will be invoked 100 times with each item in the array as input to each block.
  • Fan-In transition: Used with the previously-mentioned fan-out transition. This will take the results of all of the Code Blocks invoked by a fan-out and will collect the return values as an array which is passed as input to the next Code Block.
  • Merge transition: The merge transition will merge together the execution results of multiple blocks and pass them as input to the next block once they all finish executing. The image above demonstrates this with two blocks merging together after they have finished executing.

All of this allows you to build complex serverless workflows without having to write any additional code or configuration files. It aims to be a much faster and easier way of creating serverless applications.

For some example Refinery projects you can check out without having an account, see our Discover page.

Conclusion

We’ve discussed pretty thoroughly how to chain AWS Lambdas together into serverless workflows. If you have any questions or spot any issues with the article, feel free to reach out to us at support(at)refinery.io!

-Matthew Bryant


Interested in using Refinery? Sign up now and get a $5 credit for your first month of usage!
