AWS and Lambda: lessons learned from using Lambda with DynamoDB, RDS and VPC

Apr 18, 2022

Amazon Web Services holds more than a third of the cloud platform market. It is the platform of choice for major organizations such as Netflix, Atlassian, Ryanair and even NASA, and it offers dozens of services for different needs (computing, storage, databases, networking, …). AWS is very much in demand, and it is likely that you will work with it one day. I had the opportunity to use several of the tools offered by the AWS platform, Lambda functions in particular. Here is some feedback from that experience, which I hope will help you start your own AWS Lambda project with peace of mind.

THE CONTEXT

In 2018, we started an ambitious project with AWS. We put together a team of experienced developers, supported by experts and architects, with the goal of building a portal using Amazon's cloud services exclusively. So we set off on our adventure. We quickly ran into various problems, and we learned from them. Here are three tips we took away from this experience.

THE IMPORTANCE OF DESIGNING WITH LAMBDA

An AWS Lambda function is a unit of executable code that is autonomous (stateless) and atomic (as small as possible), and for which the developer does not have to worry about the execution environment (serverless). In concrete terms, it is a piece of code exposing a single handler function, which Amazon invokes when an external request comes in. A lambda can be triggered by an HTTP call (via API Gateway), for example to expose an API endpoint. It can also be invoked by other Lambda functions or by other Amazon services, for example in response to a message published to an SNS topic or an SQS queue.
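To make the "single function" idea more concrete, here is a minimal sketch of a handler sitting behind API Gateway. The `HttpEvent` and `HttpResult` types are simplified stand-ins declared locally for this example (a real project would normally take them from a typings package), and `greet` is a hypothetical piece of business logic, not something from our project.

```typescript
// Simplified stand-ins for the event API Gateway passes to the function
// and the response it expects back (assumption: proxy-style integration).
interface HttpEvent {
  httpMethod: string;
  path: string;
  queryStringParameters: Record<string, string> | null;
}

interface HttpResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical business logic, kept pure so it can be tested without AWS.
function greet(name: string | undefined): string {
  return `Hello, ${name ?? "world"}!`;
}

// Builds the HTTP response; stateless, everything it needs is in the event.
function buildResponse(event: HttpEvent): HttpResult {
  const name = event.queryStringParameters?.name;
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: greet(name) }),
  };
}

// The single entry point invoked for each request. In a real deployment
// this function would be exported so that the Lambda runtime can find it.
const handler = async (event: HttpEvent): Promise<HttpResult> =>
  buildResponse(event);
```

Keeping the logic in plain functions and the handler as a thin wrapper makes it possible to unit test the lambda locally, without deploying anything.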

Example of serverless application scaling

The idea of a function that can be instantiated on the fly, with no infrastructure to manage, is very attractive, especially in terms of cost. In return, you have to design the architecture of the application carefully and think about how your code is broken down. How will you manage security? What authentication systems will you put in place? How will your lambdas communicate with each other? How will you share code between your lambdas? And to what extent? These questions should not be taken lightly. Try to:

Define your needs;
Define why you are using Lambda (rather than a more traditional monolithic application);
Know how your application will behave if you have peaks of requests, or on the contrary if you have little traffic.

Don’t hesitate to draw a diagram of your application, including the data flows between its different building blocks.

💡 In order to give you some ideas for your modeling, I’m sharing some practices that we have implemented (after various experiments).
This is not a universal architecture that works in every case, simply the one we arrived at given our technical and business constraints. We used lambdas as REST API endpoints: each endpoint and each HTTP verb triggers a different lambda. We chose TypeScript, transpiled to JavaScript before the source code is sent to the Amazon platform. The source code of our lambdas is grouped by business domain (sharing data models and common services, for example), and each domain imports a few libraries common to the whole application. For security and authentication, we relied on the standard API Gateway rules and opted for a model with an identification lambda, which the other API lambdas of the application (themselves invoked by HTTP calls) call to find out which user is currently logged in. Based on the HTTP headers of the request, the identification lambda returns the user's information (retrieved from a database) to the calling API lambda.
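The identification pattern described above could look roughly like the sketch below. It is only an illustration under several assumptions: the `Bearer` token scheme, the `usersByToken` lookup (a stand-in for the database) and the `getProfile` endpoint are all hypothetical, and the lambda-to-lambda call is replaced by a plain function call so the example runs locally.

```typescript
// Hypothetical user record returned by the identification lambda.
interface User {
  id: string;
  name: string;
}

type HttpHeaders = Record<string, string | undefined>;

// Stand-in for the database lookup (assumption: token -> user mapping).
const usersByToken: Record<string, User | undefined> = {
  "token-alice": { id: "u1", name: "Alice" },
};

// Identification lambda: resolves the current user from the HTTP headers.
function identify(headers: HttpHeaders): User | null {
  const auth = headers["authorization"] ?? headers["Authorization"] ?? "";
  const match = /^Bearer (.+)$/.exec(auth);
  if (match === null) return null;
  return usersByToken[match[1]] ?? null;
}

// One API lambda per endpoint and HTTP verb, here a hypothetical GET /profile.
// `identifyUser` is injected so that a real version could invoke another
// lambda instead of calling a local function.
function getProfile(
  headers: HttpHeaders,
  identifyUser: (h: HttpHeaders) => User | null = identify,
): { statusCode: number; body: string } {
  const user = identifyUser(headers);
  if (user === null) {
    return { statusCode: 401, body: JSON.stringify({ error: "Unauthorized" }) };
  }
  return { statusCode: 200, body: JSON.stringify({ id: user.id, name: user.name }) };
}
```

Injecting the identification step keeps each API lambda testable in isolation, at the price of one extra invocation per request once it is deployed as a separate lambda.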

Using the New Lambda Function URLs Instead of API Gateway

Setting up Lambda Function URLs from the console, CLI, and boto3.

aws.plainenglish.io

THE CHOICE OF DATA STORAGE

The vast majority of the applications you will develop need to store and manipulate data. Depending on the type of data you want to manipulate you have several possible technical solutions. My advice on this subject is to analyse your needs, conceptualise your data model and choose your storage system accordingly.

Learning the hard way

During the first development phases of our application, we were confronted with this very problem of choosing a storage system. Following the advice and best practices concerning lambdas, we turned to DynamoDB to really measure the impact.

DynamoDB seemed to be the ideal tool: it is serverless, just like lambdas, so the infrastructure costs are very low. It provides availability guarantees, as well as response times and throughput that are very attractive. It can also scale dynamically (with a few parameters) and thus absorb traffic peaks. However, it does not suit every need!

Why Amazon DynamoDB isn’t for everyone

How to decide when it’s right for you

read.acloud.guru

DynamoDB is a NoSQL (non-relational) database engine. It makes it easy to store and manipulate data models, normalised or not, represented as JSON objects.

In our architecture, each business domain, made up of a set of lambdas serving as API endpoints, owns its data. With this in mind, we created one or more DynamoDB tables per business domain to store this data. The problem arises when you need to join these data.

Since our business data model is relational, we needed to define relationships between data from different business domains. Unfortunately, NoSQL databases are not designed for this kind of need, and we were faced with serious performance problems. When a lambda needs to aggregate data from several business domains, it has to ask other lambdas for that information. Imagine a lambda that retrieves 10 rows. For each one, it must resolve the joins on two other business models. It then has to either make 20 lambda calls to retrieve the data, or query two lambdas and retrieve batches of data that must then be regrouped and reformatted. This cannot work.
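The arithmetic behind those two options can be sketched as follows. Everything here is illustrative: `OrderRow`, the domain names and `makeCountingInvoker` are hypothetical, and the "invocation" is a local stub that only counts calls instead of reaching a real lambda.

```typescript
// Hypothetical row owned by one business domain, referencing two others.
interface OrderRow {
  id: number;
  customerId: number;
  productId: number;
}

// Stand-in for "invoke another domain's lambda": it only counts invocations.
function makeCountingInvoker() {
  let calls = 0;
  return {
    invoke(domain: string, ids: number[]): string[] {
      calls += 1;
      return ids.map((id) => `${domain}#${id}`);
    },
    callCount: (): number => calls,
  };
}

// Row-by-row join: one invocation per row and per related domain.
function countCallsRowByRow(rows: OrderRow[]): number {
  const invoker = makeCountingInvoker();
  for (const row of rows) {
    invoker.invoke("customers", [row.customerId]);
    invoker.invoke("products", [row.productId]);
  }
  return invoker.callCount();
}

// Batched join: one invocation per related domain; the caller then has to
// regroup the returned data itself.
function countCallsBatched(rows: OrderRow[]): number {
  const invoker = makeCountingInvoker();
  invoker.invoke("customers", rows.map((r) => r.customerId));
  invoker.invoke("products", rows.map((r) => r.productId));
  return invoker.callCount();
}

const tenRows: OrderRow[] = [];
for (let i = 1; i <= 10; i++) {
  tenRows.push({ id: i, customerId: 100 + i, productId: 200 + i });
}

console.log(countCallsRowByRow(tenRows)); // 20
console.log(countCallsBatched(tenRows)); // 2
```

The batched variant needs far fewer invocations, but it moves the join logic into application code, which is exactly the work a relational database would otherwise do for you.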

You probably shouldn’t use DynamoDB

Edit: This article is firmly out of date now. I have not gone back to try dynamoDB but I have heard they have changed…

syslog.ravelin.com

Call times then become long, and the limit on the amount of data returned by DynamoDB queries is quickly reached; in short, it is a real headache. For a relational data model, we therefore advise you to use a classic relational database system (PostgreSQL, Oracle, MySQL), which you can host with Amazon RDS if needed.
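On the size limit: a single DynamoDB query returns at most 1 MB of data, together with a `LastEvaluatedKey` that has to be passed back to fetch the next page. The sketch below illustrates that pagination loop with a fake, in-memory query function. It is synchronous for brevity (real SDK calls are asynchronous), and `makeFakeQuery`, `queryAll` and the numeric key are assumptions made up for this example.

```typescript
// Simplified shape of one page of results. The continuation field mirrors
// DynamoDB's `LastEvaluatedKey`; everything else is a stand-in.
interface Page<T> {
  items: T[];
  lastEvaluatedKey?: number;
}

// Hypothetical in-memory "table" returning at most `pageSize` items per
// call, the way DynamoDB caps the size of each query response.
function makeFakeQuery<T>(rows: T[], pageSize: number) {
  return (exclusiveStartKey = 0): Page<T> => {
    const items = rows.slice(exclusiveStartKey, exclusiveStartKey + pageSize);
    const next = exclusiveStartKey + items.length;
    return next < rows.length ? { items, lastEvaluatedKey: next } : { items };
  };
}

// Keeps querying until no continuation key is returned, and reports how
// many round trips were needed.
function queryAll<T>(
  queryPage: (startKey?: number) => Page<T>,
): { items: T[]; calls: number } {
  const items: T[] = [];
  let calls = 0;
  let startKey: number | undefined = undefined;
  do {
    const page: Page<T> = queryPage(startKey);
    calls += 1;
    items.push(...page.items);
    startKey = page.lastEvaluatedKey;
  } while (startKey !== undefined);
  return { items, calls };
}
```

With 25 rows and pages of 10, for instance, `queryAll` needs three round trips, and each of them adds to the call time of the lambda.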

BEWARE OF VPC

It is common to have strong security requirements when setting up your application. In particular, if you use other services and tools that need to be hosted on an EC2 instance (compute service, email sending, stateful software, …), you will need to secure access to them. Amazon offers the Amazon VPC service for this purpose. This set of network and security configuration tools allows you to create your private subnet, add your resources, and apply all the security features you want (firewall, fixed IP, secure SSH connection, …). The VPC service is very complete and very advanced.

AWS Lambda’s & VPC cold starts — The dark side 🕶 ⛈

Edit: This article may be less relevant since the time of publishing, AWS has done awesome engineering work like usual…

levelup.gitconnected.com

However, combined with lambdas, it can quickly turn into a nightmare. If you place one or more lambda functions inside a VPC, you force Amazon to open network access automatically so that your lambda can run there: at each instantiation of a lambda (during the warm-up phase, also known as a cold start), Amazon dynamically creates a network interface that connects your lambda to your VPC. This is not necessarily annoying in itself. However, creating this network interface is very time consuming: in our experience, Amazon takes about 8 seconds to do it. As a result, you end up with lambda calls that take 8 to 10 seconds on first instantiation. If your lambda itself calls another one via an HTTP call, you potentially increase the execution time by another 8 seconds.

The warm-up time for lambdas can be tolerated and accounted for in the user experience, but if your users have to wait between 15 and 20 seconds to get the answer to their query, they may well have left before getting the result they were hoping for. So think carefully about the architecture you want before choosing the Lambda model and the VPC. Ask yourself the question: “Is a loading time of several seconds (or several tens of seconds) acceptable for my use case?”
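To put numbers on that question, here is a small back-of-the-envelope sketch. The 8-second cold start is the figure we observed; the 500 ms of execution time per lambda and the 3-second budget are assumptions chosen only for illustration, and both function names are hypothetical.

```typescript
// Worst case for a synchronous chain of lambdas: every one of them is cold.
function worstCaseLatencyMs(
  chainDepth: number, // number of lambdas called one after the other
  coldStartMs: number, // time to set up one cold lambda inside the VPC
  executionMs: number, // time each lambda spends running its own code
): number {
  return chainDepth * (coldStartMs + executionMs);
}

// Compares an estimated latency with what the use case can tolerate.
function fitsBudget(latencyMs: number, budgetMs: number): boolean {
  return latencyMs <= budgetMs;
}

// An API lambda calling one other lambda, both cold.
const chainLatencyMs = worstCaseLatencyMs(2, 8000, 500);
console.log(chainLatencyMs); // 17000
console.log(fitsBudget(chainLatencyMs, 3000)); // false
```

Running this kind of estimate for your longest call chain, before writing any code, is a cheap way to check whether the Lambda plus VPC combination is compatible with your use case.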

Gotcha’s 💣

Overall, our experience with Amazon Web Services has been quite successful. Amazon offers a set of powerful and, for the most part, well thought-out tools. Nevertheless, it is important to understand the technical choices and to make sure that the chosen solutions correspond to a real need. We therefore strongly advise you to take the time to model your architecture and validate the functioning of your vision through a proof of concept (POC).