AWS Lambda - Quick Notes

AWS Lambda is a fully managed serverless compute service that lets one run code without provisioning or managing the underlying infrastructure.

The following is a summary of the various features and capabilities of Lambda:

Can be used in event-driven architectures to execute code based on triggers
Triggers come from event sources such as the CLI, API, SDK, or other AWS services
Triggers can invoke a piece of code referred to as a Lambda Function
An Event is a JSON-formatted document that contains data for a Lambda function to process
Lambda functions are for short code executions (up to 15 mins)
Supported programming languages for Lambda functions - Node.js, Python, Java (8 and above), C# (.NET Core), PowerShell, Golang, Ruby
Once a Lambda function is created using a programming language, one can configure it to respond to events from any event source
Application code must be organized into Lambda functions; the Lambda service runs them on demand, only when needed, and scales automatically
Lambda functions execute concurrently (in parallel) based on the incoming events
If one reaches the concurrency limit (which applies per Region), Lambda throttles further invocations with a TooManyRequestsException error
One must ensure the Lambda function has the appropriate IAM permissions to perform its task
The Lambda function invocation can be in one of the following modes:

Synchronous
- Can be invoked from CLI, SDK, or API
- The caller waits for the function to process the event and return a response
- Errors must be handled on the client side (for example, via retries)
Asynchronous
- Can be invoked through events from S3, SNS, CloudWatch Events (EventBridge), etc
- Each event is queued for processing and a response is returned immediately, without waiting for the function to finish
- Lambda will retry up to 2 more times on errors (3 attempts in total)
- Processing MUST be idempotent
Event Source Mapping
- Can be invoked through events from SQS, DynamoDB Streams, Kinesis Data Streams, etc
- Lambda POLLS for events from the event sources
- Events get processed in order (except for SQS Standard queues)
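Because failed asynchronous invocations are retried, the same event can reach the function more than once. A minimal sketch of an idempotent handler follows, assuming each event carries a unique `id` field (a hypothetical field name) and using an in-memory set as a stand-in for a durable store such as a DynamoDB table. The `process` helper is also hypothetical.

```python
# Stand-in for a durable store (for example a DynamoDB table). An in-memory
# set only lives as long as one execution environment, so it is for
# illustration only.
_processed_ids = set()


def process(payload):
    """Hypothetical business logic; replace with real work."""
    return {"echo": payload}


def lambda_handler(event, context):
    """Process an event once, even if Lambda delivers it again on retry."""
    event_id = event["id"]  # assumed to be unique per logical event
    if event_id in _processed_ids:
        # Seen before: skip the work so a retry has no extra side effects.
        return {"status": "duplicate", "id": event_id}
    result = process(event.get("payload"))
    _processed_ids.add(event_id)  # record only after the work succeeded
    return {"status": "processed", "id": event_id, "result": result}
```

Calling the handler twice with the same `id` performs the work once and reports the second call as a duplicate.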
One pays ONLY for the compute usage (in millisecs) and there is NO charge when the code is NOT running
Pay per request has the following features:
- First 1M requests per month are FREE
- $0.20 per 1M requests thereafter
Pay per duration of compute (in increments of 1 ms) has the following features:
- First 400K GB-seconds of compute time per month FREE (that is, 400K seconds with 1 GB RAM)
- Pay about $1.00 per 60K GB-seconds thereafter ($0.0000166667 per GB-second for x86 in us-east-1; rates vary by Region and architecture)
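The two pricing dimensions can be combined into a rough monthly estimate. The sketch below is only an illustration: the per-GB-second rate is an assumption (the published x86 rate in us-east-1 at the time of writing), and the function and constant names are made up for this example. Substitute the current rate for your Region.

```python
FREE_REQUESTS = 1_000_000            # free requests per month
FREE_GB_SECONDS = 400_000            # free GB-seconds per month
PRICE_PER_MILLION_REQUESTS = 0.20    # USD
PRICE_PER_GB_SECOND = 0.0000166667   # USD; assumed rate, varies by Region


def monthly_cost(requests, avg_duration_ms, memory_mb):
    """Estimate the monthly Lambda bill in USD for one workload."""
    # GB-seconds = invocations x seconds per invocation x memory in GB.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + compute_cost, 2)
```

For example, 3M requests per month at 200 ms and 128 MB use 75K GB-seconds, which stays inside the free compute tier, so only the 2M requests beyond the free tier are billed.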
One can provision more memory (RAM) to a Lambda function (up to 10 GB)
Increasing RAM implicitly increases the CPU and Network performance and costs more
One can deploy the code and its dependencies for the Lambda function using a deployment package, which can be a zip file archive or a container image
The execution environment provides a secure and isolated runtime environment for the Lambda function
The execution environment manages the processes and resources that are required to run the Lambda function
AWS Lambda integrates with a whole range of AWS services
Monitoring is through CloudWatch Logs
Use-cases: data processing, real-time file processing, real-time stream processing, serverless backends for mobile and web applications
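The synchronous and asynchronous modes described above correspond to the `InvocationType` parameter of the Lambda Invoke API. The helper below is a sketch that takes the client as an argument; it assumes `client` behaves like `boto3.client("lambda")`, and the helper name `invoke` is an illustrative choice, not part of any SDK.

```python
import json


def invoke(client, function_name, payload, asynchronous=False):
    """Invoke a Lambda function synchronously or asynchronously.

    `client` is expected to behave like boto3.client("lambda").
    """
    response = client.invoke(
        FunctionName=function_name,
        # "RequestResponse" waits for the result; "Event" queues the event.
        InvocationType="Event" if asynchronous else "RequestResponse",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    if asynchronous:
        # Lambda only acknowledges the event (HTTP 202); there is no result.
        return response["StatusCode"]
    # For synchronous calls the function result is a readable stream.
    return json.loads(response["Payload"].read())
```

With a real client, `invoke(boto3.client("lambda"), "my-function", {"key": "value"})` would block until the function returns, where `my-function` is a placeholder name.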

The limits are per Region and as follows:

Execution Limits
- Memory allocation - 128 MB to 10 GB (in 1 MB increments)
- Maximum execution time - 900 seconds (15 mins)
- Size of Environment variables - up to 4 KB
- Disk capacity in the Lambda function container (accessed at /tmp) - 512 MB to 10 GB
- Concurrent executions - up to 1000 (can be increased)
- Default timeout - 3 secs
Deployment Limits
- Maximum size of the zip file archive - 50 MB compressed (250 MB uncompressed)
- Use the /tmp directory to load other files at startup
- Size of Environment variables - up to 4 KB
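As an illustration of how these limits might be checked before deployment, the sketch below validates a proposed configuration against the memory, timeout, and environment variable limits listed above. The function name and the way the environment size is approximated (UTF-8 bytes of all keys and values) are assumptions made for this example.

```python
MIN_MEMORY_MB, MAX_MEMORY_MB = 128, 10_240   # 128 MB to 10 GB
MAX_TIMEOUT_SECONDS = 900                    # 15 mins
MAX_ENV_BYTES = 4 * 1024                     # 4 KB


def check_config(memory_mb, timeout_seconds, env_vars):
    """Return a list of limit violations for a proposed function config."""
    problems = []
    if not MIN_MEMORY_MB <= memory_mb <= MAX_MEMORY_MB:
        problems.append("memory must be between 128 MB and 10240 MB")
    if not 1 <= timeout_seconds <= MAX_TIMEOUT_SECONDS:
        problems.append("timeout must be between 1 and 900 seconds")
    # Approximate the environment size as the UTF-8 bytes of keys plus values.
    env_bytes = sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in env_vars.items()
    )
    if env_bytes > MAX_ENV_BYTES:
        problems.append("environment variables exceed 4 KB")
    return problems
```

An empty list means the configuration fits within the limits checked here; it does not cover every Lambda quota.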

The following are some features of Lambda SnapStart:

Improves Lambda function startup performance by up to 10x (at no extra cost) for Java 11 and above
Without SnapStart, a Java Lambda function goes through the following invocation lifecycle phases: Init -> Invoke -> Shutdown
With SnapStart enabled, a Java Lambda function is invoked from a pre-initialized state (no Init from scratch): Invoke -> Shutdown
When a new function version is published, Lambda takes a snapshot of the memory and disk state of the initialized function and caches the snapshot for low-latency access

Lambda@Edge and CloudFront Functions

Sometimes applications deployed at the edge need to execute some logic before a request reaches the application. This logic is customer-written code that can be attached to CloudFront and run close to the users to minimize latency.

The following are the two options:

CloudFront Functions
- Lightweight functions written in JavaScript for modifying the viewer request to CloudFront and the viewer response from CloudFront
- For high-scale, latency-sensitive CDN customizations
- Sub-millisecond startup time; can handle millions of requests per sec
- Native feature of CloudFront and managed within CloudFront
- Maximum execution time is less than 1 ms
- Maximum memory allocated is 2 MB
- Useful for cache key transformation (headers, cookies, query strings), header manipulation, URL rewrites or redirects, and JWT validation
Lambda@Edge
- Functions written in Node.js or Python that can do much more than just manipulate the viewer request and viewer response
- Scales to 1000s of requests per sec
- Can manipulate viewer request, origin request, origin response, and viewer response
- Deploy functions in one Region (us-east-1) and CloudFront replicates it to all the other locations
- Maximum execution time is 5 secs for viewer triggers and 30 secs for origin triggers
- Maximum memory allocated ranges from 128 MB (viewer triggers) up to 10 GB (origin triggers)
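To show what request manipulation at the edge can look like, here is a minimal sketch of a Lambda@Edge handler in Python, assumed to be attached to the origin request trigger. It rewrites directory-style URLs to `index.html` and adds a custom header; the header name `X-Rewritten-By` is a hypothetical choice for this example.

```python
def lambda_handler(event, context):
    """Origin-request handler: serve index.html for directory-style URLs."""
    # CloudFront wraps the request in Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    if request["uri"].endswith("/"):
        request["uri"] += "index.html"
    # Headers are keyed by lowercase name; each value is a list of dicts.
    request["headers"]["x-rewritten-by"] = [
        {"key": "X-Rewritten-By", "value": "lambda-edge"}  # hypothetical header
    ]
    # Returning the request lets CloudFront continue on to the origin.
    return request
```

Returning the (possibly modified) request object tells CloudFront to carry on with it; returning a response object instead would answer the viewer directly, for example with a redirect.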

The following are some features of Lambda in VPC:

By default, a Lambda function is launched outside of any customer-created VPC (in an AWS-owned VPC) and therefore cannot access resources in a custom VPC
One can configure a Lambda function to run in a custom VPC by providing the VPC ID, the subnets to use, and the Security Groups to attach
If configured to run in a custom VPC, Lambda creates an Elastic Network Interface (ENI) in the custom VPC subnets, which allows access to other resources in the VPC
One use-case for Lambda in a VPC is RDS Proxy, which pools connections so that many concurrent Lambda functions do not open too many connections to the RDS database. Note that RDS Proxy is never publicly accessible and hence the Lambda functions need to be in the VPC
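The VPC settings can also be applied programmatically. The sketch below assumes `client` behaves like `boto3.client("lambda")`; the helper name `attach_to_vpc` is made up for this example. Note that the API call takes only subnet and security group IDs, since the VPC is implied by the subnets.

```python
def attach_to_vpc(client, function_name, subnet_ids, security_group_ids):
    """Configure a function to run inside a custom VPC.

    `client` is expected to behave like boto3.client("lambda"). The VPC
    itself is implied by the subnets, so no VPC ID is passed here.
    """
    return client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )
```

Passing subnets from more than one Availability Zone is a common choice, so that the function keeps running if one zone becomes unavailable.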

The following are some features of invoking Lambda from RDS:

Can invoke a Lambda function from within an RDS database instance
Allows one to process data events from within a database
Supported on RDS for PostgreSQL and Aurora MySQL
Must be configured in the database
Must allow outbound traffic to the Lambda function from within the database (using NAT GW, VPC Endpoint)
The database must have the required permissions to invoke the Lambda function (via IAM Policy or Lambda Resource-based Policy)