AWS Elastic Load Balancer (ELB)

The following is the summary of the various features/capabilities of ELB:

Is a managed service with AWS responsible for upgrades and maintenance
Increases the availability and fault tolerance of user applications
Exposes a single point of contact for an application
Seamlessly handles failures of the target Instances
Must select at least two Availability Zones (mandatory for the Application Load Balancer, recommended for the other types)
High Availability across Availability Zones
Enables one to separate the public traffic from the private traffic
Default Security Group allows all incoming traffic via HTTP (80) or HTTPS (443)
Provides support for SSL Termination
Enforces stickiness with cookies
Fixed DNS name - <NAME>.<REGION>.elb.amazonaws.com

ELB supports the following types of Load Balancers:

Application Load Balancer (ALB)

Operates at the Application Layer (Layer 7) of the OSI model
Supports the protocols HTTP, HTTPS, HTTP/2, gRPC, WebSocket
Support for redirecting requests from one URL to another
Load balancing across multiple Target Groups (group of EC2 Instances)
Load balancing to multiple applications in the same EC2 Instance (containers)
Support for routing based on URI path
Support for routing based on hostname in the URL
Support for routing based on HTTP headers or query string
Support for Sticky Sessions (Session Affinity)
Health checks HTTP, HTTPS, gRPC
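
The routing features above (by URI path, by hostname, and by HTTP header) can be pictured as an ordered list of rules, each pointing at a Target Group. The following is a minimal sketch of that idea in plain Python, not the actual ALB implementation. The `Rule` class, the `pick_target_group` function, the Target Group names, and the use of simple prefix and exact matching are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Rule:
    """One hypothetical listener rule; a condition left as None matches anything."""
    target_group: str
    host: Optional[str] = None          # exact hostname match
    path_prefix: Optional[str] = None   # URI path prefix match
    header: Optional[tuple] = None      # (header name, required value)


def pick_target_group(url: str, headers: dict, rules: list, default: str) -> str:
    """Return the target group of the first rule whose conditions all match."""
    parts = urlsplit(url)
    for rule in rules:  # rules are evaluated in list (priority) order
        if rule.host is not None and parts.hostname != rule.host:
            continue
        if rule.path_prefix is not None and not parts.path.startswith(rule.path_prefix):
            continue
        if rule.header is not None and headers.get(rule.header[0]) != rule.header[1]:
            continue
        return rule.target_group
    return default  # no rule matched: use the default action


# Hypothetical rules: path-based, host-based, and header-based routing
RULES = [
    Rule("tg-api", path_prefix="/api/"),
    Rule("tg-admin", host="admin.example.com"),
    Rule("tg-beta", header=("X-Beta", "1")),
]
```

For example, `pick_target_group("https://www.example.com/api/users", {}, RULES, "tg-web")` selects the path-based rule, while a request that matches no rule falls through to the default Target Group.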

Network Load Balancer (NLB)

Operates at the Transport Layer (Layer 4) of the OSI model
Supports the protocols TCP, UDP, TLS
Can handle millions of requests per second
Support for one fixed static IP address (or Elastic IP) per Availability Zone
Support for registering targets by IP address, including targets outside the VPC of the load balancer
Health checks TCP, HTTP, HTTPS

Gateway Load Balancer (GWLB)

Operates at the Network Layer (Layer 3) of the OSI model
Listens for all IP packets across all ports and routes traffic to specific targets
Used to integrate with third-party network appliances (e.g., firewall, intrusion detection, packet inspection)
Uses the GENEVE protocol on port 6081 to communicate
Health checks TCP, HTTP, HTTPS

Classic Load Balancer (CLB)

Older generation and DEPRECATED
Supports the protocols HTTP, HTTPS, TCP, SSL
Support for routing based on URL to an application running across Instances
Health checks TCP, SSL, HTTP, HTTPS

EC2 Auto Scaling helps ensure that the correct number of EC2 Instances are available to handle the needs of the application.

The following are some of the features:

Create a collection of EC2 Instances called an Auto Scaling Group
One needs to specify the minimum, desired, and maximum number of EC2 Instances
Uses a Launch Template (specifies the Instance configuration) to start new EC2 Instances
Automatically monitors the health and availability of Instances using Health Checks and replaces terminated or impaired Instances
One can specify multiple Availability Zones for high availability and resiliency
One can specify Scaling Policies to launch or terminate Instances based on demand
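
To make the interplay of minimum, desired, and maximum capacity with Health Checks concrete, here is a small sketch in plain Python. It is only an illustration of the bookkeeping, not how EC2 Auto Scaling is implemented. The function names `clamp_desired` and `reconcile` and the instance IDs are hypothetical.

```python
def clamp_desired(desired: int, minimum: int, maximum: int) -> int:
    """Keep the desired capacity inside the [minimum, maximum] bounds of the group."""
    return max(minimum, min(desired, maximum))


def reconcile(instances: dict, desired: int, minimum: int, maximum: int) -> tuple:
    """Return (impaired instance IDs, number to launch, number to terminate).

    `instances` maps a hypothetical instance ID to True (healthy) or False (impaired).
    Impaired Instances do not count toward capacity, so they end up being replaced.
    """
    target = clamp_desired(desired, minimum, maximum)
    unhealthy = sorted(i for i, healthy in instances.items() if not healthy)
    healthy_count = len(instances) - len(unhealthy)
    to_launch = max(0, target - healthy_count)
    to_terminate = max(0, healthy_count - target)
    return unhealthy, to_launch, to_terminate
```

With three Instances of which one is impaired and a desired capacity of 4, the sketch reports the impaired Instance and asks for two launches: one replacement and one to reach the desired capacity.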

Scaling is the ability to increase or decrease the compute capacity of the application. Scaling starts with a scaling trigger, which instructs an Auto Scaling Group to either launch or terminate EC2 Instances.

The following are the four types of Scaling Policies:

Manual

One can scale an existing Auto Scaling Group by attaching new Instances
For simple workloads

Scheduled

Set up a scaling schedule according to predictable usage patterns
A scaling action sets the desired, minimum, and maximum capacity at a specified date and time
By default, schedules are in Coordinated Universal Time (UTC), which can be changed to a specific time zone
Schedules are per Auto Scaling Group
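
A scheduled action can be thought of as a timestamped capacity setting. The sketch below, in plain Python, works out which setting is in effect at a given moment. The function `capacity_at`, the `ACTIONS` list, and the dates and capacities in it are made up for illustration. All times are kept in UTC, matching the default mentioned above.

```python
from datetime import datetime, timezone


def capacity_at(now: datetime, actions: list, initial: tuple) -> tuple:
    """Return the (minimum, desired, maximum) in effect at `now`.

    `actions` is a list of (start time in UTC, (minimum, desired, maximum)) pairs;
    the most recent action whose start time has passed wins.
    """
    current = initial
    for start, capacity in sorted(actions, key=lambda a: a[0]):
        if start <= now:
            current = capacity
    return current


# Hypothetical schedule for one Auto Scaling Group
ACTIONS = [
    (datetime(2024, 1, 8, 8, 0, tzinfo=timezone.utc), (2, 6, 10)),   # morning scale out
    (datetime(2024, 1, 8, 18, 0, tzinfo=timezone.utc), (1, 2, 10)),  # evening scale in
]
```

Before the first action the group keeps its initial settings; at noon UTC the morning settings apply, and after 18:00 UTC the evening settings take over.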

Dynamic

Scales the capacity of the Auto Scaling Group based on changes to the application traffic.

The following are the three types of Dynamic Scaling Policies:

Target Tracking Scaling - Increase or decrease the current capacity of the Auto Scaling Group based on a monitoring metric and a target value
Step Scaling - Increase or decrease the current capacity of the Auto Scaling Group based on an Alarm (e.g., if CPU > 70%, increase by 2 Instances, and if CPU < 30%, decrease by 2 Instances)
Simple Scaling - Similar to Step Scaling, but with a Cool Down period (the default is 5 minutes, and no scaling happens during this period even if there is an Alarm)
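
The three Dynamic Scaling Policies can be compared side by side with a short sketch in plain Python. The thresholds and adjustments come from the example above. The proportional formula in `target_tracking` is an assumption chosen to illustrate the idea of steering a metric toward a target value; it is not claimed to be the exact calculation AWS performs. All function names are hypothetical.

```python
import math


def target_tracking(current: int, metric: float, target: float) -> int:
    """Proportional estimate of the capacity that brings the metric back to the target."""
    return max(1, math.ceil(current * metric / target))


def step_scaling(current: int, cpu: float) -> int:
    """Step adjustments from the text: +2 above 70% CPU, -2 below 30% CPU."""
    if cpu > 70:
        return current + 2
    if cpu < 30:
        return max(0, current - 2)
    return current


def simple_scaling(current: int, cpu: float, now: float, last_scaled: float,
                   cooldown: float = 300.0) -> int:
    """Same adjustment as step_scaling, but ignored while the Cool Down is running.

    `now` and `last_scaled` are timestamps in seconds; 300 seconds is the 5 minute default.
    """
    if now - last_scaled < cooldown:
        return current
    return step_scaling(current, cpu)
```

For instance, with 4 Instances at 80% CPU and a 50% target, the proportional estimate asks for 7 Instances, whereas the step policy adds exactly 2. The simple policy does nothing if the previous scaling activity happened less than 300 seconds ago.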

Predictive

For situations that have a cyclical pattern, such as high usage during regular business hours and low usage during evenings and weekends
Scales capacity in advance based on the forecasted daily and weekly patterns
Uses Machine Learning to predict capacity requirements based on historical usage data

A Sticky Session (also known as Session Affinity) implies that traffic from the same client should be consistently routed to the same target EC2 Instance. This option can be enabled either for the Application Load Balancer or the Classic Load Balancer. A Sticky Session can be achieved using an HTTP Cookie, which can be generated either by the Load Balancer or the application.

The following are the two cases:

Load Balancer generated Cookie

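Cookie name used is AWSALB (AWSELB for the Classic Load Balancer)
For the Application Load Balancer, the stickiness duration is configurable between 1 second and 7 days (the default is 1 day)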

Application generated Cookie

Flexibility to set a custom cookie for client-target stickiness
Cookie name must be registered in the Load Balancer and cannot be the names AWSALB, AWSALBAPP, AWSALBTG
Cookie expiry can be between 1 second and 7 days
The custom cookie can include any of the cookie attributes required by the application
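
The constraints on an application generated Cookie, and the way a cookie pins a client to a target, can be sketched in plain Python as follows. This is a simplified illustration: a real Load Balancer cookie is opaque to the client, while this sketch stores the target ID in the cookie value as plain text. The function names, the cookie name `MYAPP`, and the round robin fallback are assumptions.

```python
RESERVED = {"AWSALB", "AWSALBAPP", "AWSALBTG"}
MAX_DURATION = 7 * 24 * 3600  # 7 days, in seconds


def validate_app_cookie(name: str, duration_seconds: int) -> None:
    """Reject cookie settings that break the constraints listed above."""
    if name in RESERVED:
        raise ValueError(f"{name} is reserved by the Load Balancer")
    if not 1 <= duration_seconds <= MAX_DURATION:
        raise ValueError("duration must be between 1 second and 7 days")


def route(cookies: dict, cookie_name: str, targets: list, counter: int) -> tuple:
    """Return (target, cookies to set). A valid sticky cookie pins the client to a target."""
    pinned = cookies.get(cookie_name)
    if pinned in targets:
        return pinned, {}                      # still registered: keep the affinity
    target = targets[counter % len(targets)]   # no valid cookie: fall back to round robin
    return target, {cookie_name: target}
```

A first request without a cookie is assigned a target and receives the cookie; later requests carrying that cookie return to the same target for as long as it stays registered.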

By default, each Load Balancer node distributes traffic across the registered targets in its own Availability Zone only. If the Cross-Zone Load Balancing option is enabled, then each Load Balancer node evenly distributes the traffic across the registered targets in all enabled Availability Zones.

The following is the summary of the features:

For Application Load Balancer, this option is always enabled at the load balancer level (it can be turned off only for individual Target Groups) and no charges will be incurred for the inter Availability Zone data transfers
For Network Load Balancer, this option is disabled by default. If enabled, WILL incur charges for the inter Availability Zone data transfers
For Classic Load Balancer, this option is disabled by default. If enabled, NO charges will be incurred for the inter Availability Zone data transfers
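
The effect of Cross-Zone Load Balancing on the traffic seen by each target is easy to work out by hand. The sketch below does the arithmetic in plain Python, assuming clients spread their requests evenly over one Load Balancer node per Availability Zone. The function name and the zone labels are hypothetical.

```python
def per_target_share(targets_per_zone: dict, cross_zone: bool) -> dict:
    """Return the percentage of total traffic each target in a zone receives.

    Assumes clients spread requests evenly over one load balancer node per zone.
    """
    zones = len(targets_per_zone)
    total_targets = sum(targets_per_zone.values())
    shares = {}
    for zone, count in targets_per_zone.items():
        if cross_zone:
            # every node spreads its traffic over all targets in all zones
            shares[zone] = 100.0 / total_targets
        else:
            # each node keeps its share of the traffic inside its own zone
            shares[zone] = (100.0 / zones) / count
    return shares
```

With 2 targets in one zone and 8 in another, disabling the option gives each target in the small zone 25% of the traffic and each target in the large zone 6.25%. Enabling it evens this out to 10% per target.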