PolarSPARC
AWS Application Integration - Quick Notes
Bhaskar S | 01/05/2024
Amazon MQ
The following is the summary of the various features/capabilities of Amazon MQ:
Is a managed message broker service that can be used to migrate an existing message broker to the cloud
Provides compatibility with many popular message brokers such as ActiveMQ and RabbitMQ
Facilitates the communication between applications and components written in different programming languages
Allows for lift-and-shift of existing on-prem applications using message brokers without the need to manage, operate, or maintain their own messaging system
For high availability within a Region, one will have to deploy an active instance in one availability zone and a standby instance in another availability zone, backed by Elastic File System (EFS), as in the sketch below
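The following is a minimal boto3 sketch of provisioning such an active/standby broker; the broker name, engine version, instance type, and credentials are hypothetical placeholders and would need to be adjusted for a real deployment:

```python
import boto3

mq = boto3.client('mq', region_name='us-east-1')

# Create an ActiveMQ broker in the active/standby multi-AZ deployment mode
# (the standby instance in a second availability zone shares EFS-backed storage)
response = mq.create_broker(
    BrokerName='demo-activemq-broker',          # hypothetical broker name
    EngineType='ACTIVEMQ',
    EngineVersion='5.17.6',                     # assumed ActiveMQ engine version
    HostInstanceType='mq.m5.large',
    DeploymentMode='ACTIVE_STANDBY_MULTI_AZ',   # active + standby across two AZs
    PubliclyAccessible=False,
    AutoMinorVersionUpgrade=True,
    Users=[{'Username': 'mqadmin', 'Password': 'Change-Me-Str0ng-Passw0rd'}]
)
print(response['BrokerId'])
```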
Amazon Simple Queue Service (SQS)
The following is the summary of the various features/capabilities of Amazon SQS:
Offers a secure, durable, and available hosted queue that lets one integrate and decouple distributed software systems and components
A Queue holds messages (data)
Producers send messages to one or more queues
Consumers poll (pull) for messages from one or more queues
A message is meant to be processed by only ONE consumer (point-to-point messaging, unlike a pub/sub topic)
A consumer may receive up to 10 messages at a time from a poll
Messages are retained until processed and EXPLICITLY deleted by consumers
Support for unlimited number of messages in a queue
Unlimited throughput
Low latency to publish or consume (less than 10 ms)
The default retention period for messages is 4 days (to a maximum of 14 days)
Maximum message size of 256 KB
At least once delivery semantics (meaning a consumer can see duplicates)
No message ordering guarantees (best effort ordering)
One can monitor the CloudWatch metric for queue length and trigger a CloudWatch Alarm if a threshold is breached
Pay only based on the usage
For encryption in-flight, one can use the HTTPS API
For encryption at-rest, it is enabled by default and uses an AWS created/managed key
For controlling access to SQS service, one can leverage either the IAM Policies or the SQS Access Policies (managed in SQS)
SQS Access Policies allow for cross-account access
Message Visibility Timeout
When a message is pulled by a consumer, that specific message becomes invisible to other consumers
The visibility timeout is what controls the invisible period which is 30 secs by default
Before the visibility timeout elapses, the consumer must process and delete the message, else the next consumer will see the same message again, resulting in duplicate processing
If a consumer processing a message knows it needs more time, it can prevent duplicate processing by extending the visibility timeout via the ChangeMessageVisibility API (as in the sketch below)
Long Polling
When a consumer polls a queue for messages, it can optionally request to WAIT for messages to arrive if there are NONE in the queue
Reduces the number of API calls made to SQS while increasing the efficiency of the application
The poll wait timeout can be between 1 sec and 20 secs (long polling is shown in the sketch below)
The default queue is called a Standard Queue
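The following is a minimal boto3 sketch of a consumer long polling a standard queue, extending the visibility timeout, and explicitly deleting a processed message; the queue URL is a hypothetical placeholder:

```python
import boto3

sqs = boto3.client('sqs')
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/demo-queue'  # hypothetical queue

# Long polling: wait up to 20 secs for messages instead of returning immediately
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,   # up to 10 messages per poll
    WaitTimeSeconds=20        # long polling wait (1 to 20 secs)
)

for message in response.get('Messages', []):
    receipt_handle = message['ReceiptHandle']

    # If more processing time is needed, extend the visibility timeout (default 30 secs)
    sqs.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=120
    )

    print('processing:', message['Body'])   # application-specific processing goes here

    # Delete explicitly once processed, else the message reappears after the visibility timeout
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)
```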
FIFO Queue
First In First Out (FIFO) queue that guarantees the ordering of messages
Ordering by Message Group ID (all messages in a group are ordered)
Limited throughput of 300 messages per sec without batching AND 3000 messages per sec WITH batching of 10 messages
Exactly once delivery semantics (meaning a consumer will NOT see duplicates, enforced via the Message Deduplication ID), as in the sketch below
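The following is a minimal boto3 sketch of sending a message to a FIFO queue; the queue URL, message group, and deduplication ID are hypothetical placeholders:

```python
import boto3

sqs = boto3.client('sqs')
fifo_queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/demo-queue.fifo'  # hypothetical

# Messages with the same MessageGroupId are delivered in order;
# the MessageDeduplicationId lets SQS discard duplicates (exactly-once semantics)
sqs.send_message(
    QueueUrl=fifo_queue_url,
    MessageBody='{"order_id": "1001", "status": "CREATED"}',
    MessageGroupId='order-1001',
    MessageDeduplicationId='order-1001-created'
)
```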
A Dead Letter Queue is for error handling. If a consumer cannot process a message due to some error, the message is not deleted and will be received again by the consumer. After the same message has been received more than a configured number of times (the maxReceiveCount of the redrive policy), it can be moved to this queue (see the sketch below)
Useful when we need a durable and reliable event-driven solution that guarantees the processing of all the messages
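The following is a minimal boto3 sketch of attaching a dead letter queue to a source queue via a redrive policy; the queue URL and DLQ ARN are hypothetical placeholders:

```python
import json

import boto3

sqs = boto3.client('sqs')

source_queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/demo-queue'  # hypothetical
dlq_arn = 'arn:aws:sqs:us-east-1:123456789012:demo-queue-dlq'                     # hypothetical

# After a message has been received 5 times without being deleted, move it to the DLQ
sqs.set_queue_attributes(
    QueueUrl=source_queue_url,
    Attributes={
        'RedrivePolicy': json.dumps({
            'deadLetterTargetArn': dlq_arn,
            'maxReceiveCount': '5'
        })
    }
)
```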
Amazon Simple Notification Service (SNS)
The following is the summary of the various features/capabilities of Amazon SNS:
Is a managed service that provides a PUSH based message delivery from publishers to subscribers
Publishers communicate asynchronously with subscribers by sending messages to a Topic
A topic is a logical access point that acts as a communication channel
There can be many subscribers listening on a topic
Each subscriber of a topic will receive all the messages
When a publisher sends a message to a topic, the message is pushed to all the subscribers on that topic
There can be up to 12.5 million subscribers per topic
There can be up to 100 K topics per account
One can subscribe to a topic and receive published messages using a supported endpoint type, such as SQS, Lambda, Kinesis Data Firehose (NOT Kinesis Data Streams), HTTP/HTTPS, email, mobile text messages (SMS), etc
Can receive messages from various AWS services, such as CloudWatch Alarm, S3 Bucket, Lambda, DynamoDB, etc
A subscriber can do message filtering on a topic using a JSON based Filter Policy that defines the filtering criteria for the message
For encryption in-flight, one can use the HTTPS API
For encryption at-rest, it is enabled by default and uses an AWS created/managed key
For controlling access to SNS service, one can leverage either the IAM Policies or the SNS Access Policies (managed in SNS)
The published messages are NOT persisted in the topic
Integrates with SQS for the FAN-OUT architecture pattern with no data loss (see the sketch at the end of this section)
Cross-region message delivery possible from SNS in one Region to SQS in another Region
FIFO Topic
First In First Out (FIFO) topic that guarantees the ordering of messages
Ordering by Message Group ID (all messages in a group are ordered)
Exactly once delivery semantics (using Message Deduplication ID)
Limited throughput of 300 messages per sec
Useful when we need an event-driven solution that can deliver messages asynchronously to many consumers without any strong delivery guarantees
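The following is a minimal boto3 sketch of the fan-out pattern with a filter policy: an SQS queue subscribes to a topic and only receives messages whose 'status' attribute matches the policy; the topic name and queue ARN are hypothetical placeholders (the queue's access policy must also allow SNS to send to it):

```python
import json

import boto3

sns = boto3.client('sns')

# Create (or look up) the topic
topic_arn = sns.create_topic(Name='order-events')['TopicArn']   # hypothetical topic name

# Fan-out: subscribe an SQS queue to the topic with a JSON filter policy
sns.subscribe(
    TopicArn=topic_arn,
    Protocol='sqs',
    Endpoint='arn:aws:sqs:us-east-1:123456789012:order-queue',  # hypothetical queue ARN
    Attributes={
        # This subscriber only receives messages whose 'status' attribute is CREATED
        'FilterPolicy': json.dumps({'status': ['CREATED']})
    }
)

# Publish with a message attribute that the filter policy can match on
sns.publish(
    TopicArn=topic_arn,
    Message=json.dumps({'order_id': '1001'}),
    MessageAttributes={
        'status': {'DataType': 'String', 'StringValue': 'CREATED'}
    }
)
```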
AWS Step Functions
The following is the summary of the various features/capabilities of Step Functions:
Is a serverless orchestration service that lets one integrate with Lambda functions and other AWS services to build business-critical workflow applications
A Workflow is nothing more than a state machine, where each step in a workflow is called a State
A Task is a state that represents a unit of work that a Lambda function or an AWS service performs
Allows one to build a distributed workflow application as a series of event-driven steps and see the visual workflow via the graphical console
Workflow has features for - sequential tasks, parallel tasks, branching, conditions, error handling, timeouts, human approval, etc
Can integrate with many AWS services (EC2, ECS, API Gateway, SQS, etc) as well as on-prem services
An Execution is an instance of a running workflow to perform the series of tasks
There are two workflow types - Standard Workflows AND Express Workflows
Standard Workflows
Exactly-once workflow execution (with retries) and can run for up to one year
Used for long-running, auditable workflows which show execution history and visual debugging
Supports an execution start rate of over 2000 per sec
Supports a state transition rate of over 4000 per sec
Cost per state transition
Express Workflows
At-least-once workflow execution and can run for up to 5 mins
Used for high-event rate workloads such as streaming data processing
Supports an execution start rate of over 100000 per sec
Supports a nearly UNLIMITED state transition rate
Cost based on the number of executions, their duration, and memory consumed
Sends execution history to CloudWatch
There are two types of Express Workflows - Asynchronous AND Synchronous
Asynchronous Express Workflows
Return confirmation once the workflow starts and does not wait for the workflow to complete
One must poll the workflow to get the result
Triggered by an event or by calling the StartExecution API
Synchronous Express Workflows
Start the workflow and wait for the completion and the returned result
Invoked from API Gateway, a Lambda function, or by calling the StartSyncExecution API (see the sketch below)
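The following is a minimal boto3 sketch of starting a workflow execution asynchronously (StartExecution) and synchronously (StartSyncExecution, for an Express workflow); the state machine ARN and input are hypothetical placeholders:

```python
import json

import boto3

sfn = boto3.client('stepfunctions')

state_machine_arn = 'arn:aws:states:us-east-1:123456789012:stateMachine:order-flow'  # hypothetical

# Standard (or asynchronous Express) workflow: returns as soon as the execution starts
response = sfn.start_execution(
    stateMachineArn=state_machine_arn,
    input=json.dumps({'order_id': '1001'})
)
print(response['executionArn'])

# Synchronous Express workflow: waits for completion and returns the result
sync_response = sfn.start_sync_execution(
    stateMachineArn=state_machine_arn,   # must reference an Express state machine
    input=json.dumps({'order_id': '1001'})
)
print(sync_response['status'], sync_response.get('output'))
```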
Amazon EventBridge
The following is the summary of the various features/capabilities of EventBridge:
Previously was referred to as CloudWatch Events
Is a serverless event bus that routes events to connect application components together, making it easier for one to build scalable event-driven applications
An Event Bus is a router that receives events from Event Sources and delivers them to zero or more Event Targets (Lambda functions, SNS, Kinesis Data Stream, etc)
One can create a custom event bus for sending events from services and applications in an AWS account
Event Sources can be AWS services, custom applications, or third-party SaaS applications
State changes in the event sources are sent as events to the event bus, where they are matched against Rules before being routed to the configured event targets (see the sketch at the end of this section)
Provides simple and consistent ways to ingest, filter, transform, and deliver events which allows one to build and test applications quickly
Are well-suited for routing events from many sources to many targets, with optional transformation of events prior to delivery to a target
The default event bus (created for an AWS account) receives events from the AWS services
Support for Archives to back up events for future replay during testing
Support for Event Schema Registry
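The following is a minimal boto3 sketch of a rule on the default event bus routing EC2 state-change events to an SNS target, plus a custom application publishing its own event to a custom event bus; the rule name, target ARN, and custom bus name are hypothetical placeholders:

```python
import json

import boto3

events = boto3.client('events')

# Rule on the default event bus matching EC2 instance state-change events
events.put_rule(
    Name='ec2-state-change',                       # hypothetical rule name
    EventPattern=json.dumps({
        'source': ['aws.ec2'],
        'detail-type': ['EC2 Instance State-change Notification']
    })
)

# Route the matched events to an SNS topic target
events.put_targets(
    Rule='ec2-state-change',
    Targets=[{
        'Id': 'notify-ops',
        'Arn': 'arn:aws:sns:us-east-1:123456789012:ops-alerts'   # hypothetical topic ARN
    }]
)

# A custom application publishing its own event to a (pre-created) custom event bus
events.put_events(Entries=[{
    'EventBusName': 'orders-bus',                  # hypothetical custom event bus
    'Source': 'com.example.orders',
    'DetailType': 'OrderCreated',
    'Detail': json.dumps({'order_id': '1001'})
}])
```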
Amazon API Gateway
The following is the summary of the various features/capabilities of the API Gateway:
Is a service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale
Enables stateless HTTP-based client-server communication to the backend HTTP or RESTful API services
Enables stateful full-duplex communication between client and server using the WebSocket protocol
Can be used to expose backend HTTP endpoints, Lambda functions, or other AWS services
Can integrate with Lambda functions (through a Lambda Proxy) to create a fully managed serverless API service (a minimal proxy handler is sketched at the end of this section)
Can handle multiple environments (dev, test, prod)
Can handle security (authentication and authorization)
Support for rate limiting via request throttling. By default, limits the steady-state rate to 10000 requests per sec with a burst of 5000 requests across all APIs in an AWS account
Support for Swagger/OpenAPI standards to define APIs
Can transform and validate requests and responses
Support for caching API responses (with a TTL)
Support for Usage Plans for the end users using API Keys (allows for custom throttling based on the API keys)
The maximum integration timeout is 29 secs
The following are the supported API Gateway Deployment Types:
Edge-Optimized
The default hostname of an API Gateway that is deployed to the specified Region while using CloudFront to facilitate client access typically from across AWS Regions
Useful when the clients are global
Requests come through CloudFront edge locations
API Gateway still in a single Region
Regional
The host name of an API that is deployed to the specific Region and intended to serve clients, such as EC2 instances, in the same AWS Region
Useful when the clients are in a Region
Can manually combine with CloudFront (for caching and distribution)
Private
An API endpoint that is exposed through interface VPC endpoints and allows a client to securely access private API resources inside a VPC
Can only be accessed from a VPC using the interface VPC endpoint (via ENI)
One needs to configure a resource policy to control access
The following are few points related to API Gateway Security:
User Authentication
Via IAM Roles - useful for internal applications
Using Amazon Cognito - identity for external users
Via a Custom Authorizer - for custom logic
Custom Domain Name (HTTPS)
Must setup a CNAME or an Alias in Route 53
If using Edge-Optimized endpoint, the certificate MUST be in us-east-1
If using Regional endpoint, the certificate MUST be in the same Region as the API Gateway
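The following is a minimal sketch of a Lambda function handler behind an API Gateway Lambda Proxy integration: the incoming event carries the HTTP request details and the returned dictionary maps directly to the HTTP response:

```python
import json

def handler(event, context):
    # With the proxy integration, query string parameters arrive in the event (may be None)
    params = event.get('queryStringParameters') or {}
    name = params.get('name', 'world')

    # The returned statusCode, headers, and body become the HTTP response
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'message': f'hello {name}'})
    }
```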
Amazon Kinesis Data Streams
The following is the summary of the various features/capabilities of Kinesis Data Streams:
Used for collecting and processing LARGE streams of data records in real-time
One can use it for rapid and continuous data intake and aggregation
Type of data can be infrastructure logs, application logs, IoT feeds, web clickstream, etc
It is a collection of one or more Shards, where each shard is a uniquely identified sequence of data records with a fixed unit of capacity
A Data Record is the unit of data stored and is composed of a sequence number, a partition key, and a data blob (immutable sequence of bytes)
Data records with the same partition key go into the SAME shard, ensuring ordering within a shard
The total capacity of a data stream is the sum of the capacities of its shards in terms of ingestion and consumption rates
Producers send data records into a data stream
Data records can be written into a shard at a rate of up to 1000 records per sec or 1 MB per sec per shard (see the sketch at the end of this section)
There can be MANY consumers (processing concurrently) receiving the data records from a data stream
The following are two modes of data consumption:
Standard - Consumers have to pull data
Enhanced Fan-out - Data is pushed to consumers
Consumers can receive data records at a rate of up to 2 MB per sec per shard, shared across all consumers in the standard mode, OR 2 MB per sec per shard per consumer in the enhanced fan-out mode
Retention period is from 1 day (default) up to 365 days
Latency is around 200 ms (real-time)
Ability to replay (or reprocess) previously processed data records (in the same order)
The following are the two supported Capacity Modes:
Provisioned Mode
Choose the number of shards and scale them manually or using API
Each shard has a write rate limit of 1000 records per sec or 1 MB per sec
Each shard has a read rate limit of 2 MB per sec
Pay per shard per hour
On-demand Mode
No need to provision or manage capacity
Default capacity has a write rate limit of 200000 records per sec or 200 MB per sec
Default capacity has a read rate limit of 400 MB per sec for up to 2 consumers in the standard mode OR up to 20 consumers in the enhanced fan-out mode
Automatic scaling based on the observed throughput peak over the last 30 days period
Pay per stream per hour PLUS data in/out per GB
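The following is a minimal boto3 sketch of a producer putting a record into a stream and a standard (pull) consumer reading from a shard; the stream name and shard id are hypothetical placeholders:

```python
import boto3

kinesis = boto3.client('kinesis')
stream_name = 'demo-clickstream'   # hypothetical stream name

# Producer: records with the same partition key always land in the same shard (ordered)
kinesis.put_record(
    StreamName=stream_name,
    Data=b'{"user": "u-42", "page": "/home"}',
    PartitionKey='u-42'
)

# Standard (pull) consumer: read from the beginning of a shard
shard_iterator = kinesis.get_shard_iterator(
    StreamName=stream_name,
    ShardId='shardId-000000000000',          # hypothetical shard id
    ShardIteratorType='TRIM_HORIZON'
)['ShardIterator']

response = kinesis.get_records(ShardIterator=shard_iterator, Limit=100)
for record in response['Records']:
    print(record['SequenceNumber'], record['Data'])
```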
Amazon Kinesis Data Firehose
The following is the summary of the various features/capabilities of Kinesis Data Firehose:
Is a fully managed, auto scaled service for delivering NEAR real-time streaming data to destinations such as any custom HTTP endpoint, third-party services like Datadog, Splunk, etc, OR other AWS services such as S3, Redshift (via S3), OpenSearch etc
Buffers incoming streaming data in memory to a certain size or for a certain period of time before delivering it to destinations
The data record sent into data firehose can be up to 1 MB in size
Producers send data records into a data firehose (see the sketch at the end of this section)
Producers can be any custom application, Kinesis Data Streams, CloudWatch, etc
Can perform data transformations on the data records using Lambda functions before delivering to destinations
Support for many data formats, data conversions, and data compression
The data records sent to data firehose are NOT persisted or stored
Data can be written to destinations in batches via buffering (default of 300 secs buffer interval OR 5 MB of buffer size)
Data records that fail to reach a destination can be written to an S3 backup bucket for analysis
Pay only for the data going through data firehose
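The following is a minimal boto3 sketch of a producer sending a record into a delivery stream; the delivery stream name is a hypothetical placeholder and is assumed to be configured with an S3 destination:

```python
import boto3

firehose = boto3.client('firehose')

# Send a record (up to 1 MB); Firehose buffers the data (by size or time)
# before delivering it to the configured destination (e.g. S3)
firehose.put_record(
    DeliveryStreamName='demo-clickstream-to-s3',        # hypothetical delivery stream
    Record={'Data': b'{"user": "u-42", "page": "/home"}\n'}
)
```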
References
Official AWS SQS Documentation
Official AWS SNS Documentation
Official AWS Step Functions Documentation
Official AWS EventBridge Documentation
Official AWS API Gateway Documentation