PolarSPARC |
AWS Block and File Storage - Quick Notes
Bhaskar S | 12/14/2023 |
Elastic Block Store (EBS)
Elastic Block Store (or EBS for short) provides block-level storage volumes that can be attached to an EC2 instance over the network.
The following are some of the notable points of EBS:
Behave like raw, unformatted block devices
Can be Boot volumes
When attached to an Instance, they are exposed as storage Volumes that persist independently from the life of the Instance
One can dynamically change the configuration of a volume attached to an Instance
Suitable for use as the primary storage for use-cases (file systems, databases) that require fine granular access to raw, unformatted, block-level storage
By default, can be attached to ONLY a single Instance at a time (the Multi-Attach option on Provisioned IOPS volumes is the exception)
Bound to a specific Availability Zone in a Region
Multiple volumes can be attached to an Instance
Automatically replicated within an Availability Zone (not another Availability Zone) to protect from failures
Provides seamless support for data encryption, both data-at-rest and data-in-transit between Instances and volumes
Volumes automatically detach from an Instance with their data intact when the Instance terminates and the Delete on Termination option is NOT set
By default the Delete on Termination option is enabled for the root volume and NOT for additional volumes
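As a rough illustration of the Delete on Termination option, the sketch below builds the block device entries one might pass when launching an Instance. It only constructs plain dictionaries and makes no AWS call. The field names follow the EC2 API's `BlockDeviceMappings` request parameter; the helper name, device names, and sizes are assumptions chosen for the example.

```python
def block_device_mapping(device_name, size_gb, volume_type="gp3",
                         delete_on_termination=False):
    """Return one entry for the EC2 `BlockDeviceMappings` request parameter.

    Hypothetical helper: it only builds a dictionary, nothing is sent to AWS.
    """
    return {
        "DeviceName": device_name,
        "Ebs": {
            "VolumeSize": size_gb,  # size in GB
            "VolumeType": volume_type,
            "DeleteOnTermination": delete_on_termination,
        },
    }


# Root volume: deleted together with the Instance (mirrors the default).
root = block_device_mapping("/dev/xvda", 8, delete_on_termination=True)

# Additional data volume: kept after termination, data intact.
data = block_device_mapping("/dev/sdf", 100)

mappings = [root, data]
```

With these settings, terminating the Instance would remove the root volume and leave the 100 GB data volume detached and available to attach to another Instance in the same Availability Zone.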
The following are the various EBS volume types:
General Purpose SSD (gp3)
SSD-backed volumes which balance price and performance for a broad range of transactional workloads
Can be attached ONLY to a single Instance
Latest generation offering with the lowest cost (20 percent lower than gp2)
Allows one to scale volume performance independently of volume size
Provides single-digit millisecond latency and 99.8 percent to 99.9 percent volume durability
Volume sizes can range from 1 GB to 16 TB
Consistent baseline IOPS performance of 3000 IOPS
Maximum of up to 16000 IOPS at a ratio of 500 IOPS per GB of volume size (reached by volumes of size 32 GB or larger)
Consistent baseline Throughput of 125 MBps
Maximum Throughput of 1000 MBps at a ratio of 0.25 MBps per IOPS
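The gp3 limits above combine a baseline, a ratio, and a cap, which is easier to see as arithmetic. The following is a minimal sketch that uses only the figures quoted in these notes; the function and constant names are hypothetical.

```python
GP3_BASELINE_IOPS = 3000
GP3_MAX_IOPS = 16000
GP3_IOPS_PER_GB = 500

GP3_BASELINE_MBPS = 125
GP3_MAX_MBPS = 1000
GP3_MBPS_PER_IOPS = 0.25


def gp3_max_iops(size_gb):
    """Highest IOPS that can be provisioned for a gp3 volume of this size."""
    by_ratio = size_gb * GP3_IOPS_PER_GB
    # Never below the baseline, never above the per-volume cap.
    return max(GP3_BASELINE_IOPS, min(GP3_MAX_IOPS, by_ratio))


def gp3_max_throughput(iops):
    """Highest throughput (MBps) that can be provisioned for the given IOPS."""
    by_ratio = iops * GP3_MBPS_PER_IOPS
    return max(GP3_BASELINE_MBPS, min(GP3_MAX_MBPS, by_ratio))
```

For example, `gp3_max_iops(10)` gives 5000, `gp3_max_iops(32)` reaches the 16000 cap, and `gp3_max_throughput(4000)` reaches the 1000 MBps cap, so a volume needs at least 4000 provisioned IOPS before the maximum throughput is available.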
Provisioned IOPS SSD (io1)
Highest performance SSD volume for IOPS intensive and Throughput intensive workloads
Deliver the provisioned IOPS performance 99.9 percent of the time
Typical for database workloads that are sensitive to storage performance and consistency
Provides 99.8 percent to 99.9 percent volume durability
Volume sizes can range from 4 GB to 16 TB
Can provision from 100 IOPS up to 64000 IOPS per volume
Maximum ratio of provisioned IOPS to requested volume size (in GB) is 50:1
Supports Multi-Attach option on this volume which allows it to be attached to up to 16 Instances in the same Availability Zone
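The io1 size range, IOPS range, and 50:1 ratio interact, so a request can be within each range and still be invalid. The sketch below checks a requested size and IOPS value against the figures in these notes. The function name is hypothetical, and 16 TB is taken as 16 * 1024 GB, which is an assumption about units.

```python
IO1_MIN_GB = 4
IO1_MAX_GB = 16 * 1024  # 16 TB expressed in GB (assumes 1 TB = 1024 GB)
IO1_MIN_IOPS = 100
IO1_MAX_IOPS = 64000
IO1_IOPS_PER_GB = 50


def validate_io1(size_gb, iops):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if not IO1_MIN_GB <= size_gb <= IO1_MAX_GB:
        problems.append(f"size {size_gb} GB is outside {IO1_MIN_GB}-{IO1_MAX_GB} GB")
    if not IO1_MIN_IOPS <= iops <= IO1_MAX_IOPS:
        problems.append(f"{iops} IOPS is outside {IO1_MIN_IOPS}-{IO1_MAX_IOPS}")
    if iops > size_gb * IO1_IOPS_PER_GB:
        problems.append(
            f"{iops} IOPS exceeds the 50:1 limit of {size_gb * IO1_IOPS_PER_GB} "
            f"for a {size_gb} GB volume"
        )
    return problems
```

A 100 GB volume can therefore be provisioned with at most 5000 IOPS: `validate_io1(100, 5000)` returns an empty list, while `validate_io1(100, 6000)` reports the ratio violation.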
Provisioned IOPS SSD (io2) Block Express
Built for Instances that run on the Nitro System (bare metal capabilities that are built for high performance, high availability, and high security)
With the highest durability and lowest latency, ideal for running performance-intensive, mission-critical workloads
Provides 99.999 percent volume durability
Volume sizes can range from 4 GB to a maximum of 64 TB
Maximum of 256000 IOPS per volume with an IOPS:GB ratio of 1000:1
Supports Multi-Attach option on this volume which allows it to be attached to up to 16 Instances in the same Availability Zone
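One practical consequence of the 1000:1 ratio is that io2 Block Express volumes can be much smaller for a given IOPS target. The sketch below, with hypothetical names and only the figures from these notes, computes the smallest volume size that permits a target IOPS value.

```python
import math

IO2_MIN_GB = 4
IO2_MAX_IOPS = 256000
IO2_IOPS_PER_GB = 1000


def io2_min_size_gb(target_iops):
    """Smallest io2 Block Express volume (GB) that allows `target_iops`."""
    if target_iops > IO2_MAX_IOPS:
        raise ValueError(f"{target_iops} IOPS exceeds the per-volume maximum")
    # Round up so the ratio is never exceeded, and respect the minimum size.
    return max(IO2_MIN_GB, math.ceil(target_iops / IO2_IOPS_PER_GB))
```

For instance, `io2_min_size_gb(64000)` is 64 GB, whereas the 50:1 ratio listed for io1 would require 1280 GB for the same 64000 IOPS.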
Throughput Optimized HDD (st1)
Uses low-cost magnetic HDD storage that is designed for frequently accessed, throughput-intensive workloads
Performance defined in terms of Throughput rather than IOPS
Cannot be a bootable volume
NO support for Multi-Attach
Good for the processing needs of Big Data, Data Warehouse type workloads
Maximum volume Throughput of 500 MBps
Volume sizes can range from 125 GB to 16 TB
Cold HDD (sc1)
The LOWEST cost magnetic HDD storage designed for less frequently accessed workloads
Performance defined in terms of Throughput rather than IOPS
Cannot be a bootable volume
NO support for Multi-Attach
Volume sizes can range from 125 GB to 16 TB
Maximum volume Throughput of 250 MBps
Designed for less frequently accessed workloads such as data archiving
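The volume types above differ mainly in media, boot support, Multi-Attach support, and maximum size, so choosing one is a filtering exercise. The following sketch encodes those attributes from the notes and filters on them. The class and function names are hypothetical, and marking the SSD types as bootable is an inference from the general "Can be Boot volumes" point, since only st1 and sc1 are listed as non-bootable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeType:
    name: str
    media: str
    bootable: bool
    multi_attach: bool
    max_size_tb: int


CATALOG = [
    VolumeType("gp3", "SSD", True, False, 16),
    VolumeType("io1", "SSD", True, True, 16),
    VolumeType("io2 Block Express", "SSD", True, True, 64),
    VolumeType("st1", "HDD", False, False, 16),
    VolumeType("sc1", "HDD", False, False, 16),
]


def candidates(need_boot=False, need_multi_attach=False, min_size_tb=0):
    """Names of the volume types that satisfy every stated requirement."""
    return [
        v.name
        for v in CATALOG
        if (v.bootable or not need_boot)
        and (v.multi_attach or not need_multi_attach)
        and v.max_size_tb >= min_size_tb
    ]
```

Asking for `candidates(need_multi_attach=True)` leaves only the two Provisioned IOPS types, and `candidates(min_size_tb=32)` leaves only io2 Block Express.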
EBS Snapshots
An EBS Snapshot is a point-in-time copy of an EBS volume that is stored in AWS S3.
The following are some of the features:
The first snapshot is a FULL backup of the volume. Subsequent snapshots are INCREMENTAL backups (only the blocks that have changed since the previous snapshot)
Contains all of the information that is needed to restore to a new volume
Allows one to create an identical volume in another Availability Zone in same Region
Allows one to COPY a snapshot to a different Region with Encryption
For an encrypted volume, the encryption state is retained in the snapshot
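The full-then-incremental behaviour can be pictured with a toy model in which a volume is a dictionary of block number to contents. This is an illustrative sketch only, not a description of how AWS stores snapshots internally; the function names are hypothetical and deleted blocks are ignored for simplicity.

```python
def take_snapshot(volume, previous_state=None):
    """Capture a full copy the first time, then only the changed blocks."""
    if previous_state is None:
        return dict(volume)  # first snapshot: every block
    return {
        block: data
        for block, data in volume.items()
        if previous_state.get(block) != data  # new or modified blocks only
    }


def restore(snapshots):
    """Rebuild a volume by layering snapshots, oldest first."""
    volume = {}
    for snapshot in snapshots:
        volume.update(snapshot)
    return volume


volume = {0: "boot", 1: "aaa", 2: "bbb"}
snap1 = take_snapshot(volume)          # full: 3 blocks
state = dict(volume)                   # remember what snap1 captured

volume[1] = "AAA"                      # modify one block
volume[3] = "ccc"                      # add one block
snap2 = take_snapshot(volume, state)   # incremental: 2 blocks

restored = restore([snap1, snap2])
```

Here `snap2` holds only blocks 1 and 3, yet layering it over `snap1` reproduces the current volume exactly.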
Data Lifecycle Manager
AWS Data Lifecycle Manager (DLM) allows one to automate the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs.
The following are some of the features:
Helps protect valuable data by enforcing a regular backup schedule
Helps create standardized AMIs that can be refreshed at regular intervals
Helps reduce the storage costs by deleting old backups
Helps retain backups as required by compliance and regulations
Helps create disaster recovery backup policies that backup data to other Regions or accounts
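The retention and deletion behaviour can be illustrated with a simple count-based rule: keep the newest N snapshots and delete the rest. The sketch below is a toy model with hypothetical names and made-up dates; it does not use the DLM service, where such rules are defined in a lifecycle policy.

```python
from datetime import date, timedelta


def apply_retention(snapshot_dates, retain_count):
    """Split dates into (kept, deleted), keeping the newest `retain_count`."""
    ordered = sorted(snapshot_dates, reverse=True)  # newest first
    return ordered[:retain_count], ordered[retain_count:]


# Ten daily snapshots starting on an arbitrary example date.
start = date(2023, 12, 1)
daily = [start + timedelta(days=i) for i in range(10)]

kept, deleted = apply_retention(daily, retain_count=7)
```

With ten daily snapshots and a retention count of seven, the three oldest snapshots fall out of the retention window and would be deleted, which is how old backups stop accumulating storage cost.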
Elastic File System
AWS Elastic File System (EFS) is a fully managed and elastic file storage that lets one share file data without provisioning or managing storage capacity and performance. It is essentially a Network File System (NFS) storage for Linux ONLY workloads.
The following are some of the features:
Can be mounted to many EC2 Instances in many Availability Zones within a Region
It is highly available, durable, scalable, and expensive storage
Pay per use (in GB) and scales automatically
Can grow to Petabytes automatically based on usage
One can control access via Security Group
Offers the ability to encrypt data in transit and at rest
Encryption at rest MUST be configured at creation time
There are two modes of operation, as follows. The Performance Mode MUST be set at creation time, while the Throughput Mode can be changed later:
EFS Performance Mode
General Purpose (default) - for latency sensitive use-cases (ex: web server, cms, etc)
Max I/O - higher throughput, highly parallel (ex: big data processing)
EFS Throughput Mode
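Bursting (default) - throughput scales with the amount of data stored: each 1 TB provides a baseline of 50 MBps and can burst up to 100 MBps
One Zone - stores data in only one Availability Zone, with backup for redundancy (ex: dev workloads). Note that this is a storage redundancy choice rather than a throughput mode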
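The Bursting figures above can be read as a per-TB ratio. The following minimal sketch applies that ratio to an amount of stored data; the function name is hypothetical, and it deliberately ignores any minimum throughput floors or burst credit accounting, which these notes do not cover.

```python
EFS_BASELINE_MBPS_PER_TB = 50
EFS_BURST_MBPS_PER_TB = 100


def efs_bursting_throughput(stored_tb):
    """Return (baseline_mbps, burst_mbps) for the amount of data stored."""
    baseline = stored_tb * EFS_BASELINE_MBPS_PER_TB
    burst = stored_tb * EFS_BURST_MBPS_PER_TB
    return baseline, burst
```

Under this reading, a file system holding 2 TB has a baseline of 100 MBps and can burst to 200 MBps, so throughput grows as the file system grows.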
FSx
AWS FSx allows one to launch high-performance, 3rd-party file systems on AWS. It is a fully managed service that handles the storage provisioning, patching, and backups.
There are four FSx options to choose from as follows:
FSx for Lustre - for compute-intensive, high-performance file system that can be used for Machine Learning and High Performance Computing workloads. It allows for parallel and distributed processing of 'hot data' AND it integrates with S3 to provide 'cold data' storage with quick access
FSx for OpenZFS - for the popular OpenZFS file system that is compatible with the NFS protocol
FSx for NetApp ONTAP - for NetApp's popular ONTAP file system that is compatible with NFS, SMB, and iSCSI protocols
FSx for Windows File Server - for native Windows NTFS filesystem. It also supports SMB protocol and hence can be mounted on Linux EC2 Instances. Has support for Multi-AZ
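Since the FSx options differ in the protocols they expose, a protocol requirement narrows the choice quickly. The sketch below is a small lookup built only from the protocols named in these notes; the names are hypothetical, and FSx for Lustre is left out because no protocol is listed for it here.

```python
FSX_PROTOCOLS = {
    "FSx for OpenZFS": {"NFS"},
    "FSx for NetApp ONTAP": {"NFS", "SMB", "iSCSI"},
    "FSx for Windows File Server": {"SMB"},
}


def fsx_options_for(protocol):
    """FSx options (sorted by name) that support the given protocol."""
    wanted = protocol.upper()  # compare case-insensitively
    return sorted(
        name
        for name, protocols in FSX_PROTOCOLS.items()
        if wanted in {p.upper() for p in protocols}
    )
```

For example, `fsx_options_for("smb")` returns both the NetApp ONTAP and Windows File Server options, while `fsx_options_for("iscsi")` returns only NetApp ONTAP.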
AWS Storage Gateway
AWS Storage Gateway is a service that connects on-prem storage with cloud-based storage to provide seamless and secure integration.
The following are some of the features of Storage Gateway:
Bridges on-prem data with cloud data
Can be useful for backup/restore, disaster recovery
There are four types of Storage Gateways - S3 File Gateway, FSx File Gateway, Volume Gateway, Tape Gateway
S3 File Gateway makes the AWS S3 buckets accessible via the NFS or SMB protocols for the on-prem applications and caches the most recently used files
FSx File Gateway provides native access to FSx for Windows File Server on the AWS cloud to the SMB clients on-prem and caches the most recently used files
Volume Gateway provides an iSCSI interface for AWS S3 and allows for snapshots of on-prem volumes to the AWS cloud
Tape Gateway mimics the tape backup interface and allows the backup of on-prem data using leading vendor backup software to AWS S3 in the AWS cloud