PolarSPARC |
Essential Cloud Infrastructure: Foundation - Summary Notes - Part 2
Bhaskar S | 10/20/2019 |
Virtual Machines
Virtual Machines (VMs) are the most common infrastructure component in GCP and are provided by Compute Engine. A VM is similar, but not identical, to a hardware computer. A VM consists of a virtual CPU, some amount of memory, disk storage, and an IP address
The following illustration shows Compute Engine (in green) along with other processing options:
A Micro VM shares a CPU with other virtual machines, so one can get a VM with less capacity at a lower cost
Some VMs offer a burst capability meaning that the virtual CPU will run above its rated capacity for a brief period using the available shared physical CPU
The primary use-case for Compute Engine is any general workload, especially an enterprise application that was designed to run on a server infrastructure
When a user creates an instance in a Zone, their instance will use the default processor supported in that Zone
The following illustration shows the features of Compute Engine:
The following illustration shows the effect of CPU on Compute Engine:
The choice of CPU in Compute Engine affects the network throughput. Specifically, the network scales at 2 gigabits per second for each vCPU, up to a theoretical maximum throughput of 16 gigabits per second, which is reached with 8 vCPUs
Each Virtual CPU (vCPU) is implemented as a single hardware hyper-thread on one of the available CPU platforms
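The network throughput rule described above can be written as a small calculation. This is a minimal sketch, assuming a rate of 2 gigabits per second per vCPU and a cap of 16 gigabits per second; the function and constant names are illustrative only, and actual limits may differ by Machine Type.

```python
GBPS_PER_VCPU = 2       # per-vCPU rate quoted in these notes
MAX_NETWORK_GBPS = 16   # theoretical cap, assumed to be in gigabits per second


def max_network_gbps(vcpus: int) -> int:
    """Return the theoretical maximum network throughput in Gbps for a VM."""
    if vcpus < 1:
        raise ValueError("a VM needs at least one vCPU")
    # Throughput grows linearly with vCPUs until the cap is reached.
    return min(vcpus * GBPS_PER_VCPU, MAX_NETWORK_GBPS)


for n in (1, 4, 8, 16):
    print(n, "vCPUs ->", max_network_gbps(n), "Gbps")
```

With these assumptions, adding vCPUs beyond 8 does not raise the theoretical network ceiling any further.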
The following illustration shows the effect of Storage on Compute Engine:
SSDs are designed to give a higher number of IOPS per dollar, whereas standard disks give a higher amount of capacity per dollar
Local SSDs have even higher throughput and lower latency than SSD Persistent Disks because they are physically attached to the hardware that hosts the VM
The data stored on the local SSDs persists only until the VM instance is stopped or deleted
A local SSD is typically used as a swap disk
A user can create instances with up to 8 separate 375 gigabyte local SSD partitions for a total of 3 terabytes of local SSD space for each instance
Standard and non-local SSDs can be sized up to 64 terabytes for each instance
The performance of these disks scales with each gigabyte of space allocated
The following illustration shows the effect of Networking on Compute Engine:
One can do Regional HTTP Load Balancing and Network Load Balancing
A Load Balancer is essentially a set of rules applied to traffic coming into the Google network; the VPC applies those rules to traffic destined for the user's IP address subnet range
The following illustration shows the VM access options:
The creator of a VM instance has full root privileges on that instance
On a Linux instance, the creator has SSH capability and can use the GCP Console to grant SSH capability to other users
On a Windows instance, the creator can use the GCP Console to generate a username and password. After that, anyone who knows the username and password can connect to the instance using a Remote Desktop Protocol (RDP) client
The following illustration shows the VM lifecycle:
When a user defines all the properties for an instance and creates the instance, the instance enters the Provisioning state. In this state the resources such as CPU, memory, and disk are being reserved for the instance, but the instance itself is not running yet
Next, the instance moves to the Staging state, where resources have been acquired and the instance is prepared for launch. Specifically in this state, Compute Engine is adding IP addresses, booting up the system image, and booting up the system
Next, the instance moves to the Running state, where it will go through pre-configured startup scripts and enable SSH or RDP access
A user can do several things while their instance is running. For example, they can Live Migrate their VM to another host in the same Zone instead of requiring their instance to be stopped and restarted. This allows GCP to perform maintenance that is integral to keeping the infrastructure protected and reliable without interrupting any of the user VMs
While a user's instance is Running, they can also move their VM to a different Zone, take a Snapshot of the VM's Persistent Disk(s), export the system image, or reconfigure metadata
Some actions, such as upgrading the VM by adding more CPU, require one to stop the VM. While the instance is being stopped, it will go through pre-configured shutdown scripts and end in the Terminated state. From this state, one can choose to either restart the instance, which brings it back to the Provisioning state, or delete it. One also has the option to reset a VM, which is similar to pressing the reset button on a computer. This action wipes the memory content of the machine and resets the VM to its initial state. The instance remains in the Running state throughout the reset
The following illustration shows the options for changing the VM state from Running:
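The lifecycle described above can be modelled as a small state machine. This is only a sketch of the transitions mentioned in these notes: the action names, the intermediate STOPPING state, and the DELETED marker are assumptions made for illustration, not Compute Engine API values.

```python
from enum import Enum


class State(Enum):
    PROVISIONING = "PROVISIONING"
    STAGING = "STAGING"
    RUNNING = "RUNNING"
    STOPPING = "STOPPING"      # assumed intermediate state while shutdown scripts run
    TERMINATED = "TERMINATED"
    DELETED = "DELETED"        # hypothetical marker; a deleted instance no longer exists


# (current state, action) -> next state; action names are illustrative
TRANSITIONS = {
    (State.PROVISIONING, "resources_acquired"): State.STAGING,
    (State.STAGING, "booted"): State.RUNNING,
    (State.RUNNING, "reset"): State.RUNNING,           # reset keeps the VM Running
    (State.RUNNING, "stop"): State.STOPPING,
    (State.STOPPING, "shutdown_complete"): State.TERMINATED,
    (State.TERMINATED, "start"): State.PROVISIONING,   # restart goes back to Provisioning
    (State.TERMINATED, "delete"): State.DELETED,
}


def next_state(state: State, action: str) -> State:
    """Return the state reached by applying an action, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not valid in state {state.name}") from None


state = State.PROVISIONING
for action in ("resources_acquired", "booted", "reset", "stop", "shutdown_complete"):
    state = next_state(state, action)
    print(action, "->", state.name)
```

Encoding the transitions as a lookup table makes it easy to see that a reset never leaves the Running state, while a stop always passes through the shutdown phase before reaching Terminated.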
If a user is restarting, rebooting, stopping, or even deleting an instance, the shutdown process will take about 90 seconds
For a Preemptible VM, if the instance is not stopped after 30 seconds, Compute Engine sends an ACPI G3 mechanical off signal to the operating system
The following illustration shows the VM availability policies:
The default maintenance behavior for instances is to Live Migrate. One can change the behavior to terminate the instance during maintenance events instead
If a VM is terminated due to a crash or other maintenance event, the instance is automatically restarted by default, but this can also be changed
The VM availability policies can be configured both during the instance creation and while an instance is running by configuring the Automatic Restart and On Host Maintenance options
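The two availability options above can be pictured as a small configuration fragment. The sketch below builds a dictionary whose field names (onHostMaintenance, automaticRestart) and values (MIGRATE, TERMINATE) follow the scheduling section of the Compute Engine instance resource as commonly documented; treat them as an assumption and check the current API reference before relying on them. The helper function itself is hypothetical.

```python
def scheduling_policy(live_migrate: bool = True, automatic_restart: bool = True) -> dict:
    """Build an availability policy; the defaults mirror the default behavior in these notes."""
    return {
        # Live Migrate during maintenance events, or terminate the instance instead
        "onHostMaintenance": "MIGRATE" if live_migrate else "TERMINATE",
        # Restart automatically after a crash or maintenance-related termination
        "automaticRestart": automatic_restart,
    }


print(scheduling_policy())
print(scheduling_policy(live_migrate=False, automatic_restart=False))
```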
The following illustration shows the features of a stopped VM:
When a VM is terminated, one does not pay for memory and CPU resources. However, there will be charges for any attached disks and reserved IP addresses
Some properties of a VM are integral to the VM and are established when the VM is created. They are the Machine Type, the CPU platform, or the Zone
A VM with multiple network interfaces cannot be changed and has to be recreated
One cannot convert a non-preemptible instance into a preemptible one. This choice must be made at VM creation time. A preemptible instance can be interrupted at any time and is available at a lower cost
The following illustration shows the various Machine Types:
A Machine Type specifies a particular collection of virtual hardware resources available to a VM instance, including the system memory size, vCPU count, and maximum persistent disk capability
GCP offers several Machine Types that can be grouped into two categories - Predefined Machine Types and Custom Machine Types
Predefined Machine Types have a fixed collection of resources that are managed by Compute Engine. They are available in multiple classes, and each class has a predefined ratio of gigabytes of memory per vCPU. The classes are Standard, High Memory, High CPU, Memory Optimized, Compute Optimized, and Shared-core Machine Types
Custom Machine Types, on the other hand, let one specify the number of vCPUs and the amount of memory for the VM instance
Custom Machine Types are ideal for scenarios such as workloads that are not a good fit for the Predefined Machine Types or when there are workloads that require more processing power or more memory but do not need all of the upgrades that are provided by the next larger Predefined Machine Type
It costs slightly more to use a Custom Machine Type than an equivalent Predefined Machine Type
With Custom Machine Types, only Machine Types with one vCPU or an even number of vCPUs can be created. Memory must be between 0.9 gigabytes and 6.5 gigabytes per vCPU by default. The total memory of the instance must be a multiple of 256 megabytes
For certain workloads a Custom Machine Type with 6.5 gigabytes of memory per vCPU may not be suitable. In such cases, one can pay an additional cost to get more memory per vCPU beyond the 6.5 gigabytes limit. This is referred to as Extended Memory
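The Custom Machine Type rules above lend themselves to a simple validity check. The sketch below encodes only the constraints stated in these notes; the function name is hypothetical, and it assumes 1 gigabyte equals 1024 megabytes.

```python
MIN_GB_PER_VCPU = 0.9
MAX_GB_PER_VCPU = 6.5     # default limit; Extended Memory lifts it at extra cost
MEMORY_STEP_MB = 256


def validate_custom_machine_type(vcpus: int, memory_mb: int,
                                 extended_memory: bool = False) -> list:
    """Return a list of rule violations; an empty list means the shape is valid."""
    problems = []
    if vcpus < 1 or (vcpus != 1 and vcpus % 2 != 0):
        problems.append("vCPU count must be 1 or an even number")
    if memory_mb % MEMORY_STEP_MB != 0:
        problems.append("total memory must be a multiple of 256 MB")
    if vcpus >= 1:
        gb_per_vcpu = memory_mb / 1024 / vcpus   # assumes 1 GB = 1024 MB
        if gb_per_vcpu < MIN_GB_PER_VCPU:
            problems.append("memory is below 0.9 GB per vCPU")
        if gb_per_vcpu > MAX_GB_PER_VCPU and not extended_memory:
            problems.append("memory exceeds 6.5 GB per vCPU (needs Extended Memory)")
    return problems


print(validate_custom_machine_type(2, 7680))                          # valid shape
print(validate_custom_machine_type(3, 7680))                          # odd vCPU count
print(validate_custom_machine_type(2, 16384))                         # too much memory
print(validate_custom_machine_type(2, 16384, extended_memory=True))   # allowed with Extended Memory
```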
The following illustration shows the Standard Machine Types:
Standard Machine Types are suitable for tasks that have a balance of CPU and memory needs. Standard Machine Types have 3.75 gigabytes of memory per vCPU
All Machine Types (Standard, High Memory, High CPU, Memory Optimized, Compute Optimized, etc.) support a maximum of 128 persistent disks with a total persistent disk size of 64 terabytes
High Memory Machine Types are ideal for tasks that require more memory relative to vCPUs. High Memory Machine Types have 6.5 gigabytes of system memory per vCPU
High CPU Machine Types are ideal for tasks that require more vCPUs relative to memory. High CPU Machine Types have 0.9 gigabytes of memory per vCPU
Memory Optimized Machine Types are ideal for tasks that require intensive use of memory with higher memory to vCPU ratios than High Memory Machine Types. Memory Optimized Machine Types have more than 14 gigabytes of memory per vCPU
Memory Optimized Machine Types are perfectly suited for in-memory databases and in-memory analytics such as SAP HANA and business warehouse workloads, genomic analysis, and SQL Analysis Services
Compute Optimized Machine Types are ideal for compute intensive workloads. These Machine Types are for the highest performance per core on Compute Engine
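The fixed memory-per-vCPU ratios quoted above for the Standard, High Memory, and High CPU classes make it possible to derive the memory of a Predefined Machine Type from its name. This is an illustrative sketch only; it assumes names of the form n1-class-vcpus (for example n1-standard-4), and the helper is hypothetical.

```python
# Gigabytes of memory per vCPU, as quoted in these notes
GB_PER_VCPU = {
    "standard": 3.75,
    "highmem": 6.5,
    "highcpu": 0.9,
}


def memory_gb(machine_type: str) -> float:
    """Derive total memory from a name such as 'n1-standard-4' (assumed naming pattern)."""
    _family, machine_class, vcpus = machine_type.split("-")
    return GB_PER_VCPU[machine_class] * int(vcpus)


for name in ("n1-standard-4", "n1-standard-16", "n1-highmem-8", "n1-highcpu-16"):
    print(name, "->", round(memory_gb(name), 2), "GB")
```

The two Standard shapes work out to 15 and 60 gigabytes, which matches the n1-standard-4 and n1-standard-16 figures used in the sustained use example later in these notes.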
The following illustration shows the Shared-core Machine Types:
Shared-core Machine Types provide one vCPU that is allowed to run for a portion of the time on a single hardware hyper-thread on the host CPU running your instance
Shared-core instances can be more cost effective for running small non-resource intensive applications than other Machine Types
The f1-micro Machine Types offer bursting capabilities that allow instances to use additional physical CPU for short periods of time. Bursting happens automatically when the instance requires more physical CPU than was originally allocated. Bursts are not permanent and are only possible periodically
The following illustration shows the VM pricing model:
All vCPUs, GPUs, and gigabytes of memory are charged for a minimum of 1 minute. For example, if a VM runs for 30 seconds, one is billed for 1 minute of usage. After one minute, instances are charged in 1 second increments
Compute Engine uses a resource-based pricing model where each vCPU and each gigabyte of memory on Compute Engine is billed separately rather than as part of a single Machine Type
Compute Engine applies sustained use discounts to all Predefined Machine Type usage in a Region collectively rather than to individual Machine Types
If a user workload is stable and predictable, one can purchase a specific amount of vCPU and memory for a discount off of normal prices in return for committing to a usage term of one or three years
The discount is up to 57 percent for most Predefined Machine Types or Custom Machine Types
The discount is up to 70 percent for Memory Optimized Machine Types
Compute Engine also has free usage limits
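The one-minute minimum followed by per-second billing, described earlier in this list, can be captured in a few lines. This is a minimal sketch; the function name is illustrative, and rounding partial seconds up is an assumption rather than something stated in the course.

```python
import math

MINIMUM_BILLED_SECONDS = 60   # one-minute minimum charge


def billed_seconds(runtime_seconds: float) -> int:
    """Return the number of seconds billed for a VM that ran for the given time."""
    if runtime_seconds <= 0:
        return 0
    # Assumption: partial seconds are rounded up to the next whole second.
    return max(MINIMUM_BILLED_SECONDS, math.ceil(runtime_seconds))


for runtime in (30, 60, 61, 3600):
    print(runtime, "s run ->", billed_seconds(runtime), "s billed")
```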
The following illustration shows the details about sustained use discounts:
Sustained use discounts are automatic discounts that users get for running specific Compute Engine resources (vCPUs, memory, and GPU devices) for a significant portion of the billing month
For example, when a user runs one of these resources for more than 25 percent of a month, Compute Engine automatically gives the user a discount for every incremental minute they use for that instance
The discount increases with usage and one can get up to 30 percent net discount for instances that run the entire month
Discounts reset at the beginning of each month. Hence it is advisable to create an instance on the first day of a month to take advantage of the full 30 percent discount
If one uses a VM for 50 percent of the month, they get an effective discount of 10 percent. If they use it for 75 percent of the month, they get an effective discount of 20 percent, and if they use it for 100 percent of the month, they get an effective discount of 30 percent
Compute Engine calculates sustained use discounts based on vCPU and memory usage across each Region and separately for each of the two categories - Predefined Machine Types and Custom Machine Types
The following illustration shows a sustained use example:
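The effective discounts quoted above (10, 20, and 30 percent) can be reproduced with a tiered calculation. The sketch below assumes that each successive quarter of the month is charged at 100, 80, 60, and 40 percent of the base rate; these incremental rates are not stated in these notes and are an assumption chosen because they yield exactly the quoted figures.

```python
# (upper bound of the usage tier as a fraction of the month,
#  fraction of the base rate charged within that tier) - assumed rates
TIERS = [(0.25, 1.0), (0.50, 0.8), (0.75, 0.6), (1.00, 0.4)]


def effective_discount(usage_fraction: float) -> float:
    """Return the net discount (0.0 to 0.3) for running a resource part of the month."""
    if not 0 < usage_fraction <= 1:
        raise ValueError("usage_fraction must be in the range (0, 1]")
    charged = 0.0
    lower = 0.0
    for upper, rate in TIERS:
        if usage_fraction > lower:
            # Charge only the part of the usage that falls inside this tier.
            charged += (min(usage_fraction, upper) - lower) * rate
        lower = upper
    return 1 - charged / usage_fraction


for usage in (0.25, 0.50, 0.75, 1.00):
    print(f"{usage:.0%} of month -> {effective_discount(usage) * 100:.1f}% discount")
```

Under these assumed rates, no discount applies until usage passes 25 percent of the month, which agrees with the threshold mentioned above.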
From the example above (in Fig.16), a user runs two instances in the us-central1 Region during a month. For the first half of the month, the user runs an n1-standard-4 instance with 4 vCPUs and 15 gigabytes of memory. For the second half of the month, the user runs a larger n1-standard-16 instance with 16 vCPUs and 60 gigabytes of memory
For the example above (in Fig.16), Compute Engine breaks these Machine Types down into individual vCPU and memory resources, then combines their usage (as shown at the bottom of Fig.16 above). The result is equivalent to 4 vCPUs and 15 gigabytes of memory for the full month, plus 12 vCPUs and 45 gigabytes of memory for half of the month
The following illustration shows the details about Preemptible VMs:
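The way usage from the two instances is combined can be sketched as a stacking calculation. This is an illustrative model only, not Compute Engine's actual billing code; it assumes the usage periods do not overlap in time and that each unit of a resource (one vCPU or one gigabyte) is treated as a separate layer.

```python
from collections import Counter


def stack_usage(periods):
    """Combine (fraction_of_month, resource_count) periods into {fraction: count}.

    Each unit of a resource is a layer; layer k is in use whenever at least
    k units are running. Periods are assumed not to overlap in time.
    """
    layer_usage = Counter()
    for fraction, count in periods:
        for layer in range(1, count + 1):
            layer_usage[layer] += fraction
    # Group the layers by how much of the month they were in use.
    return dict(Counter(layer_usage.values()))


# n1-standard-4 for the first half of the month, n1-standard-16 for the second half
print(stack_usage([(0.5, 4), (0.5, 16)]))    # vCPUs
print(stack_usage([(0.5, 15), (0.5, 60)]))   # gigabytes of memory
```

For the example in these notes, the result groups 4 vCPUs at full-month usage and 12 vCPUs at half-month usage, and likewise 15 and 45 gigabytes of memory.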
A Preemptible VM is an instance that one can create and run at much lower prices than normal instances. One can get up to 80 percent discount with Preemptible VMs
VMs might be preempted at any time, and there is no charge if that happens within the first 10 minutes
Preemptible VMs are only going to live for up to 24 hours and one only gets a 30-second notification before the machine is preempted
Note that there are no Live Migrations and no automatic restarts with Preemptible VMs. There are ways to keep restarting Preemptible VMs if needed (for example, create monitoring to trigger an event that then starts up a new Preemptible VM)
One major use case of Preemptible VMs is running a Batch Processing job. If some of those instances terminate during processing, the job slows down but does not completely stop. Therefore Preemptible instances complete the batch processing tasks without placing additional workload on any existing instances and without requiring one to pay full price for additional normal instances
The following illustration shows using Sole-tenant Nodes to isolate workloads:
If users have workloads (running on VMs) that require physical isolation from other workloads (on VMs) in order to meet compliance requirements, they might want to consider using Sole-tenant Nodes
A Sole-tenant Node is a physical Compute Engine server that is dedicated to hosting VM instances only for the user's specific project
The currently available Sole-tenant Node type can accommodate VM instances with up to 96 vCPUs and 624 gigabytes of memory
If users have existing operating system licenses, they can bring them to Compute Engine using Sole-tenant Nodes while minimizing physical core usage with the In-place Restart feature
The following illustration shows the details about the Shielded VMs:
Shielded VMs offer verifiable integrity of VM instances so one can be confident that their instances have not been compromised by boot or kernel level malware or rootkits
The verifiable integrity of Shielded VMs is achieved through the use of Secure Boot, Virtual Trusted Platform Module (vTPM) enabled Measured Boot, and integrity monitoring
The following illustration shows details about the VM images:
An Image includes the boot loader, the operating system, the file system structure, any pre-configured software, and any other customizations
Some of these images are premium Images as indicated in parentheses with a p. These images will have per second charges after a one-minute minimum, with the exception of SQL Server images, which are charged per minute after a 10-minute minimum
Premium image prices vary with the Machine Type. However these prices are global and do not vary by Region or Zone
One can create and use a custom image by pre-installing software that has been authorized for their particular Organization
Users also have the option of importing images from their on-premises environment or from another cloud provider
The following illustration shows details about the Boot disk:
Every VM comes with a single root persistent disk onto which the base image is loaded
The root persistent disk is bootable and one can attach it to a VM and boot from it. It is durable and can survive if the VM terminates
To have a boot disk survive a VM deletion, one needs to disable the "Delete boot disk when instance is deleted" option in the instance's properties
The following illustration shows the details about the Persistent disks:
A Persistent Disk is going to be attached to the VM through the network interface
One can perform snapshots of Persistent Disks which are incremental backups
The choice between HDD and SSD for Persistent Disk comes down to cost and performance
A cool feature of Persistent Disks is that one can dynamically resize them even while they are running and attached to a VM
One can also attach a Persistent Disk in read-only mode to multiple VMs. This allows one to share static data between multiple instances, which is cheaper than replicating their data to unique disks for individual instances
By default Compute Engine encrypts all data at rest. GCP handles and manages this encryption for the user without any additional actions on part of the user
To control and manage the encryption oneself, one can either use the Cloud Key Management Service to create and manage key encryption keys, which is known as Customer Managed Encryption Keys, or create and manage their own key encryption keys, which is known as Customer Supplied Encryption Keys
The following illustration shows the details about the Local SSD disks:
Local SSD Disks are different from Persistent Disks in that they are physically attached to the VM. Therefore, these disks are ephemeral but provide very high IOPS
Currently one can attach up to 8 Local SSD disks of 375 gigabytes each, resulting in a total of 3 terabytes
Data on the Local SSD disks will survive a reset but not a VM stop or terminate because these disks cannot be reattached to a different VM
The following illustration shows various disk options:
Persistent Disks can be rebooted and snapshotted but Local SSDs and RAM Disks are ephemeral
Persistent Disks offer data redundancy because the data on each Persistent Disk is distributed across several physical disks
The following illustration shows the details on maximum disks:
For the Shared-core Machine Type, one can attach up to 16 Persistent Disks
For the Standard, High Memory, High CPU, Memory Optimized, and Compute Optimized Machine Types, one can attach up to 128 Persistent Disks
If one is planning on having a large amount of disk IO throughput, it will also compete with any network egress or ingress throughput
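The disk count limits above can be checked with a small helper. This is a sketch under the figures given in these notes (16 Persistent Disks for Shared-core, 128 for the other classes, and 64 terabytes in total); the function name and the class labels are hypothetical, and applying the 64 terabyte total uniformly to every class is an assumption.

```python
MAX_TOTAL_PD_TB = 64                 # total Persistent Disk size quoted in these notes
MAX_DISKS = {"shared-core": 16}      # class-specific limit
DEFAULT_MAX_DISKS = 128              # limit for all other classes


def check_persistent_disks(machine_class: str, disk_sizes_tb: list) -> list:
    """Return a list of limit violations for the proposed set of Persistent Disks."""
    problems = []
    limit = MAX_DISKS.get(machine_class, DEFAULT_MAX_DISKS)
    if len(disk_sizes_tb) > limit:
        problems.append(f"{len(disk_sizes_tb)} disks exceeds the limit of {limit}")
    if sum(disk_sizes_tb) > MAX_TOTAL_PD_TB:
        problems.append(f"total size exceeds {MAX_TOTAL_PD_TB} TB")
    return problems


print(check_persistent_disks("shared-core", [1] * 17))   # too many disks
print(check_persistent_disks("standard", [1] * 17))      # within limits
print(check_persistent_disks("standard", [40, 30]))      # too much total capacity
```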
The following illustration shows the details about Metadata and Scripts:
Every VM instance stores its metadata on the Metadata Server
The Metadata Server is particularly useful in combination with Startup and Shutdown scripts because one can use the Metadata Server to programmatically get unique information about an instance without additional authorization
For example, one can write a startup script that gets the metadata key value pair for an instance's External IP address and use that information to setup a database. Because the default metadata keys are the same on every instance, one can reuse their script without having to update it for each instance, which helps one create less brittle code for their applications
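A startup script along these lines might read the External IP address as sketched below. The endpoint, the Metadata-Flavor header, and the key path follow the commonly documented Compute Engine metadata server conventions, but treat them as assumptions to verify; the helper names are hypothetical, and the request only succeeds when run on a Compute Engine VM.

```python
import urllib.request

# Assumed metadata server root and key path for the first network interface
METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1/"
EXTERNAL_IP_KEY = "instance/network-interfaces/0/access-configs/0/external-ip"


def build_metadata_request(key: str) -> urllib.request.Request:
    """Build the HTTP request for a metadata key, including the required header."""
    return urllib.request.Request(
        METADATA_ROOT + key,
        headers={"Metadata-Flavor": "Google"},
    )


def read_metadata(key: str, timeout: float = 2.0) -> str:
    """Fetch a metadata value; this works only from inside a VM instance."""
    with urllib.request.urlopen(build_metadata_request(key), timeout=timeout) as response:
        return response.read().decode("utf-8")


# On a VM, a startup script could then call:
#   external_ip = read_metadata(EXTERNAL_IP_KEY)
```

Because the key path is the same on every instance, the same script can be reused unchanged across instances, which is the point made above.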
The following illustration shows the details about moving an instance to a new Zone:
Another common action is to move an instance to a new Zone. If the move is within the same Region, one can automate the move by using the gcloud compute instances move command
If the move is to a different Region, one needs to manually do so by following the process outlined above in the illustration. This involves making a Snapshot of all Persistent Disks and creating new disks in the destination Zone from that Snapshot. Next, create a new VM in the destination Zone and attach the new Persistent Disks, assign an IP address, and update any references to the VM. Finally, delete the original VM, its disks and the snapshots
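For the same-Region case described above, the command line can be assembled as in the sketch below. The --zone and --destination-zone flags are given as I understand the gcloud compute instances move command; verify them against the current gcloud reference. The instance name and the helper functions are hypothetical, and the Region check simply follows the rule stated in these notes.

```python
def region_of(zone: str) -> str:
    """Derive the Region from a Zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    return zone.rsplit("-", 1)[0]


def gcloud_move_command(instance: str, source_zone: str, destination_zone: str) -> list:
    """Build the argument list for moving an instance between Zones of one Region."""
    if region_of(source_zone) != region_of(destination_zone):
        raise ValueError("different Regions: follow the manual snapshot-based process")
    return [
        "gcloud", "compute", "instances", "move", instance,
        "--zone", source_zone,
        "--destination-zone", destination_zone,
    ]


# 'example-vm' is a placeholder instance name
print(" ".join(gcloud_move_command("example-vm", "us-central1-a", "us-central1-f")))
```

The resulting list can be passed to subprocess.run on a machine where the gcloud CLI is installed and authenticated.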
The following illustration shows the details about snapshotting Persistent Disks:
Snapshots are available only for Persistent Disks and not for Local SSDs
Snapshots are useful for periodic backup of the data on a user's Persistent Disks
Snapshots are incremental and automatically compressed, so one can create regular snapshots on a Persistent Disk faster and at a much lower cost than if they regularly created a full image of the disk
Snapshots can be restored to a new Persistent Disk thus allowing for a move to a new Zone
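The idea of incremental snapshots can be illustrated with a toy model. This is not how Compute Engine stores snapshots internally; it is a simplified sketch in which a disk is a dictionary of blocks, each snapshot keeps only the blocks that changed since the previous one, and deleted blocks are not handled. All names are hypothetical.

```python
def take_snapshot(disk: dict, captured_state: dict) -> dict:
    """Return only the blocks that differ from what earlier snapshots already hold."""
    return {index: data for index, data in disk.items()
            if captured_state.get(index) != data}


def restore(snapshots: list) -> dict:
    """Rebuild the disk contents by replaying the snapshot chain in order."""
    state = {}
    for snapshot in snapshots:
        state.update(snapshot)
    return state


disk = {0: b"boot", 1: b"data-v1", 2: b"logs"}

chain = [take_snapshot(disk, {})]                    # first snapshot holds every block
disk[1] = b"data-v2"                                 # one block changes afterwards
chain.append(take_snapshot(disk, restore(chain)))    # second snapshot holds one block

new_disk = restore(chain)                            # contents for a new Persistent Disk
print(len(chain[0]), "blocks, then", len(chain[1]), "block")
print(new_disk == disk)
```

The second snapshot is much smaller than the first because only the changed block is stored, which is why regular snapshots are faster and cheaper than repeated full images.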
References
Coursera - Essential Cloud Infrastructure: Foundation
Essential Cloud Infrastructure: Foundation - Summary Notes - Part 1