PolarSPARC
Elastic Cloud Infrastructure: Scaling and Automation - Summary Notes - Part 3
Bhaskar S | 11/09/2019
Deployment Manager
Deployment Manager uses a system of highly structured templates and configuration files to document the infrastructure in an easily readable and understandable format
Deployment Manager conceals the actual Cloud API calls so the user does not need to write code to generate the infrastructure and can focus on the definition of the infrastructure
The following illustration introduces Deployment Manager:
Deployment Manager is an infrastructure deployment service that automates the creation and management of GCP resources for a user
A user just specifies all the resources needed for their application in a declarative format and deploys the configuration
This deployment can be repeated over and over with consistent results and the user can delete a whole deployment with one command or click
The benefit of a declarative approach is that it allows the user to specify what the configuration should be and lets the system figure out the steps to take
Deployment Manager will deploy resources in parallel
A user can even abstract parts of their configuration into individual building blocks or Templates that can be used for other configurations
Deployment Manager uses the underlying APIs of each GCP service to deploy the user's resources (VM instances, Instance Templates and Groups, VPC Networks, Firewall Rules, VPN Tunnels, Cloud Routers, Load Balancers, etc)
The Configuration Template can be written in Jinja2 or Python
Each resource must contain a Name, Type, and Properties
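To make the declarative idea concrete, here is a minimal Python sketch of how a system might derive the steps from a desired configuration. It is an illustration only and not how Deployment Manager is implemented: the plan function, the resource names, and the layout of the current state are hypothetical. Only the Name, Type, and Properties shape of each resource comes from the notes above.

```python
# Illustrative sketch only: NOT Deployment Manager's implementation.
# Each desired resource carries the three required fields.
desired = [
    {"name": "mynetwork", "type": "compute.v1.network",
     "properties": {"autoCreateSubnetworks": True}},
    {"name": "allow-http", "type": "compute.v1.firewall",
     "properties": {"sourceRanges": ["0.0.0.0/0"]}},
]

# Hypothetical view of what already exists, keyed by resource name.
current = {
    "mynetwork": {"type": "compute.v1.network",
                  "properties": {"autoCreateSubnetworks": False}},
    "old-rule": {"type": "compute.v1.firewall", "properties": {}},
}


def plan(desired, current):
    """Return (action, name) steps that move `current` towards `desired`."""
    # Validate first: every resource needs a name, a type, and properties.
    for res in desired:
        for key in ("name", "type", "properties"):
            if key not in res:
                raise ValueError(f"resource {res!r} is missing {key!r}")

    wanted = {res["name"]: res for res in desired}
    steps = []
    for name, res in wanted.items():
        if name not in current:
            steps.append(("create", name))
        else:
            existing = current[name]
            if (existing["type"], existing["properties"]) != (
                    res["type"], res["properties"]):
                steps.append(("update", name))
    # Anything that exists but is no longer declared gets deleted.
    for name in current:
        if name not in wanted:
            steps.append(("delete", name))
    return steps


print(plan(desired, current))
```

The user only edits the desired list; the ordering and choice of create, update, and delete steps are left to the system.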
The following illustration shows examples of Jinja2 Templates that create an Auto Mode VPC Network and an HTTP Firewall Rule:
The following illustration shows an example of the top-level Deployment Manager configuration in YAML syntax. It imports the previous Jinja2 Templates:
In the example in Fig.3 above, notice that the Firewall Rule refers to the VPC Network through its selfLink reference. This reference ensures that the VPC Network is created before the Firewall Rule, which matters because Deployment Manager otherwise creates resources in parallel
Deployment Manager allows one to treat their infrastructure like software, which helps to decrease costs, reduce risk, and deploy faster by capturing infrastructure as code
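Since Templates can also be written in Python, the selfLink dependency described above can be sketched as a Python Template. This is a minimal sketch under assumptions, not the Templates from the illustrations: the resource names, the source range, and the FakeContext helper (a local stand-in so the function can be tried outside Deployment Manager) are hypothetical.

```python
# Sketch of a Deployment Manager template written in Python.
# Deployment Manager calls GenerateConfig(context); the resource names and
# property values below are assumptions chosen for illustration.

def GenerateConfig(context):
    """Return an Auto Mode VPC Network and an HTTP Firewall Rule."""
    network_name = context.env["name"] + "-network"
    resources = [
        {
            "name": network_name,
            "type": "compute.v1.network",
            "properties": {"autoCreateSubnetworks": True},  # Auto Mode
        },
        {
            "name": network_name + "-allow-http",
            "type": "compute.v1.firewall",
            "properties": {
                # The $(ref.<name>.selfLink) reference makes the firewall
                # rule depend on the network, so the network comes first.
                "network": "$(ref." + network_name + ".selfLink)",
                "sourceRanges": ["0.0.0.0/0"],
                "allowed": [{"IPProtocol": "tcp", "ports": ["80"]}],
            },
        },
    ]
    return {"resources": resources}


class FakeContext:
    """Hypothetical local stand-in for the context object."""

    def __init__(self, name):
        self.env = {"name": name}
        self.properties = {}


config = GenerateConfig(FakeContext("demo"))
for resource in config["resources"]:
    print(resource["name"], resource["type"])
```

Each entry still carries a name, a type, and properties, so the same structure could be imported from a top-level YAML configuration in place of a Jinja2 Template.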
Managed Services
With Managed Services, users have access to services without having to manage the underlying resources, such as compute and storage, that implement the service
Examples of Managed Services include BigQuery, Cloud Dataflow, Cloud Dataprep, and Cloud Dataproc
BigQuery
BigQuery is GCP's Serverless, highly scalable, and cost-effective petabyte-scale Data Warehouse that allows for super-fast queries using the processing power of Google's infrastructure
Serverless here means that the servers or Compute Engine instances are abstracted away, so the user does not have to manage the Infrastructure
The following illustration introduces BigQuery:
BigQuery has a free usage tier to help one get started
The following illustration shows an example of a query in BigQuery:
The query processes over 4.1 terabytes of data and takes less than a minute to execute
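As a rough sense of scale, the figures quoted above imply a lower bound on the aggregate scan throughput. The short Python calculation below assumes decimal terabytes (10^12 bytes) and a full 60 seconds. Both are assumptions, since the notes only say "over 4.1 terabytes" and "less than a minute".

```python
# Back-of-the-envelope throughput implied by the example query.
TB = 10**12   # assumption: decimal terabytes
GB = 10**9    # decimal gigabytes

data_bytes = 4.1 * TB   # "over 4.1 terabytes"
seconds = 60            # upper bound: "less than a minute"

throughput_gb_per_s = data_bytes / seconds / GB
print(f"at least {throughput_gb_per_s:.1f} GB/s")
```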
Cloud Dataflow
Cloud Dataflow is a managed service for executing a wide variety of data processing patterns
The following illustration introduces Cloud Dataflow:
Cloud Dataflow is essentially a fully managed service for transforming and enriching data in stream and batch modes with equal reliability and expressiveness
Cloud Dataflow supports fast, simplified pipeline development via expressive SQL, Java, and Python APIs using the Apache Beam SDK which provides a rich set of windowing and session analysis primitives as well as an ecosystem of source and sink connectors
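To illustrate what a windowing primitive does, here is a minimal pure-Python sketch that groups timestamped events into fixed-size windows and sums the values in each. It does not use the Apache Beam SDK; the function name fixed_windows and the sample events are hypothetical.

```python
from collections import defaultdict


def fixed_windows(events, size_seconds):
    """Sum event values per fixed window of `size_seconds`.

    `events` is an iterable of (timestamp_seconds, value) pairs.
    Returns {window_start: total}, ordered by window start.
    """
    totals = defaultdict(int)
    for timestamp, value in events:
        # Every event belongs to exactly one window: [start, start + size).
        window_start = (timestamp // size_seconds) * size_seconds
        totals[window_start] += value
    return dict(sorted(totals.items()))


# Hypothetical (timestamp, value) events, with 60-second windows.
events = [(1, 10), (42, 5), (61, 7), (119, 1), (120, 3)]
print(fixed_windows(events, 60))
```

The same grouping applies whether the events arrive as a finished batch or as an unbounded stream, which is why windowing lets one pipeline definition serve both modes.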
The following illustration shows an example Cloud Dataflow architecture:
Cloud Dataprep
Cloud Dataprep is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis, reporting, and Machine Learning
The following illustration introduces Cloud Dataprep:
With automatic detection of schemas, data types, possible joins, and anomalies, one can skip time-consuming Data Profiling and focus on Data Analysis
The following illustration shows an example Cloud Dataprep architecture:
Cloud Dataproc
Cloud Dataproc is a fast, easy-to-use, fully managed Cloud service for running Apache Spark and Apache Hadoop clusters in a simpler way
The following illustration introduces Cloud Dataproc:
A user pays only for the resources they use, with per-second billing
Cloud Dataproc clusters are quick to start, scale, and shut down with each of these operations taking 90 seconds or less on average
The following decision tree helps one choose between Cloud Dataproc and Cloud Dataflow:
Hands-on with Deployment Manager
Display a list of the available network resource types
gcloud deployment-manager types list | grep network
The following will be the typical output:
compute.beta.subnetwork
compute.alpha.subnetwork
compute.v1.subnetwork
compute.beta.network
compute.v1.network
compute.alpha.network
Display a list of the available firewall resource types
gcloud deployment-manager types list | grep firewall
The following will be the typical output:
compute.v1.firewall
compute.alpha.firewall
compute.beta.firewall
Display a list of the available instance resource types
gcloud deployment-manager types list | grep instance
The following will be the typical output:
spanner.v1.instance
compute.v1.instanceGroupManager
sqladmin.v1beta4.instance
compute.alpha.instanceGroup
bigtableadmin.v2.instance
compute.v1.instanceGroup
compute.beta.instanceGroupManager
compute.alpha.instanceTemplate
compute.v1.instanceTemplate
compute.beta.instance
bigtableadmin.v2.instance.table
compute.beta.instanceTemplate
compute.alpha.instanceGroupManager
compute.beta.instanceGroup
compute.v1.instance
compute.alpha.instance
The following is an example instance creation template called instance-template.jinja using Jinja:
The following is an example Deployment Manager template called config.yaml in YAML format:
Create a deployment called dminfra from the provided config file in preview mode, that is, without actually instantiating the underlying resources
gcloud deployment-manager deployments create dminfra --config=config.yaml --preview
The following will be the typical output:
The fingerprint of the deployment is 1XwQ4k1HSewzjUzugVP9Hg==
Waiting for create [operation-1572807933118-59675e22749d1-f439e19d-6130d205]...done.
Create operation operation-1572807933118-59675e22749d1-f439e19d-6130d205 completed successfully.
NAME                               TYPE                 STATE       ERRORS  INTENT
mynet-eu-vm                        compute.v1.instance  IN_PREVIEW  []      CREATE_OR_ACQUIRE
mynet-us-vm                        compute.v1.instance  IN_PREVIEW  []      CREATE_OR_ACQUIRE
mynetwork                          compute.v1.network   IN_PREVIEW  []      CREATE_OR_ACQUIRE
mynetwork-allow-http-ssh-rdp-icmp  compute.v1.firewall  IN_PREVIEW  []      CREATE_OR_ACQUIRE
Update the deployment named dminfra and actually instantiate the underlying resources
gcloud deployment-manager deployments update dminfra
The following will be the typical output:
The fingerprint of the deployment is 1s9PcYLCzo_AMUMdNNS9Fg==
Waiting for update [operation-1572808086835-59675eb50d159-fc91a93e-09cf08db]...done.
Update operation operation-1572808086835-59675eb50d159-fc91a93e-09cf08db completed successfully.
NAME                               TYPE                 STATE      ERRORS  INTENT
mynet-eu-vm                        compute.v1.instance  COMPLETED  []
mynet-us-vm                        compute.v1.instance  COMPLETED  []
mynetwork                          compute.v1.network   COMPLETED  []
mynetwork-allow-http-ssh-rdp-icmp  compute.v1.firewall  COMPLETED  []
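The preview-then-update sequence above could be scripted. The following is a minimal Python sketch that assembles the same two gcloud commands used in this section; the helper names and the dry_run switch are hypothetical, and by default nothing is executed, so it can be run without gcloud installed.

```python
import subprocess


def preview_command(name, config):
    """Command that creates a deployment in preview mode."""
    return ["gcloud", "deployment-manager", "deployments", "create",
            name, f"--config={config}", "--preview"]


def commit_command(name):
    """Command that updates a previously previewed deployment."""
    return ["gcloud", "deployment-manager", "deployments", "update", name]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises on a non-zero exit, stopping the sequence.
            subprocess.run(cmd, check=True)


commands = [preview_command("dminfra", "config.yaml"),
            commit_command("dminfra")]
run(commands)  # dry run: prints the two commands without calling gcloud
```

Keeping the preview and the update as separate steps leaves room to inspect the IN_PREVIEW listing before any resource is actually created.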
References
Coursera - Elastic Cloud Infrastructure: Scaling and Automation
Elastic Cloud Infrastructure: Scaling and Automation - Summary Notes - Part 1
Elastic Cloud Infrastructure: Scaling and Automation - Summary Notes - Part 2