Cassandra Quick Notes :: Part - 3
Bhaskar S | Updated: 11/28/2023
Overview
In Part-2 of this article series, we performed the necessary setup to start a 3-node Apache Cassandra cluster and got our hands dirty with some concepts.
In this part, we will continue to leverage the 3-node Apache Cassandra cluster setup. In addition, we will elaborate more on the concepts from Part-2 and introduce additional concepts on Apache Cassandra.
Concepts - Level III
The following section elaborates and expands on the concepts of Apache Cassandra:
In Cassandra, a Quorum is the minimum number of replica nodes (greater than 1) that must acknowledge an operation in order to achieve STRONG consistency. It is calculated using the formula (with integer division): (replication_factor / 2) + 1. For example, with a replication factor of 3, the quorum is (3 / 2) + 1 = 2 nodes
For write operations, Cassandra supports the following consistency levels:
ANY - data for a given row key must be written to at least one node in the cluster. If all the replicas for the row key are down, the coordinator node (the node through which the client connected) can store a hint along with the data so the write can succeed (the data stored with the hint is not available for reads from clients; it is used to restore consistency when any of the replicas comes back online later)
ONE - data for a given row key must be written to the commitlog and the memtable of at least one replica node
TWO - data for a given row key must be written to the commitlog and the memtable of at least 2 closest replica nodes in the cluster
THREE - data for a given row key must be written to the commitlog and the memtable of at least 3 closest replica nodes in the cluster
QUORUM - data for a given row key must be written to the commitlog and the memtable on a quorum of replica nodes in the cluster
ALL - data for a given row key must be written to the commitlog and the memtable on all replica nodes in the cluster
LOCAL_QUORUM - data for a given row key must be written to the commitlog and the memtable on a quorum of replica nodes in the same data center as the node through which the client is connected
EACH_QUORUM - data for a given row key must be written to the commitlog and the memtable on a quorum of replica nodes in each of the data centers of the cluster
For read operations, Cassandra supports the following consistency levels:
ONE - data for a given row key must be returned from the closest replica node, as determined by the Snitch
TWO - most current data for a given row key must be returned after 2 replica nodes in the cluster have responded
THREE - most current data for a given row key must be returned after 3 replica nodes in the cluster have responded
QUORUM - most current data for a given row key must be returned after a quorum of replica nodes in the cluster have responded
ALL - most current data for a given row key must be returned after all the replica nodes in the cluster have responded. The read operation fails if a replica does not respond
LOCAL_QUORUM - most current data for a given row key must be returned after a quorum of replica nodes in the same data center as the node through which the client is connected have responded
EACH_QUORUM - most current data for a given row key must be returned after a quorum of replica nodes in each of the data centers of the cluster have responded
When a client performs a read operation, it connects to one of the nodes in the cluster and requests data, specifying a consistency level. The read operation will block until the consistency level is satisfied. If it is detected that one or more replicas have an older value, a background operation called Read Repair is performed to update those replicas to the latest value
When a client performs a delete operation on a value, it is not physically removed immediately. Instead, a special marker called Tombstone is placed to indicate a delete
Cassandra allows one to specify an optional expiration time (also called time-to-live or TTL) in seconds when setting the value of a column. After the expiration time has elapsed, the column value is considered deleted and is marked with a Tombstone (see the short CQL sketch after this list)
Periodically, Cassandra runs a background process called Compaction, which merges SSTables, consolidating the columns for each row key and removing any columns marked with Tombstones. This process also rebuilds the Primary and Secondary indexes
Rows in a table are distributed across the nodes in the cluster based on the row key. The Cassandra Partitioner determines which node(s) in the cluster will store the data for a given row key. Cassandra can be configured to use any of the following data Partitioners:
Murmur3Partitioner - This is the default data partitioning strategy used by Cassandra, which uniformly distributes the rows across the nodes in the cluster. This strategy maps the Partition Key (first part of the Primary Key) to a hash based token value, which in turn determines the node in the cluster that owns the row (based on the token value). This is the recommended Partitioner to be used in Cassandra
RandomPartitioner - This data partitioning strategy evenly distributes the rows across the nodes in the cluster. This strategy maps the Partition Key to an MD5 hash value, which determines the node in the cluster that should own the row
ByteOrderedPartitioner - This data partitioning strategy maps the Row Key to raw bytes, which then determines the node in the cluster that owns that row. This Partitioner arranges the rows in lexical order based on the raw bytes of the Row Key. This Partitioner is optimal for range queries (example: query rows whose keys are between 'A' and 'K'). The flip side to using this Partitioner is that the rows may not be evenly distributed across the nodes in the cluster resulting in hotspots
Cassandra uses a Replication Strategy (also called Placement Strategy) to determine how the replicas will be distributed across the nodes in a cluster. Cassandra provides the following Replication Strategies:
SimpleStrategy - This is the default placement strategy and works well for a cluster of nodes in a single data center. This strategy places the first replica on the node as indicated by the Partitioner. Additional replica(s) are placed in the node(s) located along the logical circle of the Ring in a clockwise direction
NetworkTopologyStrategy - This is the recommended strategy for most production deployments. It should be used when the nodes in the cluster are spread across multiple data centers. This placement strategy allows us control over the location(s) as well as the number of replicas to be placed in the nodes across multiple data centers
Cassandra's replication strategy (or placement strategy) relies on a Snitch to determine the physical location of nodes and their proximity to each other. A Snitch typically uses the octets of the IP address to determine how nodes in the cluster are laid out in racks and data centers. Cassandra provides the following Snitches:
SimpleSnitch - This is the default snitch used by Cassandra and works well for a cluster of nodes in a single data center. It returns the list of all the nodes in the ring
PropertyFileSnitch - This snitch should be used when the nodes in the cluster are spread across multiple data centers and we need control in mapping IP addresses of nodes to racks and data centers. The mapping of the IP addresses of nodes to racks and data centers is explicitly configured via a property file called cassandra-topology.properties
GossipingPropertyFileSnitch - This is the recommended snitch for production deployments. To use this snitch, one must specify the datacenter and rack in each node in a property file called cassandra-rackdc.properties. The Gossip communication propagates this info across the nodes in the cluster. If the properties file cassandra-topology.properties is present, it is used as a fallback to identify the cluster topology
RackInferringSnitch - This snitch should be used when the nodes in the cluster are spread across multiple data centers. It uses the octets of the IP address to infer the topology of the cluster. The assumption made is that the second octet of the IP address indicates the data center and the third octet of the IP address indicates the rack in the particular data center, and the fourth (and final) octet of the IP address indicates the node within the rack. In other words, given an IP address of A.B.C.D , the octet B indicates the data center, the octet C indicates the rack and the octet D indicates the node
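As a quick illustration of the TTL concept mentioned above, the following is a minimal CQL sketch (against a hypothetical keyspace myks and table scratch_notes that are NOT part of this article's setup) that writes a column value which expires after 60 seconds and then inspects its remaining time-to-live using the built-in TTL() function:

cqlsh> CREATE TABLE IF NOT EXISTS myks.scratch_notes (id int PRIMARY KEY, note text);
cqlsh> INSERT INTO myks.scratch_notes (id, note) VALUES (1, 'expires soon') USING TTL 60;
cqlsh> SELECT note, TTL(note) FROM myks.scratch_notes WHERE id = 1;

Once the 60 seconds elapse, the value is treated as deleted (tombstoned) and would no longer be returned by the SELECT.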
Hands-on with Cassandra - III
To check that the Apache Cassandra cluster nodes are up and running, execute the following command:
$ docker exec -it cas-node-3 nodetool describecluster
The following would be the typical output:
Cluster Information:
    Name: Cassandra Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        d03783d7-b468-3c1a-82f1-8e30b2edde8b: [172.18.0.2, 172.18.0.3, 172.18.0.4]

Stats for all nodes:
    Live: 3
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0

Data Centers:
    datacenter1 #Nodes: 2 #Down: 0

Database versions:
    5.0.0-alpha2: [172.18.0.2:7000, 172.18.0.3:7000, 172.18.0.4:7000]

Keyspaces:
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
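In addition, to get a per-node view of the cluster (node state, load, tokens, and host IDs), one could also run the nodetool status command against any of the nodes; for example:

$ docker exec -it cas-node-1 nodetool status

The output (not shown here) lists each node's IP address along with its Up/Normal status.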
To launch the CQL command-line interface, execute the following command:
$ docker run -it --rm --name cas-client --network cassandra-db-net cassandra:5.0 cqlsh cas-node-1
The following will be the output:
WARNING: cqlsh was built against 5.0-alpha2, but this server is 5.0. All features may not work!
Connected to Cassandra Cluster at cas-node-1:9042
[cqlsh 6.2.0 | Cassandra 5.0-alpha2 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.
cqlsh>
On success, CQL will change the command prompt to "cqlsh>".
To create a Keyspace called mytestks3, input the following command at the "cqlsh>" prompt:
cqlsh> CREATE KEYSPACE IF NOT EXISTS mytestks3 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
There will be no output.
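To verify the replication settings of the newly created Keyspace, one could optionally input the following command at the "cqlsh>" prompt:

cqlsh> DESCRIBE KEYSPACE mytestks3;

The output (not shown here) echoes the CREATE KEYSPACE statement with the chosen replication class and replication factor.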
To use the Keyspace called mytestks3, input the following command at the "cqlsh>" prompt:
cqlsh> USE mytestks3;
There will be no output and the input prompt would change to "cqlsh:mytestks3>".
To create a Table called club_member with a composite Primary Key, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> CREATE TABLE IF NOT EXISTS club_member (member_id uuid, member_name text, member_phone text, member_since timestamp, zip text, PRIMARY KEY ((member_id), zip));
There will be no output.
To check the various settings on the table club_member, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> DESCRIBE TABLE club_member;
The following will be the output:
CREATE TABLE mytestks3.club_member (
    member_id uuid,
    zip text,
    member_name text,
    member_phone text,
    member_since timestamp,
    PRIMARY KEY (member_id, zip)
) WITH CLUSTERING ORDER BY (zip ASC)
    AND additional_write_policy = '99p'
    AND allow_auto_snapshot = true
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND memtable = 'default'
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND incremental_backups = true
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';
For the table club_member, the column member_id will be the Partition Key and the column zip will be the Clustering Key.
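Note that queries against club_member are expected to restrict the Partition Key column member_id. As a minimal sketch, a query that filters only on the Clustering Key column zip, such as the following, would be rejected by Cassandra with an InvalidRequest error unless the ALLOW FILTERING clause is appended (which forces a potentially expensive scan and is generally discouraged):

cqlsh:mytestks3> SELECT * FROM club_member WHERE zip = '10001';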
To insert a row into the table club_member, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> INSERT INTO club_member (member_id, member_name, member_since, zip) VALUES (uuid(), 'alice', '2020-05-15', '10001');
There will be no output.
To select all the rows from the table club_member, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> SELECT * FROM club_member;
The following will be the output:
 member_id                            | zip   | member_name | member_phone | member_since
--------------------------------------+-------+-------------+--------------+---------------------------------
 63b807d0-a629-477c-a085-98cdf8a03770 | 10001 |       alice |         null | 2020-05-15 00:00:00.000000+0000

(1 rows)
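To relate this row back to the Murmur3Partitioner concept, one could also display the hash token computed for the Partition Key, which is what determines the owning node(s) on the Ring; for example, input the following command at the "cqlsh:mytestks3>" prompt:

cqlsh:mytestks3> SELECT member_id, token(member_id) FROM club_member;

The output (not shown here) includes a signed 64-bit token value for the member_id column.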
To select the values of the member_id, the member_phone, and the latest update timestamp (column metadata) of member_phone for all the rows from the table club_member, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> SELECT member_id, member_phone, writetime(member_phone) FROM club_member;
The following will be the output:
 member_id                            | member_phone | writetime(member_phone)
--------------------------------------+--------------+-------------------------
 63b807d0-a629-477c-a085-98cdf8a03770 |         null |                    null

(1 rows)
Since no value was provided for the column member_phone during the INSERT operation, the write timestamp for the column is null, which makes sense.
To update the value of the column member_phone for the row in club_member with the primary key values of 63b807d0-a629-477c-a085-98cdf8a03770 and 10001, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> UPDATE club_member SET member_phone = '212-111-1111' WHERE member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001';
There will be no output.
Once again, to select the values of the member_id, the member_phone, and the latest update timestamp (column metadata) of member_phone for all the rows from the table club_member, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> SELECT member_id, member_phone, writetime(member_phone) FROM club_member;
The following will be the output:
 member_id                            | member_phone | writetime(member_phone)
--------------------------------------+--------------+-------------------------
 63b807d0-a629-477c-a085-98cdf8a03770 | 212-111-1111 |        1701205884772244

(1 rows)
To display the consistency level currently in effect for the cqlsh session, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> CONSISTENCY;
The following will be the output:
Current consistency level is ONE.
To set the consistency level for the cqlsh session to QUORUM, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> CONSISTENCY QUORUM;
The following will be the output:
Consistency level set to QUORUM.
To check which nodes in the cluster stored the row for 63b807d0-a629-477c-a085-98cdf8a03770, execute the following command:
$ docker exec -it cas-node-1 nodetool getendpoints mytestks3 club_member 63b807d0-a629-477c-a085-98cdf8a03770
The following will be the output:
172.18.0.2
172.18.0.4
Note the IP address 172.18.0.4 is that of the node cas-node-3.
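To confirm which container a given IP address belongs to, one could (as a quick sketch) inspect the container's network settings using docker; for example:

$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' cas-node-3

The command prints the IP address assigned to the cas-node-3 container on the cassandra-db-net network.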
Let us now take DOWN the node cas-node-3 from our cluster by executing the following command:
$ docker stop cas-node-3
There will be no output.
To select the values of all the columns from the table club_member for the row with the row key of member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001', input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> SELECT * FROM club_member WHERE member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001';
The following will be the output:
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 172.18.0.2:9042 datacenter1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'consistency\': \'QUORUM\', \'required_replicas\': 2, \'alive_replicas\': 1}')})
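The query fails because, with cas-node-3 DOWN, only one of the row's two replicas is alive, which does not satisfy QUORUM (2 of 2 replicas required). As a quick sanity check, one could temporarily lower the consistency level to ONE, re-run the SELECT (it should then succeed against the surviving replica), and restore the consistency level back to QUORUM:

cqlsh:mytestks3> CONSISTENCY ONE;
cqlsh:mytestks3> SELECT * FROM club_member WHERE member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001';
cqlsh:mytestks3> CONSISTENCY QUORUM;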
Let us bring UP the node cas-node-3 back in the cluster by executing the following command:
$ docker run --rm --name cas-node-3 --hostname cas-node-3 --network cassandra-db-net -e CASSANDRA_SEEDS=cas-node-1 -u $(id -u $USER):$(id -g $USER) -v $CASSANDRA_HOME/data3:/var/lib/cassandra/data -v $CASSANDRA_HOME/etc/cassandra:/etc/cassandra -v $CASSANDRA_HOME/logs3:/opt/cassandra/logs cassandra:5.0
Once again, to select the values of all the columns from the table club_member for the row with the row key of member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001', input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> SELECT * FROM club_member WHERE member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001';
The following will be the output:
 member_id                            | zip   | member_name | member_phone | member_since
--------------------------------------+-------+-------------+--------------+---------------------------------
 63b807d0-a629-477c-a085-98cdf8a03770 | 10001 |       alice | 212-111-1111 | 2020-05-15 00:00:00.000000+0000

(1 rows)
Let us now take DOWN the node cas-node-2 from our cluster by executing the following command:
$ docker stop cas-node-2
One more time, to select the values of all the columns from the table club_member for the row with the row key of member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001', input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> SELECT * FROM club_member WHERE member_id = 63b807d0-a629-477c-a085-98cdf8a03770 AND zip = '10001';
The following will be the output:
 member_id                            | zip   | member_name | member_phone | member_since
--------------------------------------+-------+-------------+--------------+---------------------------------
 63b807d0-a629-477c-a085-98cdf8a03770 | 10001 |       alice | 212-111-1111 | 2020-05-15 00:00:00.000000+0000

(1 rows)
Notice that the node cas-node-2 being DOWN did not impact the query as the consistency level was met.
Inserting new rows at the QUORUM consistency level may or may not succeed: with a replication factor of 2, the quorum is 2, so an insert succeeds only if both replicas for the new row's partition key land on the two nodes that are still UP (cas-node-1 and cas-node-3); if the DOWN node cas-node-2 happens to be one of the replicas, the write fails with an Unavailable error. One way to observe this is sketched below.
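As a sketch (the member_id value below is a made-up, fixed UUID used purely for illustration), one could attempt an insert and then check the replica placement of its partition key:

cqlsh:mytestks3> INSERT INTO club_member (member_id, member_name, member_since, zip) VALUES (11111111-2222-3333-4444-555555555555, 'bob', '2021-07-04', '10002');
$ docker exec -it cas-node-1 nodetool getendpoints mytestks3 club_member 11111111-2222-3333-4444-555555555555

If both endpoints returned are among the nodes that are UP, the insert succeeds at QUORUM; if one of them is the DOWN node cas-node-2, the insert fails with an Unavailable error.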
To drop the entire table club_member, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> DROP TABLE club_member;
There will be no output.
To drop the entire keyspace mytestks3, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> DROP KEYSPACE mytestks3;
There will be no output.
To exit the CQL command-line interface, input the following command at the "cqlsh:mytestks3>" prompt:
cqlsh:mytestks3> exit;
There will be no output.
To stop the nodes of the Apache Cassandra cluster, execute the following commands in that order:
$ docker stop cas-node-3
$ docker stop cas-node-2
$ docker stop cas-node-1
This concludes the hands-on demonstration for this part using the 3-node Apache Cassandra cluster!!!
References