MODULE -5
SUBJECT CODE: 21AIM45A
MODULE –V : SYLLABUS
✓ Relational databases require that schemas be defined before you can add data. For example, you
might want to store data about your customers such as phone numbers, first and last name,
address, city and state – a SQL database needs to know this in advance.
✓ This fits poorly with agile development approaches, because each time you complete new features,
the schema of your database often needs to change. So if you decide, a few iterations into
development, that you'd like to store customers' favorite items in addition to their addresses and
phone numbers, you'll need to add that column to the database, and then migrate the entire
database to the new schema.
✓ If the database is large, this is a very slow process that involves significant downtime.
If you are frequently changing the data your application stores, this downtime may also be frequent.
✓ There's also no way, using a relational database, to effectively address data that's completely
unstructured or unknown in advance.
✓ NoSQL databases are built to allow the insertion of data without a predefined schema. That
makes it easy to make significant application changes in real-time, without worrying about
service interruptions – which means development is faster, code integration is more reliable,
and less database administrator time is needed.
What is unstructured data?
• Unstructured data is data that you cannot store in the traditional structure of a relational
database.
• It is sometimes referred to as qualitative data — you can derive meaning from it, but it can also
be incredibly ambiguous and difficult to parse.
• Because of the nature of unstructured data, document-type storage options, which are non-
relational (NoSQL) databases, are ideal since they can more easily adapt their data model with
a flexible schema.
•Survey responses:
Every time you gather feedback from your customers, you're collecting
unstructured data. For example, surveys with text responses and open-ended
comment fields are unstructured data.
NoSQL Database
❖ NoSQL Database is a term used to refer to a non-SQL or non-relational database.
❖ In the early 1970s, flat file systems were used. Data were stored in
flat files, and the biggest problem with flat files was that each company
implemented its own flat file format; there were no standards. It was very
difficult to store data in, and retrieve data from, the files because
there was no standard way to store data.
❖ Then the relational database was created by E.F. Codd, and these
databases answered the problem of having no standard way to
store data.
❖ But later, relational databases also ran into a problem: they could not
handle big data. This created the need for a database that could handle
such workloads, and the NoSQL database was developed.
Characteristics of NoSQL databases
1.Document stores: These databases store data as semi-structured documents, such as
JSON, BSON, or XML, and are optimized for flexible, schema-free storage.
2.Key-value stores: These databases store data as key-value pairs, and are
optimized for simple and fast read/write operations.
3.Column-family stores:
These databases store data as column families, which are sets of columns that are
treated as a single entity. They are optimized for fast and efficient querying of large
amounts of data.
4.Graph databases:
These databases store data as nodes and edges, and are designed to handle
complex relationships between data.
Graph store database
A graph store is a database that is designed to handle data with
complex relationships and interconnections.
In a graph database, data is stored as nodes and edges, where
nodes represent entities and edges represent the relationships
between those entities.
Graph stores include Neo4J and HyperGraphDB.
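The following is a minimal illustrative sketch in Python of the node/edge model (it is not the API of Neo4j or HyperGraphDB; all class, method, and data names here are made up for illustration):

```python
# A toy graph store: nodes are entities, edges are named relationships.
class Graph:
    def __init__(self):
        self.nodes = set()
        self.edges = []              # (source, relationship, target) triples

    def add_node(self, node):
        self.nodes.add(node)

    def add_edge(self, source, rel, target):
        self.edges.append((source, rel, target))

    def neighbors(self, node, rel):
        # Follow all edges of the given relationship type out of a node.
        return [t for (s, r, t) in self.edges if s == node and r == rel]

g = Graph()
for person in ("Alice", "Bob", "Carol"):
    g.add_node(person)
g.add_edge("Alice", "FRIEND_OF", "Bob")
g.add_edge("Bob", "FRIEND_OF", "Carol")
print(g.neighbors("Alice", "FRIEND_OF"))   # ['Bob']
```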
Key Features of NoSQL
•Minimizing the requirement for translation between the format in which the
information is stored and the format in which the data appears in code.
•As part of their core architecture, NoSQL databases were originally designed to
manage large amounts of information.
Types of NoSQL Database:
•Document-based databases
•Key-value stores
•Column-oriented databases
•Graph-based databases
Document-Based Database:
❖ The document-based database is a nonrelational database. Instead of storing the
data in rows and columns (tables), it uses documents to store the data in the database.
❖ A document database stores data in JSON, BSON, or XML documents.
❖ Documents can be stored and retrieved in a form that is much closer to the data
objects used in applications, which means less translation is required to use the data
in applications.
❖ In a document database, particular elements can be accessed by using the index
value that is assigned, for faster querying.

Key features of document databases:
❖ Flexible schema: Documents in the database have a flexible schema, meaning the
documents in the database need not share the same schema.
❖ Faster creation and maintenance: The creation of documents is easy, and minimal
maintenance is required once we create the document.
❖ No foreign keys: There is no dynamic relationship between two documents, so
documents can be independent of one another. Hence, there is no requirement for a
foreign key in a document database.
❖ Open formats: To build a document we use open formats such as XML and JSON.
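To make the flexible-schema idea concrete, here is a minimal sketch in Python (no particular document database is assumed; the collection and field names are illustrative): two documents live in the same logical collection even though they do not share the same fields.

```python
import json

# Two documents in the same logical collection; note that they do not
# share a schema (the second has no "phone" but adds "favorites").
customers = [
    {"_id": 1, "name": "Asha", "phone": "555-0101", "city": "Bengaluru"},
    {"_id": 2, "name": "Ravi", "favorites": ["shoes", "books"]},
]

# A document store persists each record as a self-contained JSON document.
for doc in customers:
    print(json.dumps(doc))
```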
Key-Value Stores:
❖A key-value store is a nonrelational database. The simplest form of a
NoSQL database is a key-value store. Every data element in the database
is stored in key-value pairs. The data can be retrieved by using a unique
key allotted to each element in the database. The values can be simple
data types like strings and numbers or complex objects.
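A hedged sketch of the concept in Python: a toy in-memory key-value store. The class and method names are illustrative, not any product's API.

```python
# A minimal in-memory key-value store: every element is addressed by a
# unique key, and values may be simple types or complex objects.
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)   # None if the key is absent

    def delete(self, key):
        self._data.pop(key, None)

store = KeyValueStore()
store.put("user:42", {"name": "Asha", "city": "Bengaluru"})  # complex value
store.put("hits", 7)                                          # simple value
print(store.get("user:42"))
```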
❑ This ensures availability and reliability of data. If some of the data goes offline, the rest
of the database can continue to run.
❑ The CAP theorem, originally introduced as the CAP principle, can be used to explain some of the competing
requirements in a distributed system with replication. It is a tool used to make system designers aware of the
trade-offs while designing networked shared-data systems.
❑ The three letters in CAP refer to three desirable properties of distributed systems with replicated data:
consistency (among replicated copies), availability (of the system for read and write operations) and partition
tolerance (in the face of the nodes in the system being partitioned by a network fault).
❑ The CAP theorem states that it is not possible to guarantee all three of the desirable properties –
consistency, availability, and partition tolerance at the same time in a distributed system with data
replication.
CAP Theorem
The theorem states that networked shared-data systems can only strongly support two of
the following three properties:
•Consistency –
Consistency means that all replicated copies of a data item appear the same to every node; a read returns the
most recent successful write.
•Availability –
Availability means that each read or write request for a data item will either be processed successfully or will
receive a message that the operation cannot be completed. Every non-failing node returns a response for all the
read and write requests in a reasonable amount of time. The key word here is “every”. In simple terms, every
node (on either side of a network partition) must be able to respond in a reasonable amount of time.
•Partition Tolerance –
Partition tolerance means that the system can continue operating even if the network connecting the nodes has a
fault that results in two or more partitions, where the nodes in each partition can only communicate among each
other. That means, the system continues to function and upholds its consistency guarantees in spite of network
partitions. Network partitions are a fact of life. Distributed systems guaranteeing partition tolerance can gracefully
recover from partitions once the partition heals.
CAP theorem NoSQL database types
NoSQL databases are ideal for distributed network applications.
1.CP database:
A CP database delivers consistency and partition tolerance at the expense of availability. When a
partition occurs, the system has to make the inconsistent node unavailable until the partition is
resolved.
2.AP database:
An AP database delivers availability and partition tolerance at the expense of consistency. When a
partition occurs, all nodes remain available, but nodes on the wrong side of the partition may return an
older version of the data; once the partition is resolved, the nodes resynchronize to repair the
inconsistencies.
3.CA database:
A CA database delivers consistency and availability across all nodes. It can’t do this if there is a
partition between any two nodes in the system, however, and therefore can’t deliver fault
tolerance.
In a distributed system, partitions can’t be avoided. So, while we can discuss a CA distributed
database in theory, for all practical purposes a CA distributed database can’t exist. This doesn’t
mean you can’t have a CA database for your distributed application if you need one.
Many relational databases, such as PostgreSQL, deliver consistency and availability and can
be deployed to multiple nodes using replication.
SQL:
❖ In relational database management systems, data is arranged in tables.
❖ SQL provided a standard interface to interact with relational data, allowing analysts to
connect tables by merging on common fields.
❖ Not a popular choice for unstructured data, big data, and real-time data applications.

NoSQL:
❖ NoSQL databases are non-relational, which eliminates the need for connecting tables.
❖ Their built-in sharding and high-availability capabilities ease horizontal scaling.
❖ In this era of growth within cloud, big data, and mobile and web applications, NoSQL
databases provide speed and scalability, making them a popular choice for their
performance and ease of use.
Introduction to Cassandra
• Architecture
• Gossip protocol
• Snitches, Virtual Nodes
• write consistency level and write process
• read consistency level and read data operation
• indexing, compaction
• Anti-entropy
• Tombstones
Cassandra
• Apache Cassandra is an open source distributed database management system designed
to handle large amounts of data across many commodity servers, providing high
availability with no single point of failure.
• Apache Cassandra was developed by Avinash Lakshman and Prashant Malik when
both were working as engineers at Facebook. The database was designed to
power Facebook's inbox search feature, making it easy for users to quickly find the
conversations and other content they were looking for.
• Cassandra offers robust support for clusters spanning multiple datacenters, with
asynchronous masterless replication allowing low latency operations for all clients.
1. Distributed database
• This means that you are not limited by the storage and processing capabilities
of a single machine. If the data gets larger, add more machines. If you need
more parallelism (the ability to access data in parallel/concurrently), add more
machines.
• It also means that a node going down does not mean that all the data is lost. If
a distributed mechanism is well designed, it will scale with the number of
nodes. Cassandra is one of the best examples of such a system.
2.High availability
A high-availability system is one that is ready to serve any request at any
time. High availability is usually achieved by adding redundancies. So, if
one part fails, the other part of the system can serve the request. To a
client, it seems as if everything works fine.
So, you can just deploy Cassandra over cheap commodity hardware or a
cloud environment, where hardware or infrastructure failures may occur.
3. Replication
Cassandra has a pretty powerful replication mechanism. Cassandra treats every node in the same manner.
Data need not be written on a specific server (master), and you need not wait until the data is written to all the
nodes that replicate this data (slaves).
So, there is no master or slave in Cassandra, and replication happens asynchronously. This means that
the client can be returned with success as a response as soon as the data is written on at least one server.
We can use each data center to perform different tasks without overloading the other data centers.
This is powerful support: you do not have to worry about whether users in Japan (served by a data center in
Tokyo) and users in the US (served by a data center in Virginia) are in sync or not.
Cassandra data model
Features of Cassandra
A benefit of having such a flexible data storage mechanism is that you can have an arbitrary number of cells with
customized names and have a partition key store data as a list of tuples (a tuple is an ordered set; in this case, the tuple
is a key-value pair).
This comes in handy when you have to store things such as time series: for example, if you want to use Cassandra to
store your Facebook timeline or your Twitter feed, or you want the partition key to be a sensor ID and each cell to
represent a tuple with the name as the timestamp when the data was created and the value as the data sent by the
sensor.
Unlike RDBMS, Cassandra does not have relations. This means relational logic will need to be
handled at the application level.
This also means that we may want to denormalize the database, because there are no joins, and to avoid
looking up multiple tables by running multiple queries. Denormalization is the process of adding
redundancy in data to achieve high read performance.
Types of keys
In the context of Cassandra, there are four types of keys:
• Primary key: This is the column or a group of columns that uniquely defines a row of the CQL table.
• Composite key: This is a type of primary key that is made up of more than one column. Sometimes, the composite key
is also referred to as the compound key.
• Partition key: Cassandra’s internal data representation is large rows with a unique key called the row key. It uses these
row key values to distribute data across cluster nodes. Since these row keys are used to partition data, they are called
partition keys. When you define a table with a simple key, that key is the partition key. If you define a table with a
composite key, the first term of that composite key works as the partition key. This means all the CQL rows with the
same partition key live on one machine.
• Clustering key: This is the column that tells Cassandra how the data within a partition is ordered (or clustered). This
essentially provides presorted retrieval if you know what order you want your data to be retrieved in.
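The example tables that the next two paragraphs refer to are not reproduced in these notes, so below is a hedged reconstruction using the DataStax Python driver (cassandra-driver). The keyspace and table names (demo, users, cities) and the non-key columns are assumptions; only the key shapes come from the text: id as a simple primary key, and (country, state) as a composite key.

```python
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])   # assumes a local single-node Cassandra
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

# Simple key: id alone is the primary key, so id is also the partition key.
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.users (
        id int PRIMARY KEY,
        name text
    )
""")

# Composite key: country is the partition key and state is the clustering
# key, so rows within a country partition are stored sorted by state.
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.cities (
        country text,
        state text,
        population bigint,
        PRIMARY KEY (country, state)
    )
""")
```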
In the first example table, id is the primary key and also the partition key. There is no
clustering key: it is a simple key.
In the second example table, we have a composite key that uses country and state to uniquely
define a CQL row.
The country column is the partition key, so all the rows with the same country will
belong to the same node/machine. The rows within a partition will be sorted by the state
names.
So, when you query for states in the US, you will encounter the row with California before
the one with New York.
Cassandra Architecture
RDBMS has been tremendously successful in the last 40 years (the relational data model was proposed in its first
incarnation by E.F. Codd (1970) in his research paper, “A Relational Model of Data for Large Shared Data Banks”).
However, in the early 2000s, big companies such as Google and Amazon, which had gigantic loads on their databases
to serve, started to feel bottlenecked by RDBMS, even with helper services such as Memcache on top of them.
As a response to this, Google came up with BigTable, and Amazon with Dynamo.
While using RDBMS for a complicated web application, we face problems such as slow queries due to complex joins,
expensive vertical scaling, and problems in horizontal scaling. As the data grows, indexing also takes a long time.
At some point, you may choose to replicate the database, but there is still some locking, and this hurts the
availability of the system. This means that under a heavy load, locking will cause the user’s experience to
deteriorate.
NoSQL
NoSQL is a blanket term for the databases that solve the scalability issues which are
common among relational databases. This term, in its modern meaning, was first coined
by Eric Evans
NoSQL solutions provide scalability and high availability, but may not guarantee
ACID: atomicity, consistency, isolation, and durability in transactions.
Many of the NoSQL solutions, including Cassandra, sit on the other extreme of
ACID, named BASE, which stands for basically available, soft-state, eventual
consistency
Architecture of Cassandra
Cassandra is a distributed, decentralized, fault tolerant, eventually consistent, linearly scalable, and
column-oriented data store.
Cassandra is made to easily deploy over a cluster of machines located at geographically different places.
There is no central master server, so there is no single point of failure and no bottleneck; data is replicated,
and a faulty node can be replaced without any downtime.
It is linearly scalable, which means that with more nodes, the requests served per second per node will not
go down.
Also, the total throughput of the system will increase with each node being added. And finally, it’s column
oriented, much like a map (or better, a map of sorted maps) or a table with flexible columns where each
column is essentially a key-value pair
Architecture of Cassandra
Cassandra has a tabular data presentation, but it is more like a dictionary-like structure where each entry holds
another sorted dictionary/map.
This model is more powerful than the usual key-value store and it is named a table, formerly known as a column
family.
• Each row is identified by a row key with a number (token), and unlike spreadsheets, each cell may have
its own unique name within the row.
• In Cassandra, the columns in the rows are sorted by this unique column name.
• Also, since the number of partitions is allowed to be very large (1.7 × 10^38), it distributes the rows almost
uniformly across all the available machines by dividing the rows into equal token groups.
• Tables or column families are contained within a logical container or name space called keyspace.
• The maximum number of cells per partition is limited by the Java integer’s max value, which is about 2
billion. So, one partition can hold a maximum of 2 billion cells.
• When you define a table with a primary key that has just one column, the primary key also serves as the
partition key.
• But when you define a composite primary key, the first column in the definition of the primary key works
as the partition key.
• So, all the rows (bunches of cells) that belong to one partition key go into one partition. This means that
every partition can have a maximum of X rows, where X = 2 × 10^9 / number_of_columns_in_a_row. For
example, with 10 columns per CQL row, a partition can hold at most about 200 million rows.
Gossip protocol
Cassandra uses the gossip protocol for inter-node communication.
It’s a way for nodes to build the global map of the system with a small number of local interactions.
Cassandra uses gossip to find out the state and location of other nodes in the ring (cluster).
The gossip process runs every second and exchanges information with, at the most, three other nodes in the cluster.
Nodes exchange information about themselves and other nodes that they come to know about via some other gossip
session. This causes all the nodes to eventually know about all the other nodes.
Gossip messages have a version number associated with them. So, whenever two nodes gossip, the older information
about a node gets overwritten with newer information.
Cassandra uses an anti-entropy version of the gossip protocol that utilizes Merkle trees to repair unread data.
A node X sends a syn message to a node Y to initiate gossip. Y, on receipt of this syn message, sends an ack message
back to X. To reply to this ack message, X sends an ack2 message to Y completing a full message round.
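As an illustration of the idea (not Cassandra’s actual implementation), here is a small Python sketch of versioned gossip: each node keeps a (version, status) entry per peer, and on every exchange older entries are overwritten by newer ones, so every node eventually learns about every other node.

```python
import random

class Node:
    def __init__(self, name):
        self.name = name
        self.state = {name: (1, "UP")}   # node -> (version, status)

    def merge(self, peer_state):
        for node, (version, status) in peer_state.items():
            if node not in self.state or self.state[node][0] < version:
                self.state[node] = (version, status)   # newer info wins

def gossip_round(nodes):
    for node in nodes:
        # Exchange with up to three random peers, as Cassandra does each second.
        for peer in random.sample([n for n in nodes if n is not node], k=3):
            peer.merge(node.state)   # "syn": push our view to the peer
            node.merge(peer.state)   # "ack": merge the peer's view back

cluster = [Node(f"n{i}") for i in range(6)]
for _ in range(4):
    gossip_round(cluster)
print(sorted(cluster[0].state))   # eventually lists every node in the ring
```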
Syn and ack messages together are known as a message handshake: a mechanism that allows two machines trying to
communicate with each other to negotiate the parameters of the connection before transmitting data. Syn stands for
“synchronize packet” and ack stands for “acknowledge packet”.
The Gossiper module is linked to failure detection. The module, on hearing one of these messages,
updates the failure detector with the liveness information that it has gained.
If it hears GossipShutdownMessage, the module marks the remote node as dead in the failure detector. The
node to be gossiped with is chosen based on the following rules:
• Gossip to a random live endpoint
• Gossip to a random unreachable endpoint
• If the node in point 1 was not a seed node or the number of live nodes is less than the number of seeds,
gossip to a random seed
Ring representation
Virtual nodes
In Cassandra 1.1 and previous versions, when you create a cluster or add a node, you manually assign its initial
token; this is extra work that the database should ideally handle internally.
Apart from this, adding and removing nodes requires manually resetting token ownership for some or all nodes.
This is called rebalancing.
Yet another problem was replacing a node. When replacing a node with a new one, the data (the rows that
the to-be-replaced node owns) has to be copied to the new machine from a replica of the old machine.
For a large database, this can take a while, because we are streaming from one machine.
To solve all these problems, Cassandra 1.2 introduced virtual nodes (vnodes).
• With vnodes, each machine can have multiple smaller ranges and these ranges are
automatically assigned by Cassandra.
• Suppose you have a 30-node cluster and a node with 256 vnodes has to be replaced.
• If the vnodes are well-distributed randomly across the cluster, each of the remaining 29
physical nodes will have 8 or 9 vnodes (256/29) that are replicas of vnodes on the
dead node. In older versions, with a replication factor of 3, the data had to be streamed
from three replicas (10 percent utilization). In the case of vnodes, all the nodes can
participate in helping the new node get up.
• The other benefit of using vnodes is that you can have a heterogeneous ring where some
machines are more powerful than others, and change the vnodes’ settings such that the
stronger machines will take proportionally more data than others.
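A hedged sketch of the vnode idea in Python: each physical machine owns many small token ranges on a hash ring, and a row key lands on the machine that owns the next vnode token on the ring. The hash function and vnode count here are illustrative choices, not Cassandra’s (Cassandra uses the Murmur3 partitioner and a configurable num_tokens setting).

```python
import bisect
import hashlib

VNODES_PER_MACHINE = 8            # illustrative; Cassandra's num_tokens differs

def token(value: str) -> int:
    # Illustrative token function; Cassandra uses Murmur3, not MD5.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

machines = ["A", "B", "C"]
ring = sorted((token(f"{m}-{i}"), m)
              for m in machines for i in range(VNODES_PER_MACHINE))
tokens = [t for t, _ in ring]

def owner(row_key: str) -> str:
    # The row goes to the machine owning the first vnode at or after its token.
    idx = bisect.bisect(tokens, token(row_key)) % len(ring)
    return ring[idx][1]

print(owner("sensor-17"), owner("sensor-18"))
```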
Compaction
A read may require Cassandra to read across multiple SSTables to get a result. This is wasteful, costs
multiple (disk) seeks, may require conflict resolution, and if there are too many SSTables, it may slow down the
read.
Off the shelf, Cassandra offers two types of compaction mechanisms: the size-tiered compaction strategy and the
leveled compaction strategy.
The compaction process starts when the number of SSTables on disk reaches a certain threshold (configurable).
Although the merge process is a little I/O intensive, it benefits in the long term with a lower number of disk seeks
during reads.
Benefits of compaction:
Removal of expired tombstones (Cassandra v0.8+)
Merging row fragments
Rebuilding primary and secondary indexes
For example, let’s say you have a compaction threshold set to four.
When the number of SSTables surpasses the threshold, the compaction thread triggers. This compacts the four
equal-sized SSTables into one.
Temporarily, you will have two times the total SSTable data on your disk. Another thing to note is that SSTables
that get merged have the same size. So, when the four SSTables get merged to give a larger SSTable of size, say G,
the buckets for the rest of the to-be-filled SSTables will be G each.
So, the next compaction will take an even larger space while merging. The SSTables, after merging, are marked as
deletable. They get deleted at a garbage collection cycle of the JVM, or when Cassandra restarts. The compaction
process happens on each node and does not affect other nodes. This is called minor compaction. This is
automatically triggered, system controlled, and regular.
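To illustrate the “merging row fragments” step, here is a simplified Python sketch (not Cassandra’s code): several sorted SSTables are merged, and where the same row key appears in more than one table, the fragment with the newest timestamp wins. The data shown is made up for illustration.

```python
import heapq

def compact(sstables):
    # Each SSTable is a list of (key, timestamp, value), sorted by key.
    merged = {}
    for key, timestamp, value in heapq.merge(*sstables):
        if key not in merged or merged[key][0] < timestamp:
            merged[key] = (timestamp, value)   # newest fragment wins
    # The compacted output is again one table sorted by key.
    return [(k, ts, v) for k, (ts, v) in sorted(merged.items())]

sstable1 = [("a", 1, "x"), ("c", 1, "old")]
sstable2 = [("b", 2, "y"), ("c", 3, "new")]   # newer fragment of row "c"
print(compact([sstable1, sstable2]))
# [('a', 1, 'x'), ('b', 2, 'y'), ('c', 3, 'new')]
```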
Cassandra is designed to distribute and manage large data loads across multiple nodes in
a cluster constituted of commodity hardware.
3.Replication factor:
It determines the number of copies of data (replicas) that will be stored across nodes in a cluster.
4. Anti-Entropy and Read Repair:
When a client connects to any node in the cluster to read data, the key question is how
many nodes will be read before responding to the client; this is determined by the
consistency level specified by the client.
If the consistency level is not met, the read operation blocks, as a few of the nodes may
respond with an out-of-date value.
For repairing unread data, Cassandra uses an anti-entropy version of the gossip
protocol, which involves comparing all the replicas of each piece of data and updating each
replica to the newest version.
5. Writes in Cassandra:
Assume a client initiates a write request. Where does the write get written to? It is first
written to the commit log. A write is taken as successful only if it is written to the
commit log.
The next step is to push the write to a memory resident data structure called Memtable.
When the number of objects stored in the Memtable reaches a defined threshold value,
the contents of Memtable are flushed to the disk in a file called SSTable(Sorted String
Table).
It is possible to have multiple Memtables for a single column family: one of them is
current, and the rest are waiting to be flushed.
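A minimal Python sketch of the write path just described, under the simplifying assumptions that the commit log is an in-memory list and the flush threshold simply counts stored objects:

```python
class WritePath:
    FLUSH_THRESHOLD = 3              # illustrative object-count threshold

    def __init__(self):
        self.commit_log = []         # durable append-only log (simplified)
        self.memtable = {}
        self.sstables = []           # each flush produces one sorted table

    def write(self, key, value):
        self.commit_log.append((key, value))   # success once this is logged
        self.memtable[key] = value
        if len(self.memtable) >= self.FLUSH_THRESHOLD:
            # Flush: the memtable becomes an immutable sorted SSTable.
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

db = WritePath()
for i in range(4):
    db.write(f"k{i}", i)
print(db.sstables, db.memtable)
```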
Hinted Handoffs:
Cassandra works on the philosophy that it will always be available for writes.
Assume there are 3 nodes A, B, and C in a cluster, and node C is down for some reason.
We maintain a replication factor of 2.
The client makes a write request to node A.
Node A is the coordinator and serves as a proxy between the client and the nodes on which
the replica is to be placed (B and C).
Since C is down, A writes the replica to B and stores a hint for C locally; when C comes
back up, A hands off the hinted write to it so that C catches up.
Consistency Level (CL): is the number of replica nodes that must acknowledge a read or write request
for the whole operation/query to be successful.
Write CL controls how many replica nodes must acknowledge that they received and wrote the
partition.
Read CL controls how many replica nodes must send their most recent copy of the partition to the
coordinator.
Write Consistency
Write consistency means having consistent data (immediately or eventually) after your write query to your Cassandra
cluster. You can tune the write consistency for performance (by setting the write CL to ONE) or for immediate
consistency on critical pieces of data (by setting the write CL to ALL). Following is how it works:
The coordinator forwards the write request (INSERT, UPDATE, or DELETE) to all replica nodes, whatever write CL
you have set.
The coordinator waits for n number of replica nodes to respond. n is set by the write CL.
The coordinator sends the response back to the client.
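For example, with the DataStax Python driver (cassandra-driver), the write CL can be set per statement; the keyspace and table here (demo.users) are assumed example names carried over from the earlier sketch:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])          # assumes a reachable cluster
session = cluster.connect()

# Write CL = ALL: the coordinator waits for every replica to acknowledge,
# giving immediate consistency at the cost of availability and latency.
insert = SimpleStatement(
    "INSERT INTO demo.users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ALL,
)
session.execute(insert, (1, "Asha"))
```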
The strong consistency formula is R + W > RF, where R stands for read consistency, W stands for write
consistency, and RF stands for the replication factor. Consistency is deemed weak if R + W <= RF. For
example, with RF = 3 and QUORUM reads and writes, R = W = 2 and 2 + 2 > 3, so every read overlaps
the latest write in at least one replica.
Cassandra offers five consistency levels:
•Consistency Level ONE − For the read or write operation to be regarded as successful, just one node needs
to acknowledge it. This category offers the least consistency guarantee but the most availability and
performance. This level is appropriate for applications where data integrity is not crucial, such as tracking
user activity.
Read Consistency
Read consistency refers to having the same data on all replica nodes for any read request. Following is how it works:
Read CL = ONE gives you the benefit of speed: Cassandra contacts only the one closest/fastest replica node, so the
response time of the read request will be lower and performance will be higher. Also, it might so happen that 2 out of
3 replica nodes are down or failing, and you will still get a result because CL = ONE, so you have the highest
availability.
For all these benefits, the price you pay is lower consistency: your consistency guarantees are much weaker.
Read CL = QUORUM (Cassandra contacts a majority of the replica nodes) gives you a nice balance: high-performance
reads, good availability, and good throughput.
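The same per-statement tuning applies to reads; a hedged example with the DataStax Python driver, again assuming the example demo.users table:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

# Read CL = QUORUM: a majority of the replicas must answer, balancing
# consistency against availability and latency.
select = SimpleStatement(
    "SELECT id, name FROM demo.users WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
row = session.execute(select, (1,)).one()
print(row.id, row.name)
```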
Anti-entropy
Cassandra promises eventual consistency, and read repair is the process that delivers this. Read
repair is the process of fixing inconsistencies among the replicas at the time of a read.
Example:
Let’s say we have three replica nodes, A, B, and C, that contain a data X. During an update, X
is updated to X1 in replicas A and B, but it fails in replica C for some reason.
On a read request for data X, the coordinator node asks for a full read from the nearest node
(based on the configured snitch) and digests of data X from the other nodes needed to satisfy the
consistency level. The coordinator node compares these values (something like digest(full_X) ==
digest_from_node_C).
If it turns out that the digests are the same as the digests of the full read, the system is
consistent and the value is returned to the client. On the other hand, if there is a mismatch, full
data is retrieved and reconciliation is done and the client is sent the reconciled value. After this,
in the background, all the replicas are updated with the reconciled value to have a consistent
view of data on each node
The following figure shows this process: the client queries for data X from a coordinator node C;
C gets data from replicas R1, R2, and R3 and reconciles them; the reconciled data is sent to the
client; and if there is a mismatch across replicas, a repair is invoked.
What if the node containing hinted handoff data dies, and the data that contains the hint is
never read? Is there a way to fix the replicas without a read?
Anti-entropy compares all the replicas of a column family and updates the replicas to the
latest version.
This happens during major compaction. It uses Merkle trees to determine discrepancies
among the replicas and fixes them.
Merkle tree
A Merkle tree is a hash tree where the leaves hold hashes of the actual data in the column family,
and non-leaf nodes hold hashes of their children.
The unique advantage of a Merkle tree is that a whole subtree can be validated just by
looking at the value of the parent node.
So, if nodes on two replica servers have the same hash values, then the underlying data is
consistent and there is no need to synchronize.
If one node passes the whole Merkle tree of a column family to another node, the receiving node
can determine all the inconsistencies.
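A minimal Python sketch of this comparison (illustrative, not Cassandra’s implementation): build a Merkle root over a replica’s rows and compare roots; equal roots mean the replicas are consistent, a mismatch means some subtree needs repair. The row data is made up for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(rows):
    # Leaves hash the actual data; parents hash the concatenation of children.
    level = [h(r.encode()) for r in rows]
    while len(level) > 1:
        if len(level) % 2:                  # duplicate the last node if odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

replica_a = ["row1=x", "row2=y", "row3=z", "row4=w"]
replica_b = ["row1=x", "row2=y", "row3=STALE", "row4=w"]
print(merkle_root(replica_a) == merkle_root(replica_a))   # True: consistent
print(merkle_root(replica_a) == merkle_root(replica_b))   # False: needs repair
```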
The following figure shows the Merkle tree to determine a mismatch in hash values at
the parent nodes due to the difference in the underlying data:
Tombstones
• Cassandra is a complex system with its data distributed among commit logs, MemTables, and SSTables on a node. The
same data is then replicated over replica nodes.
• Deletion, to an extent, follows an update pattern, except Cassandra tags the deleted data with a special value, and marks it as a
tombstone. In Cassandra, deleted data is not immediately purged from the disk. Instead, Cassandra writes a special value, known as a
tombstone, to indicate that data has been deleted. Tombstones prevent deleted data from being returned during reads, and will
eventually allow the data to be dropped via compaction.
• This marker helps future queries, compaction, and conflict resolution. Let’s step further down and see what happens
when a column from a column family is deleted.
• A client connected to a node (a coordinator node may not be the one holding the data that we are going to mutate),
issues a delete command for a column C, in a column family CF.
• If the consistency level is satisfied, the delete command gets processed. When a node containing the row key receives a
delete request, it updates or inserts the column in the MemTable with a special value, namely a tombstone.
• The tombstone basically has the same column name as the previous one; the value is set to the Unix epoch, and the
timestamp is set to what the client has passed. When a MemTable is flushed to an SSTable, all tombstones go into it as
any regular column would.
For immediate deletion, i.e., without waiting for any period of time, Cassandra supports the
‘DROP KEYSPACE’ and ‘DROP TABLE’ statements
Besides these two statements, there are two other methods of deletion, as shown in Figure 3,
in which immediate deletion does not take place. These methods are: (1) the user issues a delete
command, and (2) the user marks a record (row/column) with a TTL (time to live).
In the 1st method, when the delete command is issued, a tombstone, which marks the
record for deletion, gets added to the particular record. The tombstone is then written to
the SSTable. When the built-in grace period (expressed as gc_grace_seconds) expires,
the tombstone is deleted. The default value for the grace period is 864,000 seconds (ten
days).
However, each table can have its own value for the grace period: gc_grace_seconds is a
per-table property set in the table's schema, and it can be inspected there.
In the 2nd method, the user marks a row/column with a TTL value. In order to identify the
TTL associated with a record, one can use the TTL() function. The TTL function takes one
argument, a column name, and returns the remaining TTL associated with that column. After the
TTL value expires, that particular record is marked with a tombstone, and this tombstone is
then written to the SSTable. The tombstone is deleted when the grace period expires.
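For example, with the DataStax Python driver, a column can be written with a TTL and the remaining TTL read back with the CQL TTL() function (demo.users is an assumed example table from the earlier sketches):

```python
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

# Write a column with a one-day TTL; after 86,400 seconds the column is
# marked with a tombstone and later purged by compaction.
session.execute(
    "INSERT INTO demo.users (id, name) VALUES (%s, %s) USING TTL 86400",
    (2, "Ravi"),
)

# TTL() returns the remaining time to live, in seconds, for the column.
row = session.execute(
    "SELECT TTL(name) FROM demo.users WHERE id = %s", (2,)
).one()
print(row[0])
```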
In order to identify whether a record has a tombstone associated with it, it is required to
check if the associated SSTable dump output of the corresponding partition has the
‘deletion_info’ tombstone marker. The ‘deletion_info’ tombstone marker present in the
SSTable dump output includes a ‘marked_deleted’ timestamp which indicates the time at
which a record was requested to be deleted as well as a ‘local_delete_time’ timestamp
indicating the local time at which the Cassandra server received a request to delete a record.
Hence, the ‘marked_deleted’ timestamp is specified by the user or the user application, and
the ‘local_delete_time’ timestamp is set by the Cassandra server.
The SSTable event log dump has the ‘deletion_info’ associated with a deleted record. To
read the log dump of an SSTable, the ‘cassandra-tools’ package can be installed; ‘nodetool’ can then
be used to create a snapshot, and consequently, an ‘sstabledump’ can be generated and stored
in a file. An analyst can access this log dump to identify information pertaining to the
deletion of a record.