NOSQL


NoSQL

A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for
storage and retrieval of data that is modeled in means other than the tabular relations used in
relational databases (Wikipedia).

NoSQL technology was pioneered by leading internet companies, including Google, Facebook,
Amazon, and LinkedIn, to overcome the limitations of 40-year-old relational database technology for
use with modern web applications. Today, enterprises are adopting NoSQL for a growing number of
use cases, a choice that is driven by four interrelated megatrends: Big Users, Big Data, the Internet of
Things, and Cloud Computing.

A traditional database product prefers predictable, structured data. As data or processing
requirements grow, a relational database may require vertical, and sometimes horizontal, expansion
of its servers.

An alternative, more cloud-friendly approach is to employ NoSQL. The load can grow easily because
it is distributed over many ordinary, inexpensive Intel-based servers. A NoSQL database is exactly
the type of database that can handle the sort of unstructured, messy and unpredictable data that
systems of engagement require.
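
As a rough illustration of how load can be spread across many inexpensive servers, the following
Python sketch assigns each key to a server by hashing it. The function names (shard_for,
distribute) and the simple modulo scheme are assumptions made for this example; they do not
describe how any particular NoSQL product places data.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_servers: int) -> int:
    """Map a key to a server index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def distribute(keys, num_servers: int) -> Counter:
    """Count how many keys land on each server index."""
    return Counter(shard_for(key, num_servers) for key in keys)


keys = [f"user:{i}" for i in range(10_000)]
small_cluster = distribute(keys, 4)   # four servers share the keys
large_cluster = distribute(keys, 8)   # adding servers lowers the load per server

print(sorted(small_cluster.items()))
print(sorted(large_cluster.items()))
```

Plain modulo hashing moves most keys to a different server whenever the server count changes, so
distributed stores generally rely on more careful schemes such as consistent hashing. The sketch
only shows the basic idea of dividing keys among machines.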

NoSQL is a whole new way of thinking about a database. NoSQL is not a relational database. The
reality is that a relational database model may not be the best solution for all situations. The easiest
way to think of NoSQL is as a database that does not adhere to the traditional relational
database management system (RDBMS) structure. Sometimes you will also see it referred to as 'not
only SQL'.

A NoSQL database is typically not built on tables and often does not employ SQL to manipulate
data. It also may not provide full ACID (atomicity, consistency, isolation, durability) guarantees,
but it usually still has a distributed and fault-tolerant architecture.

The NoSQL taxonomy includes key-value stores, document stores, BigTable-style column stores, and
graph databases.

MongoDB, for example, uses a document model, in which a document can be thought of as a row in an
RDBMS. Documents, each a set of fields (key-value pairs), map nicely to programming language data
types. A MongoDB database holds collections, and each collection is a set of documents. Embedded
documents and arrays reduce the need for joins, which is key for high performance and speed.
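
To make the point about embedded documents and joins concrete, here is a small sketch that uses
plain Python dictionaries rather than a real MongoDB driver. The customer, order and item data, and
the helper names skus_via_joins and skus_via_document, are invented for illustration.

```python
# Relational-style layout: three separate "tables" linked by keys.
customers = [{"id": 1, "name": "Ada"}]
orders = [{"id": 10, "customer_id": 1}, {"id": 11, "customer_id": 1}]
order_items = [
    {"order_id": 10, "sku": "A-100", "qty": 2},
    {"order_id": 11, "sku": "B-200", "qty": 1},
]


def skus_via_joins(customer_id):
    """Collect a customer's SKUs by joining orders to order_items."""
    order_ids = {o["id"] for o in orders if o["customer_id"] == customer_id}
    return sorted(i["sku"] for i in order_items if i["order_id"] in order_ids)


# Document-style layout: one document with embedded arrays and sub-documents.
customer_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [
        {"id": 10, "items": [{"sku": "A-100", "qty": 2}]},
        {"id": 11, "items": [{"sku": "B-200", "qty": 1}]},
    ],
}


def skus_via_document(doc):
    """Collect the same SKUs by walking one self-contained document."""
    return sorted(item["sku"] for order in doc["orders"] for item in order["items"])


print(skus_via_joins(1))
print(skus_via_document(customer_doc))
```

Both functions return the same list, but the document version reads everything from a single
record, which is the property the paragraph above credits for performance.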

Why NoSQL? It offers high performance with high availability, a rich query language, and easy
scalability.

NoSQL is gaining momentum, and is supported by Hadoop, MongoDB and others. The NoSQL
Database site is a good reference for someone looking for more information.

Types of NoSQL Databases:

There are four general types of NoSQL databases, each with its own specific attributes:
● Key-Value store – we start with this type of database because these are some of the least
complex NoSQL options. These databases are designed for storing data in a schema-less way.
In a key-value store, every item consists of an indexed key and a value, hence the
name. Examples of this type of database include: Cassandra (more commonly classified as a
wide-column store), DynamoDB, Azure Table Storage (ATS), Riak, BerkeleyDB.
● Column store – (also known as wide-column stores) these databases are designed for storing
data tables as sections of columns of data rather than as rows of data. While this simple
description sounds like the inverse of a standard database, wide-column stores offer very high
performance and a highly scalable architecture. Examples include: HBase, BigTable and
HyperTable.
● Document database – expands on the basic idea of key-value stores: the values are “documents”
with a more complex internal structure, and each document is assigned a unique key, which is
used to retrieve it. These databases are designed for storing, retrieving, and
managing document-oriented information, also known as semi-structured data. Examples
include: MongoDB and CouchDB.
● Graph database – based on graph theory, these databases are designed for data whose
relations are well represented as a graph, with elements that are interconnected and an
undetermined number of relations between them. Examples include: Neo4j and Polyglot.
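
The four types listed above can be sketched side by side with ordinary Python data structures.
This is only an illustration of the data shapes as this section describes them. The sample users
and the helper functions name_from_kv, cities and followers_of are assumptions made for the
example, not the storage format of any listed product.

```python
import json

# Key-value: an opaque value looked up by its key.
kv_store = {
    "user:1": json.dumps({"name": "Ada", "city": "London"}),
    "user:2": json.dumps({"name": "Linus", "city": "Helsinki"}),
}

# Column store: values grouped by column first, then by row key.
column_store = {
    "name": {"user:1": "Ada", "user:2": "Linus"},
    "city": {"user:1": "London", "user:2": "Helsinki"},
}

# Document store: each key maps to a structured, queryable document.
document_store = {
    "user:1": {"name": "Ada", "city": "London", "tags": ["math"]},
    "user:2": {"name": "Linus", "city": "Helsinki", "tags": ["kernel"]},
}

# Graph: nodes plus directed edges; here user:1 "follows" user:2.
graph_edges = {"user:1": {"user:2"}, "user:2": set()}


def name_from_kv(key):
    """The store returns the whole value; the application decodes it."""
    return json.loads(kv_store[key])["name"]


def cities():
    """Reading one column touches only that column's values."""
    return sorted(column_store["city"].values())


def followers_of(node):
    """Follow edges backwards to find who points at a node."""
    return sorted(src for src, targets in graph_edges.items() if node in targets)


print(name_from_kv("user:1"), cities(), followers_of("user:2"))
```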

The following table lays out some of the key attributes that should be considered when evaluating
NoSQL databases.

Data Model        Performance   Scalability       Flexibility   Complexity   Functionality
Key-value store   High          High              High          None         Variable (None)
Column Store      High          High              Moderate      Low          Minimal
Document Store    High          Variable (High)   High          Low          Variable (Low)
Graph Database    Variable      Variable          High          High         Graph Theory

Examples of NoSQL:

1. MongoDB

MongoDB was designed to support humongous databases. It's a NoSQL database with
document-oriented storage, full index support, replication and high availability, and more.
Commercial support is available through MongoDB, Inc. (formerly 10gen). Supported operating
systems for MongoDB are Windows, Linux, OS X and Solaris.
MongoDB is a cross-platform, document-oriented database system and currently the most popular
NoSQL database. It ditches the rigid schemas of an RDBMS in favor of BSON, a binary form of JSON
documents, with dynamic schemas, giving data miners a lot of power.
You’ll find MongoDB at work in Shutterfly’s photo platform, eBay’s search suggestion, Forbes’s
storage system and MetLife’s “The Wall.”
Even better, it has recently been updated. Its features include:

● A faster JavaScript engine (V8)
● Text search (beta) and geospatial capabilities
● Concurrent index builds

2. Cassandra

Cassandra is a NoSQL database. Apache’s popular key-value oriented database management system is
built to juggle large amounts of data across multiple commodity servers. It prides itself on
availability, scalability and fault-tolerance, and avoids bottlenecks and single points of failure.
Cassandra was originally developed at Facebook and is now managed by the Apache Software
Foundation. It's used by many organizations with large, active datasets, including Netflix,
Twitter, Urban Airship, Constant Contact, Reddit, Cisco and Digg. Commercial support and services
are available through third-party vendors. Cassandra is OS independent.
Because data is automatically replicated to multiple nodes, failed nodes can be replaced with no
downtime. That’s good news for data-critical applications.
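
The effect of replicating data to multiple nodes can be sketched with a toy in-memory cluster. The
TinyCluster class below is hypothetical and greatly simplified: it is not Cassandra's placement or
consistency logic, only a demonstration that a read still succeeds when some replicas are down.

```python
import hashlib


class TinyCluster:
    """Toy cluster: each key is written to several consecutive nodes."""

    def __init__(self, node_names, replication_factor=3):
        self.node_names = list(node_names)
        self.replication_factor = replication_factor
        self.storage = {name: {} for name in self.node_names}
        self.down = set()

    def replicas_for(self, key):
        """Pick a starting node by hash, then take the next nodes in order."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.node_names)
        count = min(self.replication_factor, len(self.node_names))
        return [self.node_names[(start + i) % len(self.node_names)] for i in range(count)]

    def put(self, key, value):
        """Write the value to every replica that is currently up."""
        for node in self.replicas_for(key):
            if node not in self.down:
                self.storage[node][key] = value

    def get(self, key):
        """Read from the first live replica that holds the key."""
        for node in self.replicas_for(key):
            if node not in self.down and key in self.storage[node]:
                return self.storage[node][key]
        raise KeyError(key)

    def fail(self, node):
        """Mark a node as unavailable."""
        self.down.add(node)


cluster = TinyCluster(["n1", "n2", "n3", "n4"], replication_factor=3)
cluster.put("inbox:42", "hello")
cluster.fail(cluster.replicas_for("inbox:42")[0])  # lose one replica
print(cluster.get("inbox:42"))                     # still served by another replica
```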

Cassandra started its life at Facebook, when Avinash Lakshman and Prashant Malik created it to
power the Inbox Search feature. Today it’s working for Netflix, eBay, Twitter, Reddit, Ooyala and
more. The largest known Cassandra cluster holds over 300 terabytes of data on over 400 machines.

3. OrientDB

OrientDB, a NoSQL DBMS written in Java, provides the schema-less flexibility of document
databases, the complexity of the graph model with direct relationships among document records, and
object orientation for added power and flexibility.
In addition to schema-less mode, OrientDB also functions in schema-full or hybrid mode. It ensures
reliability with ACID transactions and multi-master replication, and it is fast, with a reported
rate of 150,000 records stored per second on ordinary hardware.

4. CouchDB

Apache’s NoSQL database uses a trio of components that make it extremely Web-friendly:

● JSON for documents
● JavaScript for MapReduce queries
● HTTP for an API

CouchDB is an open source database that focuses on ease of use and on being "a database that
completely embraces the web". CouchDB works well with data that accumulates, changes
occasionally, is served by pre-defined queries, and needs versioning as a priority (e.g., CRM, CMS
systems).
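
The idea of a pre-defined MapReduce query over JSON documents can be sketched as follows. In
CouchDB itself the map and reduce functions are written in JavaScript and the results are fetched
over HTTP. This Python stand-in only mirrors the concept, and the sample documents and function
names (map_contact_by_company, reduce_sum, build_view) are invented for the example.

```python
from collections import defaultdict

# JSON-like documents of mixed types, as in a CRM-style database.
documents = [
    {"_id": "c1", "type": "contact", "company": "Acme"},
    {"_id": "c2", "type": "contact", "company": "Acme"},
    {"_id": "c3", "type": "contact", "company": "Globex"},
    {"_id": "n1", "type": "note", "text": "call back"},
]


def map_contact_by_company(doc):
    """Map step: emit (key, value) pairs for documents of interest."""
    if doc.get("type") == "contact":
        yield doc["company"], 1


def reduce_sum(values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def build_view(docs, map_fn, reduce_fn):
    """Run the map step over every document, then reduce per key."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


view = build_view(documents, map_contact_by_company, reduce_sum)
print(view)
```

Because the query is fixed in advance, its result can be computed once and updated as documents
change, which fits the "pre-defined queries" workload described above.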

5. Couchbase

Couchbase has a lot going on nowadays. It has had two recent updates: 2.0 added
document database capability, and 2.1 gave it cross-data center replication and improved storage
performance. As of 2015, its active open-source projects include Couchbase Client SDKs, Couchbase
Mobile, Couchbase Labs and its flagship NoSQL database, Couchbase Server.

Couchbase Server, originally known as Membase, is an open-source, distributed (shared-nothing
architecture) NoSQL document-oriented database that is optimized for interactive applications. These
applications must serve many concurrent users by creating, storing, retrieving, aggregating,
manipulating and presenting data. In support of these kinds of application needs, Couchbase is
designed to provide easy-to-scale key-value or document access with low latency and high sustained
throughput. It is designed to be clustered from a single machine to very large-scale deployments
spanning many machines.

Key considerations when choosing your NoSQL platform:

● Workload diversity – Big Data comes in all shapes, colors and sizes. Rigid schemas have no
place here; instead you need a more flexible design. You want your technology to fit your
data, not the other way around. And you want to be able to do more with all of that data:
perform transactions in real time, run analytics just as fast and find anything you want in an
instant from oceans of data, no matter what form that data may take.
● Scalability – With big data you want to be able to scale very rapidly and elastically, whenever
and wherever you want. This applies to all situations, whether you are scaling across multiple
data centers or even out to the cloud.
● Performance – As has already been discussed, in an online world where nanosecond delays
can cost you sales, Big Data must move at extremely high velocities no matter how much you
scale or what workloads your database must perform. Performance of your environment,
namely your applications, should be high on the list of requirements for deploying a NoSQL
platform.
● Continuous Availability – Building on the performance consideration, when you rely on
big data to feed your essential, revenue-generating 24/7 business applications, even high
availability is not high enough. Your data can never go down, so there should be no
single point of failure in your NoSQL environment, ensuring that applications are always
available.
● Manageability – Operational complexity of a NoSQL platform should be kept at a
minimum. Make sure that the administration and development required to both maintain and
maximize the benefits of moving to a NoSQL environment are achievable.
● Cost – This is certainly a glaring reason for making the move to a NoSQL platform, as
meeting even one of the considerations presented here with relational database technology can
become prohibitively expensive. Deploying NoSQL properly allows for all of the benefits
above while also lowering operational costs.
● Strong Community – This is perhaps one of the more important factors to keep in mind
as you move to a NoSQL platform. Make sure there is a solid and capable community around
the technology, as this will provide an invaluable resource for the individuals and teams that
will be managing the environment. Involvement on the part of the vendor should not
only include strong support and technical resource availability, but also consistent outreach to
the user base. Good local user groups and meetups will provide many opportunities for
communicating with other individuals and teams that will provide great insight into how to
work best with the platform of choice.

Applications of NoSQL:

• Advantages of NoSQL Databases over Relational Databases

• The Growth of Big Data

• Continuous Data Availability

• Real Location Independence

• Modern Transactional Capabilities

• Flexible Data Models

• Better Architecture

• Analytics and Business Intelligence

Challenges of NoSQL:

The promise of NoSQL databases has generated a lot of enthusiasm, but there are many obstacles
to overcome before they can appeal to mainstream enterprises. Here are a few of the top challenges.

● Maturity: RDBMS systems have been around for a long time. NoSQL advocates will argue that
their advancing age is a sign of their obsolescence, but for most CIOs, the maturity of the
RDBMS is reassuring. For the most part, RDBMS systems are stable and richly functional. In
comparison, most NoSQL alternatives are in pre-production versions with many key features yet
to be implemented. Living on the technological leading edge is an exciting prospect for many
developers, but enterprises should approach it with extreme caution.

● Support: Enterprises want the reassurance that if a key system fails, they will be able to get
timely and competent support. All RDBMS vendors go to great lengths to provide a high level of
enterprise support. In contrast, most NoSQL systems are open source projects, and although
there are usually one or more firms offering support for each NoSQL database, these companies
often are small start-ups without the global reach, support resources, or credibility of an Oracle,
Microsoft, or IBM.

● Analytics and business intelligence: NoSQL databases have evolved to meet the scaling
demands of modern Web 2.0 applications. Consequently, most of their feature set is oriented
toward the demands of these applications. However, data in an application has value to the
business that goes beyond the insert-read-update-delete cycle of a typical Web application.
Businesses mine information in corporate databases to improve their efficiency and
competitiveness, and business intelligence (BI) is a key IT issue for all medium to large
companies.

NoSQL databases offer few facilities for ad-hoc query and analysis. Even a simple query
can require significant programming expertise, and commonly used BI tools do not provide
connectivity to NoSQL. Some relief is provided by the emergence of solutions such as Hive or
Pig, which can provide easier access to data held in Hadoop clusters and perhaps, eventually,
other NoSQL databases. Quest Software has developed a product, Toad for Cloud Databases,
that can provide ad-hoc query capabilities to a variety of NoSQL databases.

● Administration: The design goals for NoSQL may be to provide a zero-admin solution, but the
current reality falls well short of that goal. NoSQL today requires a lot of skill to install and a lot
of effort to maintain.

● Expertise: There are literally millions of developers throughout the world, and in every
business segment, who are familiar with RDBMS concepts and programming. In contrast, almost
every NoSQL developer is in a learning mode. This situation will resolve itself naturally over
time, but for now, it's far easier to find experienced RDBMS programmers or administrators than
a NoSQL expert.
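
The analytics challenge above, that even a simple query may need programming, can be illustrated
with a small sketch. In SQL, an average per country is a single GROUP BY statement. Against a
store that only hands back raw documents, the same question has to be coded by hand. The order
documents and the function name average_amount_by_country are assumptions for this example.

```python
from collections import defaultdict

# Documents as they might come back from a schema-less store.
orders = [
    {"country": "DE", "amount": 40.0},
    {"country": "DE", "amount": 60.0},
    {"country": "US", "amount": 30.0},
    {"country": "US"},  # schema-less data may omit fields
]


def average_amount_by_country(docs):
    """Hand-written equivalent of a GROUP BY country with AVG(amount)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for doc in docs:
        # The application, not the database, must handle missing fields.
        if "country" not in doc or "amount" not in doc:
            continue
        totals[doc["country"]] += doc["amount"]
        counts[doc["country"]] += 1
    return {country: totals[country] / counts[country] for country in sorted(totals)}


print(average_amount_by_country(orders))
```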

Conclusion:

NoSQL databases are becoming an increasingly important part of database technology, and they can
offer real benefits when used appropriately. However, enterprises should proceed with caution, with
full awareness of the legitimate limitations and issues that are associated with these databases.
