Experiment 9

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

KJSCE/IT/SYBTech/SEM III/DMS/2024-25

KJSCE/IT/SYBTech/SEM III/DMS/2024-25

Batch: C-2 Roll No.: 16014223049 Experiment No.:9

Aim: To implement a NOSQL database using MongoDB and PostgreSQL.


___________________________________________________________________________

Resources needed: MongoDB, PostgreSQL


__________________________________________________________________________

Theory:

MongoDB:
MongoDB is a general-purpose document database designed for modern application
development and for the cloud. Its scale-out architecture allows you to meet the increasing
demand for your system by adding more nodes to share the load

MongoDB is having following key concepts,

● Documents: The Records in a Document Database


MongoDB stores data as JSON documents. The document data model maps naturally
to objects in application code, making it simple for developers to learn and use. The
fields in a JSON document can vary from document to document. Documents can be
nested to express hierarchical relationships and to store structures such as arrays. The
document model provides flexibility to work with complex, fast-changing, messy data
from numerous sources. It enables developers to quickly deliver new application
functionality. For faster access internally and to support more data types, MongoDB
converts documents into a format called Binary JSON or BSON. But from a developer
perspective, MongoDB is a JSON database.

● Collections: Grouping Documents


In MongoDB, a collection is a group of documents. Collection can be seen as tables,
but collections in MongoDB are far more flexible. Collections do not enforce a
schema, and documents in the same collection can have different fields. Each
collection is associated with one MongoDB database

● Replica Sets: For High Availability


In MongoDB, high availability is built right into the design. When a database is
created in MongoDB, the system automatically creates at least two more copies of the
data, referred to as a replica set. A replica set is a group of at least three MongoDB
instances that continuously replicate data between them, offering redundancy and
protection against downtime in the face of a system failure or planned maintenance.

● Sharding: For Scalability to Handle Massive Data Growth


A modern data platform needs to be able to handle very fast queries and massive
datasets using ever bigger clusters of small machines. Sharding is the term for
distributing data intelligently across multiple machines. MongoDB shards data at the
collection level, distributing documents in a collection across the shards in a cluster.
The result is a scale-out architecture that supports even the largest applications.

● Aggregation Pipelines: For Fast Data Flows


MongoDB offers a flexible framework for creating data processing pipelines called
aggregation pipelines. It features dozens of stages and over 150 operators and
expressions, enabling you to process, transform, and analyze data of any structure at
KJSCE/IT/SYBTech/SEM III/DMS/2024-25

scale. One recent addition is the Union stage, which flexibly aggregate results from
multiple collections.

Besides this MongoDB provides,


● variety of indexing strategies for speeding up the queries along with the Performance
Advisor, which analyses queries and suggests indexes that would improve query
performance
● Support for different programming languages which includes Node.js, C, C++,
C#, Go, Java, Perl, PHP, Python, Ruby, Rust, Scala, and Swift with actively
maintained library updated with newly added features.
● Various tools and utilities for monitoring MongoDB.
● Cloud services

PostgreSQL:
PostgreSQL is a powerful, open source object-relational database system that uses and
extends the SQL language combined with many features that safely store and scale the most
complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the
POSTGRES project at the University of California at Berkeley and has more than 30 years of
active development on the core platform.
PostgreSQL comes with many features aimed to help developers build applications,
administrators to protect data integrity and build fault-tolerant environments, and help you
manage your data no matter how big or small the dataset. In addition to being free and open
source, PostgreSQL is highly extensible. For example, you can define your own data types,
build out custom functions, and even write code from different programming languages
without recompiling your database.
Some of the features of PostgreSQL are as follows,
● Data Types
o Primitives: Integer, Numeric, String, Boolean
o Structured: Date/Time, Array, Range / Multirange, UUID
o Document: JSON/JSONB, XML, Key-value (Hstore)
o Geometry: Point, Line, Circle, Polygon
o Customizations: Composite, Custom Types
● Data Integrity
o UNIQUE, NOT NULL
o Primary Keys
o Foreign Keys
o Exclusion Constraints
o Explicit Locks, Advisory Locks
● Concurrency, Performance
o Indexing: B-tree, Multicolumn, Expressions, Partial
o Advanced Indexing: GiST, SP-Gist, KNN Gist, GIN, BRIN, Covering indexes,
Bloom filters
o Sophisticated query planner / optimizer, index-only scans, multicolumn
statistics
o Transactions, Nested Transactions (via savepoints)
o Multi-Version concurrency Control (MVCC)
o Parallelization of read queries and building B-tree indexes
o Table partitioning
o All transaction isolation levels defined in the SQL standard, including
Serializable
o Just-in-time (JIT) compilation of expressions
● Reliability, Disaster Recovery
o Write-ahead Logging (WAL)
o Replication: Asynchronous, Synchronous, Logical
KJSCE/IT/SYBTech/SEM III/DMS/2024-25

o Point-in-time-recovery (PITR), active standbys


o Tablespaces
● Security
● Extensibility
● Internationalisation, Text Search

PostgreSQL types for NOSQL:


JSON data types are for storing JSON (JavaScript Object Notation) data. Such data
can also be stored as text, but the JSON data types have the advantage of enforcing
that each stored value is valid according to the JSON rules. There are also assorted
JSON-specific functions and operators available for data stored in these data types.

PostgreSQL offers two types for storing JSON data: json and jsonb. To implement
efficient query mechanisms for these data types PostgreSQL also provides
the jsonpath data type

The json and jsonb data types accept almost identical sets of values as input. The
major practical difference is one of efficiency. The json data type stores an exact
copy of the input text, which processing functions must reparse on each execution;
while jsonb data is stored in a decomposed binary format that makes it slightly
slower to input due to added conversion overhead, but significantly faster to process,
since no reparsing is needed. jsonb also supports indexing, which can be a
significant advantage.

__________________________________________________________________________

Procedure:
1. Create a repository of documents containing six family member of yours(including
yourself), with minimum seven attributes each, in POSTGRES
2. Perform selection and projection queries with different criterias on the created relation
3. Export the relation to json document
4. Import the document to MongoDB
5. Perform Insert, Search, Update, and Delete operations on the collection using
i. MongoDB Compass
ii. MongoDB Shell
Use link
https://drive.google.com/drive/folders/12QEkVpHVRgWtgKdyxx-x1v9sznyk
KnEv

6. Demonstrate pipeline in MongoDB with minimum three (03) stages.

___________________________________________________________________________

Results: (Queries depicting the above said activity performed individually and snapshots of
the results (if any))
USE COURIER NEW FONT WITH SIZE = 10 FOR QUERY STATEMENTS
___________________________________________________________________________
KJSCE/IT/SYBTech/SEM III/DMS/2024-25

Outcomes:

Basic SQL Queries covered in Experiment 3 and 4 →


KJSCE/IT/SYBTech/SEM III/DMS/2024-25

Exporting the SQL as .json file on Desktop →

Importing the .json file to MongoDB →


KJSCE/IT/SYBTech/SEM III/DMS/2024-25

Insert One Query →

Search where occupation is Student →

Count how many C.A’s →

___________________________________________________________________________
KJSCE/IT/SYBTech/SEM III/DMS/2024-25

Questions:

Explain with query implementation on relation created by you


1. Any five jsonb specific operators in PostgreSQL
2. Any five collection methods in MongoDB
(Besides db.collection.insertOne, db.collection.deleteOne,
db.collection.updateOne, db.collection.find)

Answers:

Ans 1.

● Retrieves the JSON value by key. -> jsonb -> 'key'.


● Retrieves the text value by key. ->> jsonb ->> 'key'.
● Retrieves the text value by a path of keys. #>> jsonb #>> '{key1, key2}'.
● Checks if the right JSON is contained within the left JSON. @> jsonb @> jsonb.
● Checks if a key exists in the JSON object. ? jsonb ? 'key'.

Ans 2.
● db.collection.aggregate(): Performs aggregation operations on documents using a pipeline.
● db.collection.findOne(): Returns a single document that matches the query criteria.
● db.collection.updateMany(): Updates multiple documents matching the query criteria.
● db.collection.deleteMany(): Deletes multiple documents matching the query.
● db.collection.distinct(): Returns an array of distinct values for a specified field across
documents.

Conclusion: (Conclusion to be based on outcomes achieved)

___________________________________________________________________________
Grade: AA / AB / BB / BC / CC / CD /DD

Signature of faculty in-charge with date


___________________________________________________________________________
References:

1. https://www.mongodb.com/basics
2. https://www.postgresql.org/about/
3. https://www.postgresql.org/docs/13/datatype-json.html

You might also like