MongoDB Notes


Academic Year 2022-23

Course : SY BTech IT
Course Name : ADBMSL

(Assignment No 4)

Map-reduce
Map-reduce is a data processing paradigm for condensing large volumes of data into useful
aggregated results. For map-reduce operations, MongoDB provides the mapReduce database
command.
Map-reduce supports operations on sharded collections, both as an input and as an output.
Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to
support deployments with very large data sets and high throughput operations.
Consider the following map-reduce operation:

Prof. Pallavi M. Tekade


> db.orders1.insertMany([
    { cust_id: "A123", amount: 500, status: "A" },
    { cust_id: "A123", amount: 250, status: "A" },
    { cust_id: "B212", amount: 200, status: "A" },
    { cust_id: "A123", amount: 300, status: "D" }
  ])
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("59cde47a1ea6abd9870348f9"),
ObjectId("59cde47a1ea6abd9870348fa"),
ObjectId("59cde47a1ea6abd9870348fb"),
ObjectId("59cde47a1ea6abd9870348fc")
]
}

> db.orders1.find()
{ "_id" : ObjectId("59cde47a1ea6abd9870348f9"), "cust_id" : "A123", "amount" : 500, "status" : "A" }
{ "_id" : ObjectId("59cde47a1ea6abd9870348fa"), "cust_id" : "A123", "amount" : 250, "status" : "A" }
{ "_id" : ObjectId("59cde47a1ea6abd9870348fb"), "cust_id" : "B212", "amount" : 200, "status" : "A" }
{ "_id" : ObjectId("59cde47a1ea6abd9870348fc"), "cust_id" : "A123", "amount" : 300, "status" : "D" }

> db.orders1.mapReduce(
    function() { emit(this.cust_id, this.amount); },
    function(key, values) { return Array.sum(values); },
    { query: { status: "A" }, out: "orders_totals" }
  )
{
"result" : "orders_totals",
"timeMillis" : 419,
"counts" : {
"input" : 3,
"emit" : 3,
"reduce" : 1,
"output" : 2
},
"ok" : 1
}

> db.orders_totals.find()
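
The counts reported by mapReduce above can be traced by hand. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation: simulateMapReduce is a hypothetical helper, sum stands in for the shell helper Array.sum, and emit is passed to the map function as an argument instead of being a global as it is in the shell. The sketch assumes, as the MongoDB documentation states, that reduce is not called for a key that has only one value.

```javascript
// In-memory sketch of the orders1 map-reduce (no MongoDB server needed).
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stand-in for the mongo shell helper Array.sum, which plain JavaScript lacks.
const sum = (values) => values.reduce((a, b) => a + b, 0);

// Hypothetical helper: query -> map -> group by key -> reduce.
function simulateMapReduce(docs, mapFn, reduceFn, query) {
  const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
  const groups = new Map();

  // emit() collects each value under its key.
  const emit = (key, value) => {
    counts.emit += 1;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  for (const doc of docs) {
    // This sketch only supports simple equality conditions in the query.
    const matches = Object.entries(query).every(([k, v]) => doc[k] === v);
    if (!matches) continue;
    counts.input += 1;
    mapFn.call(doc, emit); // the document is bound to `this`, as in the shell
  }

  const results = [];
  for (const [key, values] of groups) {
    let value = values[0];
    if (values.length > 1) {
      // Assumption: reduce runs only for keys with more than one value.
      counts.reduce += 1;
      value = reduceFn(key, values);
    }
    results.push({ _id: key, value });
  }
  results.sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
  counts.output = results.length;
  return { results, counts };
}

const { results, counts } = simulateMapReduce(
  orders,
  function (emit) { emit(this.cust_id, this.amount); },
  (key, values) => sum(values),
  { status: "A" }
);

console.log(results);
console.log(counts);
```

Under these assumptions the sketch reports input 3, emit 3, reduce 1 and output 2, the same counts as the shell output above, with totals of 750 for A123 and 200 for B212. The order with status "D" is excluded by the query, and reduce runs once because only A123 has more than one emitted value.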

Another example
The collection c1 stores posts. Each document holds the text of a post, the user_name of its author, and the status of the post.

> db.c1.insertMany([
    { "post_text": "India is an awesome country", "user_name": "sachin", "status": "active" },
    { "post_text": "welcome to India", "user_name": "saurav", "status": "active" },
    { "post_text": "I live in India", "user_name": "yuvraj", "status": "active" },
    { "post_text": "India is great", "user_name": "gautam", "status": "active" }
  ])

Now, we will use a mapReduce function on our c1 collection to select all the active posts, group
them on the basis of user_name and then count the number of posts by each user using the
following code.
> db.c1.mapReduce(
    function() { emit(this.user_name, 1); },
    function(key, values) { return Array.sum(values); },
    { query: { status: "active" }, out: "post_total" }
  )

> db.post_total.find()

The complete shell session, with output:

> db.c1.insertMany([{"post_text": "India is an awesome country","user_name": "sachin","status":"active"},
... {"post_text": "welcome to India","user_name": "saurav","status":"active"},
... {"post_text": "I live in India","user_name": "yuvraj","status":"active"},
... {"post_text": "India is great","user_name": "gautam","status":"active"}])
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("59dcdd4a26deb533e8c00fb6"),
ObjectId("59dcdd4a26deb533e8c00fb7"),
ObjectId("59dcdd4a26deb533e8c00fb8"),
ObjectId("59dcdd4a26deb533e8c00fb9")
]
}
> db.c1.mapReduce(
... function() { emit(this.user_name, 1); },
... function(key, values) { return Array.sum(values); },
... {
... query: { status: "active" },
... out: "post_total"
... }
... )
{
"result" : "post_total",
"timeMillis" : 101,
"counts" : {
"input" : 4,
"emit" : 4,
"reduce" : 0,
"output" : 4
},
"ok" : 1
}
> db.post_total.find()
{ "_id" : "gautam", "value" : 1 }
{ "_id" : "sachin", "value" : 1 }
{ "_id" : "saurav", "value" : 1 }
{ "_id" : "yuvraj", "value" : 1 }
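
In this run "reduce" is 0 because every user has exactly one post, and MongoDB does not call reduce for a key with a single value. When a key has many values, MongoDB may call reduce more than once for that key and feed earlier reduce results back in as input values. The following plain JavaScript sketch suggests why the example emits 1 and sums the values instead of counting the array length. The names reduceInChunks, sumReducer and lengthReducer, the chunk size, and the six-post data are hypothetical, chosen only for illustration, and sum stands in for the shell helper Array.sum.

```javascript
// Stand-in for the mongo shell helper Array.sum.
const sum = (values) => values.reduce((a, b) => a + b, 0);

// Two candidate reducers for "count posts per user".
const sumReducer = (key, values) => sum(values);      // the form used in the example
const lengthReducer = (key, values) => values.length; // looks equivalent, but is not

// Hypothetical helper: reduce the values in chunks, then reduce the
// partial results again, imitating repeated reduce calls for one key.
function reduceInChunks(reduceFn, key, values, chunkSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    partials.push(reduceFn(key, values.slice(i, i + chunkSize)));
  }
  return partials.length === 1 ? partials[0] : reduceFn(key, partials);
}

// Hypothetical data: one user who emitted 1 for each of six posts.
const emitted = [1, 1, 1, 1, 1, 1];

const viaSum = reduceInChunks(sumReducer, "sachin", emitted, 3);
const viaLength = reduceInChunks(lengthReducer, "sachin", emitted, 3);

console.log(viaSum);    // 6: partial sums 3 and 3 add up to the true count
console.log(viaLength); // 2: the second pass counts the two partial results
```

Summing stays correct when partial results are reduced again, because the output of reduce has the same meaning as the emitted values. Counting the array length does not have this property.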
