MongoDB Notes
Course : SY BTech IT
Course Name : ADBMSL
(Assignment No 4)
Map-reduce
Map-reduce is a data processing paradigm for condensing large volumes of data into useful
aggregated results. For map-reduce operations, MongoDB provides the mapReduce database
command.
Map-reduce supports operations on sharded collections, both as an input and as an output.
Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to
support deployments with very large data sets and high throughput operations.
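The interaction between map-reduce and sharding can be sketched in plain JavaScript, without a MongoDB connection. Two arrays stand in for shards (the split is hypothetical), each shard is map-reduced on its own, and the partial results are reduced again. This is a simplification, not MongoDB's actual merge logic, but it shows why the reduce function may be called more than once for the same key and so must return a value of the same type as the values it receives.

```javascript
// Plain JavaScript sketch (no MongoDB connection).
// The two arrays are hypothetical shards holding parts of an orders collection.
const shardA = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
];
const shardB = [
  { cust_id: "A123", amount: 250, status: "A" },
];

// reduce: sums the values collected for one key.
function reduceFn(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Runs map (cust_id -> amount) and reduce over one shard's documents.
function mapReduceShard(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.cust_id)) groups.set(doc.cust_id, []);
    groups.get(doc.cust_id).push(doc.amount);
  }
  const partial = new Map();
  for (const [key, values] of groups) partial.set(key, reduceFn(key, values));
  return partial;
}

// Merge step: reduce is applied again, this time to the per-shard partial sums.
function mergePartials(partials) {
  const groups = new Map();
  for (const partial of partials) {
    for (const [key, value] of partial) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  const merged = {};
  for (const [key, values] of groups) merged[key] = reduceFn(key, values);
  return merged;
}

const totals = mergePartials([mapReduceShard(shardA), mapReduceShard(shardB)]);
console.log(totals); // { A123: 750, B212: 200 }
```

Because the sum of partial sums equals the sum of all values, the result does not depend on how the documents are split across shards.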
Consider a map-reduce operation on the following orders collection:
> db.orders.find()
{ "_id" : ObjectId("59cde47a1ea6abd9870348f9"), "cust_id" : "A123", "amount" : 500, "status" : "A" }
{ "_id" : ObjectId("59cde47a1ea6abd9870348fa"), "cust_id" : "A123", "amount" : 250, "status" : "A" }
{ "_id" : ObjectId("59cde47a1ea6abd9870348fb"), "cust_id" : "B212", "amount" : 200, "status" : "A" }
{ "_id" : ObjectId("59cde47a1ea6abd9870348fc"), "cust_id" : "A123", "amount" : 300, "status" : "D" }
> db.orders_totals.find()
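The mapReduce call that fills orders_totals is not shown in these notes. As a minimal sketch of what it plausibly computes, the plain JavaScript below (no MongoDB connection) assumes a map step that emits (cust_id, amount), a query that keeps only status "A", and a reduce step that sums the amounts. All three are assumptions, as is the helper name inMemoryMapReduce. The result uses the { _id, value } document shape that mapReduce writes to its output collection.

```javascript
// The four documents returned by db.orders.find(), without their _id fields.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Hypothetical in-memory stand-in for mapReduce: filter, map (emit), group, reduce.
function inMemoryMapReduce(docs, query, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs.filter(query)) mapFn(doc, emit);
  // One output document per key, shaped like a mapReduce output document.
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: reduceFn(key, values),
  }));
}

const orderTotals = inMemoryMapReduce(
  orders,
  (doc) => doc.status === "A",                            // assumed query
  (doc, emit) => emit(doc.cust_id, doc.amount),           // assumed map
  (key, values) => values.reduce((sum, v) => sum + v, 0)  // assumed reduce
);
console.log(orderTotals);
// [ { _id: 'A123', value: 750 }, { _id: 'B212', value: 200 } ]
```

The order with status "D" is filtered out before the map step, so A123 totals 750 rather than 1050.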
Next, we run mapReduce on the c1 collection (populated by the insertMany call in the session below) to select all active posts, group them by user_name, and count the number of posts by each user:
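> db.c1.mapReduce(
    function() { emit(this.user_name, 1); },
    function(key, values) { return Array.sum(values); },
    { query: { status: "active" }, out: "post_total" }
  )
> db.post_total.find()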
> db.c1.insertMany([{"post_text": "India is an awesome country","user_name": "sachin","status":"active"},
... {"post_text": "welcome to India","user_name": "saurav","status":"active"},
... {"post_text": "I live in India","user_name": "yuvraj","status":"active"},
... {"post_text": "India is great","user_name": "gautam","status":"active"}])
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("59dcdd4a26deb533e8c00fb6"),
ObjectId("59dcdd4a26deb533e8c00fb7"),
ObjectId("59dcdd4a26deb533e8c00fb8"),
ObjectId("59dcdd4a26deb533e8c00fb9")
]
}
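For these four documents every user_name occurs exactly once. MongoDB does not call the reduce function for a key that has only a single emitted value, so each count should simply be the emitted 1. The plain JavaScript sketch below (no MongoDB connection; the reduceCalls counter is an illustrative addition) mimics that behaviour for the c1 data.

```javascript
// The four posts inserted into c1 above.
const posts = [
  { post_text: "India is an awesome country", user_name: "sachin", status: "active" },
  { post_text: "welcome to India", user_name: "saurav", status: "active" },
  { post_text: "I live in India", user_name: "yuvraj", status: "active" },
  { post_text: "India is great", user_name: "gautam", status: "active" },
];

// Counts how often reduce runs (illustrative only).
let reduceCalls = 0;
function reduceFn(key, values) {
  reduceCalls += 1;
  return values.reduce((sum, v) => sum + v, 0);
}

// query + map: emit (user_name, 1) for every post whose status is "active".
const emitted = new Map();
for (const post of posts) {
  if (post.status !== "active") continue;
  if (!emitted.has(post.user_name)) emitted.set(post.user_name, []);
  emitted.get(post.user_name).push(1);
}

// reduce is skipped for keys with a single value; the value is used as-is.
const postTotal = [];
for (const [key, values] of emitted) {
  const value = values.length === 1 ? values[0] : reduceFn(key, values);
  postTotal.push({ _id: key, value });
}

console.log(postTotal);   // four documents, each with value 1
console.log(reduceCalls); // 0
```

If a user had two or more active posts, reduceFn would run for that key and return the sum of the emitted 1s.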
> db.c1.mapReduce(
... function() { emit(this.user_name, 1); },
... function(key, values) { return Array.sum(values); },
... {
...   query: { status: "active" },
...   out: "post_total"
... }
... )
{
"result" : "post_total",