Credit goes to www.tutorialspoint.com

Find Duplicate Records in MongoDB



You can use the aggregation framework to find duplicate records in MongoDB. To understand the concept, let us create a collection with some documents, a few of which share the same StudentFirstName value. The queries to insert the documents are as follows −

> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"John"});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5c8a330293b406bd3df60e01")
}
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"John"});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5c8a330493b406bd3df60e02")
}
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Carol"});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5c8a330c93b406bd3df60e03")
}
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Sam"});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5c8a331093b406bd3df60e04")
}
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Carol"});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5c8a331593b406bd3df60e05")
}
> db.findDuplicateRecordsDemo.insertOne({"StudentFirstName":"Mike"});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5c8a331e93b406bd3df60e06")
}

Display all documents from the collection with the help of the find() method. The query is as follows −

> db.findDuplicateRecordsDemo.find();

The following is the output −

{ "_id" : ObjectId("5c8a330293b406bd3df60e01"), "StudentFirstName" : "John" }
{ "_id" : ObjectId("5c8a330493b406bd3df60e02"), "StudentFirstName" : "John" }
{ "_id" : ObjectId("5c8a330c93b406bd3df60e03"), "StudentFirstName" : "Carol" }
{ "_id" : ObjectId("5c8a331093b406bd3df60e04"), "StudentFirstName" : "Sam" }
{ "_id" : ObjectId("5c8a331593b406bd3df60e05"), "StudentFirstName" : "Carol" }
{ "_id" : ObjectId("5c8a331e93b406bd3df60e06"), "StudentFirstName" : "Mike" }
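
Two of the names above, John and Carol, appear twice. The idea behind duplicate detection is to group the documents by a field value, count each group, and keep the groups with a count above 1. As a minimal sketch of that idea in plain JavaScript (not MongoDB code; `findDuplicateValues` is a hypothetical helper name, and the sample array copies the documents above without their `_id` values):

```javascript
// Sample documents copied from the collection above (the _id values are omitted).
const docs = [
  { StudentFirstName: "John" },
  { StudentFirstName: "John" },
  { StudentFirstName: "Carol" },
  { StudentFirstName: "Sam" },
  { StudentFirstName: "Carol" },
  { StudentFirstName: "Mike" },
];

// Hypothetical helper for illustration only; it is not a MongoDB API.
function findDuplicateValues(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    const value = doc[field];
    // Skip documents where the field is missing or null.
    if (value === undefined || value === null) continue;
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  // Keep only the values that were seen more than once.
  return [...counts].filter(([, n]) => n > 1).map(([value]) => value);
}

console.log(findDuplicateValues(docs, "StudentFirstName")); // [ 'John', 'Carol' ]
```

This runs in memory on the client, so it is only useful for understanding the logic. For a real collection the grouping should be done on the server with an aggregation.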

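Here is the query to find duplicate records in MongoDB. The $group stage counts the documents for each StudentFirstName value, the $match stage keeps only the non-null names with a count greater than 1, and the $project stage renames the group key back to StudentFirstName −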

> db.findDuplicateRecordsDemo.aggregate([
...    {"$group" : { "_id": "$StudentFirstName", "count": { "$sum": 1 } } },
...    {"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } },
...    {"$project": {"StudentFirstName" : "$_id", "_id" : 0} }
... ]);

The following is the output, listing only the names that occur more than once −

{ "StudentFirstName" : "Carol" }
{ "StudentFirstName" : "John" }
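
The output above names the duplicated values but not the documents that carry them. As a hedged sketch, the pipeline could be extended to also return the count and the `_id` of every document in each group, using the `$push` accumulator. The helper name `buildDuplicatePipeline` and the output field names `count` and `ids` are assumptions made for illustration:

```javascript
// Hypothetical helper: builds a duplicate-finding pipeline for any field name.
function buildDuplicatePipeline(field) {
  return [
    // Group by the field's value, count the documents, and collect their _id values.
    { $group: { _id: "$" + field, count: { $sum: 1 }, ids: { $push: "$_id" } } },
    // Keep groups with a non-null key that contain more than one document.
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Rename the group key back to the field name and keep count and ids.
    { $project: { _id: 0, [field]: "$_id", count: 1, ids: 1 } },
  ];
}

const pipeline = buildDuplicatePipeline("StudentFirstName");
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell, the pipeline would be passed to aggregate():
// db.findDuplicateRecordsDemo.aggregate(pipeline);
```

On the sample collection this should return one document each for John and Carol, with a count of 2 and the two matching `_id` values in `ids`. Those values can then be used to inspect or remove the extra documents.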
Updated on: 2019-07-30T22:30:25+05:30
