How to stop insertion of Duplicate documents in a mongodb collection

#1
Suppose we have a `MongoDB` collection that contains three docs:

db.collection.find()

{ _id:'...', user: 'A', title: 'Physics', Bank: 'Bank_A' }
{ _id:'...', user: 'A', title: 'Chemistry', Bank: 'Bank_B' }
{ _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }

We have a doc,

doc = { user: 'B', title: 'Chemistry', Bank:'Bank_A' }

If we use

db.collection.insert(doc)

the duplicate doc is inserted into the database:

{ _id:'...', user: 'A', title: 'Physics', Bank: 'Bank_A' }
{ _id:'...', user: 'A', title: 'Chemistry', Bank: 'Bank_B' }
{ _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }
{ _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }

How can this duplicate be prevented? On which field should an index be created, or is there another approach?

#2
Don't use insert.

Use update with `upsert: true`. Update looks for a document that matches your query and modifies the fields you specify. If you set `upsert: true`, it inserts a new document when no document matches your query.


db.collection.update(
    <query>,
    <update>,
    {
        upsert: <boolean>,
        multi: <boolean>,
        writeConcern: <document>
    }
)



So, for your example, you could use something like this:

db.collection.update(doc, doc, {upsert:true})
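
For a batch of documents, the same upsert idea can be written as a list of `bulkWrite` operations. The following is only a sketch: the helper name `toUpsertOp` and the choice of `user`, `title` and `Bank` as the identifying fields are assumptions made for illustration, not part of the answer above. It uses `$setOnInsert` so that an already existing document is left unchanged.

```javascript
// Hypothetical helper (name and key fields are assumptions for illustration).
// Builds one bulkWrite operation that inserts `doc` only if no document
// with the same key fields exists yet.
function toUpsertOp(doc, keyFields) {
  // The filter contains only the fields that define "the same document".
  const filter = {};
  for (const field of keyFields) {
    filter[field] = doc[field];
  }
  return {
    updateOne: {
      filter,
      // $setOnInsert applies only when the upsert creates a new document;
      // a matching existing document is not modified.
      update: { $setOnInsert: doc },
      upsert: true,
    },
  };
}

const doc = { user: 'B', title: 'Chemistry', Bank: 'Bank_A' };
const ops = [toUpsertOp(doc, ['user', 'title', 'Bank'])];
// In the shell, the operations would be sent with: db.collection.bulkWrite(ops)
```

Note that an upsert on its own may still produce duplicates when two clients run it at the same moment, so it is usually combined with a unique index on the same fields.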


#3
This updates the answers above.

Please use `db.collection.updateOne()` instead of `db.collection.update()`, and `db.collection.createIndexes()` instead of `db.collection.ensureIndex()`.

Update:
The methods `update()` and `ensureIndex()` have been deprecated since mongodb 2.*; you can see more details in the `mongodb` package source, at the path `./mongodb/lib/collection.js`.
For `update()`, the recommended methods are `updateOne`, `updateMany`, or `bulkWrite`.
For `ensureIndex()`, the recommended method is `createIndexes`.

#4
You should use a unique compound index on the set of fields that uniquely identifies a document within your MongoDB collection. For example, if you decide that the combination of user, title and Bank is your unique key, you would issue the following command:

db.collection.createIndex( { user: 1, title: 1, Bank: 1 }, {unique:true} )

Please note that this must be done after you have removed previously stored duplicates; otherwise creating the unique index fails with a duplicate key error.
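
One way to locate the previously stored duplicates before creating the index is to group on the candidate key and keep only the groups with more than one member. This is a sketch under assumptions: the helper name `duplicateGroupsPipeline` is made up for illustration, and the key fields are the ones from the question.

```javascript
// Hypothetical helper (the name is an assumption for illustration).
// Builds an aggregation pipeline that lists every key combination that
// occurs more than once, together with the _ids of the documents involved.
function duplicateGroupsPipeline(keyFields) {
  const groupKey = {};
  for (const field of keyFields) {
    groupKey[field] = '$' + field; // e.g. { user: '$user' }
  }
  return [
    // Group documents that share the same values for all key fields.
    { $group: { _id: groupKey, ids: { $push: '$_id' }, count: { $sum: 1 } } },
    // Keep only the groups that contain duplicates.
    { $match: { count: { $gt: 1 } } },
  ];
}

const pipeline = duplicateGroupsPipeline(['user', 'title', 'Bank']);
// In the shell, it would be run with: db.collection.aggregate(pipeline)
```

Each result lists the `_id` values of one duplicate group, so all but one of them can be removed before the index is created.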


#5
Maybe this is a bit slower than other ways but it works too. It can be used inside a loop:

db.collection.replaceOne(query, data, {upsert: true})

The query may be something like:

{ _id: '5f915390950f276680720b57' }

*https://docs.mongodb.com/manual/reference/method/db.collection.replaceOne*

#6
What you are looking for is `AddToSet` instead of `Push` or `Insert`.
Using the `Upsert` flag doesn't seem to work for me.

i.e.: `var updateSet = Builders<T>.Update.AddToSet(collectionField, value);`

Note that `AddToSet` seems to do a value comparison, and that it applies to values inside an array field of a document.

#7
Setting your document's `_id` key to be the unique identifier and using `collection.insert_many(documents, ordered=False)` lets you bulk insert and prevent duplicates at the same time.

e.g.

documents = [{'_id':'hello'}, {'_id':'world'}, {'_id':'hello'}]

collection.insert_many(documents, ordered=False)

`ordered=False` is important. According to the documentation, if `ordered=True`, mongo stops attempting to insert as soon as it encounters a duplicate `_id`. If `ordered=False`, mongo attempts to insert all documents.
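
The difference between the two modes can be illustrated with a small in-memory model. This is not driver code: the function `simulateInsertMany` is hypothetical and only imitates the behaviour described above, so treat it as a sketch of the semantics rather than of any real API. The sample batch puts the duplicate in the middle so that the two modes give different results.

```javascript
// In-memory model only: no MongoDB driver is involved, and the function
// name simulateInsertMany is hypothetical.
function simulateInsertMany(existingIds, documents, ordered) {
  const seen = new Set(existingIds);
  const inserted = [];
  const rejected = [];
  for (const doc of documents) {
    if (seen.has(doc._id)) {
      rejected.push(doc._id);
      if (ordered) break; // ordered: stop at the first duplicate _id
      continue;           // unordered: skip it and try the remaining docs
    }
    seen.add(doc._id);
    inserted.push(doc._id);
  }
  return { inserted, rejected };
}

const documents = [{ _id: 'hello' }, { _id: 'hello' }, { _id: 'world' }];
const orderedResult = simulateInsertMany([], documents, true);
const unorderedResult = simulateInsertMany([], documents, false);
```

In the ordered run only the first `hello` is stored and `world` is never attempted, while the unordered run stores both `hello` and `world`. With PyMongo, a duplicate should still raise `BulkWriteError` when `ordered=False`, after the remaining inserts have been attempted, so the call is usually wrapped in a `try`/`except`.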