Is there a way to update data in MongoDB with some sort of script? I don't want to (can't) access the mongo shell, but I would like to run the same kind of update queries the shell offers. My data is a CSV file. I use Hadoop for the analysis of the data (extraction and transformation). I need to get the data back into MongoDB and update some attributes. As the reference for the update, I would like to use the generated id.
Can this task be done?
Any help would be much appreciated.
You want to read data from a CSV file and import it into MongoDB? You could generate a script file (JavaScript) and use the mongo shell to execute it, as described in "Scripting the shell".
Example session, test database, starting with an empty foo collection:
> db.foo.insert({name : "james", position : "forward"})
> db.foo.find()
{ "_id" : ObjectId("4f0c99f6cb435f1e7d7fedea"), "name" : "james", "position" : "forward" }
>
Then you generate your script, let's say mongo_scripting.js:
db.foo.insert({name : "wade", position : "guard"});
db.foo.update({name : "james"}, {$set : {position : "power forward"}}, false, true);
and run the script:
mongo localhost:27017/test mongo_scripting.js
Going back to the mongo shell:
> db.foo.find()
{ "_id" : ObjectId("4f0c99f6cb435f1e7d7fedea"), "name" : "james", "position" : "power forward" }
{ "_id" : ObjectId("4f0c9a64a4a4642bae6459ea"), "name" : "wade", "position" : "guard" }
>
You can see that one document was updated and a new one was inserted.
An alternative is to use a driver (Java, Python, etc.) to load the data.
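If the update statements have to be produced from the CSV, the generation step could look roughly like the sketch below. It assumes a two-column CSV of (_id, position) and that the ids are ObjectId hex strings, as in the session above; the helper name csv_to_mongo_script is made up for illustration.

```python
import csv
import io
import json

def csv_to_mongo_script(csv_text, collection="foo"):
    """Turn CSV rows of (_id, position) into mongo shell update statements."""
    lines = []
    for _id, position in csv.reader(io.StringIO(csv_text)):
        # json.dumps quotes and escapes each value so it is a valid
        # JavaScript string literal inside the generated script.
        query = "{_id : ObjectId(%s)}" % json.dumps(_id)
        change = "{$set : {position : %s}}" % json.dumps(position)
        lines.append("db.%s.update(%s, %s);" % (collection, query, change))
    return "\n".join(lines)

# One CSV row becomes one update statement.
script = csv_to_mongo_script("4f0c99f6cb435f1e7d7fedea,power forward\n")
print(script)
```

The printed text can be saved to a file such as mongo_scripting.js and executed with the mongo command shown above.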
If you can connect to MongoDB at all, then you can surely use the shell. Just run the shell on your local machine and tell it to connect to the remote Mongo instance, like:
mongo dbserver.mydomain.com/foo
You can also consider using mongoimport (http://www.mongodb.org/display/DOCS/Import+Export+Tools), although mongoimport will want to create or replace whole documents, not update fields within documents as you've asked.
It sounds to me like you'll need to write a script to process each line of the CSV and update documents in MongoDB. In Python, that script would go something like:
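import csv, sys
import pymongo

# MongoClient replaces pymongo.Connection, which was removed in PyMongo 3
foo_db = pymongo.MongoClient("dbserver.mydomain.com").foo

# Python 3: open the CSV in text mode with newline='' as the csv module recommends
with open(sys.argv[1], newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',', quotechar='"')
    for line in csv_reader:
        _id, field1, field2 = line
        # If the ids are ObjectIds, wrap the string: bson.ObjectId(_id)
        # update_one replaces update(..., safe=True); writes are acknowledged by default
        foo_db.my_collection.update_one(
            {"_id": _id},
            {"$set": {"field1": field1, "field2": field2}},
        )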