
Handling big data sets (neo4j, mongo db, hadoop)

I'm looking for best practices for handling this data. Here is what I have so far: 1,000,000 nodes of type "A". Every "A" node can be connected to 1-1000 nodes of type "B" and 1-10 nodes of type "C".

I've written a RESTful service (Java, Jersey) to import data into a Neo4j graph. After importing the "A" nodes (only the nodes, with ids, no further data) I noticed that the Neo4j db has grown to ~2.4GB.

Is it a good idea to store additional fields (name, description, ...) in Neo4j? Or should I set up MongoDB/Hadoop and use a key/value combination for data access?
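For context, a minimal sketch of how such a batched import could look with Neo4j's embedded 2.x Java API (the class name, store path and the "externalId" property are placeholders, not the actual Jersey service code):

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;

    public class ANodeImport {

        // Insert "A" nodes in large transactions rather than one transaction
        // per node/REST call; many tiny transactions slow the import down and
        // bloat the transaction logs.
        static void importANodes(GraphDatabaseService db, long count) {
            final int batchSize = 10_000;
            for (long start = 0; start < count; start += batchSize) {
                try (Transaction tx = db.beginTx()) {
                    long end = Math.min(start + batchSize, count);
                    for (long i = start; i < end; i++) {
                        Node a = db.createNode();
                        a.setProperty("externalId", i); // placeholder property
                    }
                    tx.success();
                }
            }
        }

        public static void main(String[] args) {
            GraphDatabaseService db =
                    new GraphDatabaseFactory().newEmbeddedDatabase("data/graph.db");
            try {
                importANodes(db, 1_000_000);
            } finally {
                db.shutdown();
            }
        }
    }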

asked Nov 04 '22 by Alebon

1 Answer

Did you delete a lot of nodes during the insert? Normally a node takes 9 bytes on disk, so your 1M nodes should take just 9M bytes. You have to enable id reuse to reclaim that space aggressively.

Could you please list the content of your data directory with the file sizes?

In general it is not a problem to put your other fields in Neo4j, as long as they are not large blob fields.
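As a rough illustration of that split (a sketch only; the property names and the "blobKey" reference are hypothetical): short fields go straight onto the node, while a large blob would stay in an external key/value store (MongoDB, HDFS, ...) with only its key kept in the graph.

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;

    public class PropertyExample {

        // Short scalar fields are fine as node properties.
        static void addFields(GraphDatabaseService db, long nodeId,
                              String name, String description) {
            try (Transaction tx = db.beginTx()) {
                Node a = db.getNodeById(nodeId);
                a.setProperty("name", name);
                a.setProperty("description", description);
                tx.success();
            }
        }

        // For a large blob, store only a reference key in the graph and keep
        // the payload itself in the external key/value store.
        static void addBlobReference(GraphDatabaseService db, long nodeId,
                                     String externalBlobKey) {
            try (Transaction tx = db.beginTx()) {
                db.getNodeById(nodeId).setProperty("blobKey", externalBlobKey);
                tx.success();
            }
        }
    }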

How did you create the db?

answered Nov 09 '22 by Michael Hunger