I'm using Neo4j for the first time, with Neography for Ruby. My data is in CSV files. I can successfully populate the database from my main file, i.e. create all the nodes. For each CSV file (here, user.csv), I do the following:
def create_person(name, id)
  Neography::Node.create("name" => name, "id" => id)
end
CSV.foreach('user.csv', :headers => true) do |row|
  id = row[0].to_i()
  name = row[1]
  $persons[id] = create_person(name, id)
end
Likewise for the other files. There are two issues. First, small files load fine, but with somewhat larger files (I'm dealing with four 1 MB files) I get:
SocketError: Too many open files (http://localhost:7474)
The other issue is that I don't want to populate the database every time I run this Ruby file. I want to load the data once and then leave the database untouched, running only queries against it afterwards. Can anyone tell me how to populate it and save it, and then how to load it whenever I want to use it? Thank you.
Create a @neo client:
@neo = Neography::Rest.new
Create a queue:
@queue = []
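Use the batch API for data loading:
def create_person(name, id)
  # Queue the node instead of sending one HTTP request per row.
  @queue << [:create_node, {"name" => name, "id" => id}]
  if @queue.size >= 500
    # Send the queued operations in a single request, then reset the queue.
    batch_results = @neo.batch *@queue
    @queue = []
    batch_results.each do |result|
      # Key each result by the node ID at the end of its "self" URL.
      node_id = result["body"]["self"].split('/').last
      $persons[node_id] = result
    end
  end
end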
Run through your CSV file:
CSV.foreach('user.csv', :headers => true) do |row|
  create_person(row[1], row[0].to_i)
end
Get the leftovers:
batch_results = @neo.batch *@queue
batch_results.each do |result|
  # Key each result by the node ID at the end of its "self" URL.
  node_id = result["body"]["self"].split('/').last
  $persons[node_id] = result
end
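The queue-and-flush steps above can also be gathered into one small object. The following is only a sketch: BatchLoader and FakeClient are hypothetical names, not part of Neography, and the only thing assumed about the real client is that it responds to batch(*operations) as @neo does above. The fake client stands in for the server so the sketch runs without Neo4j, and the batch size is set to 2 just to make the flushing visible.

```ruby
require 'csv'

# Hypothetical helper (not part of Neography): buffers batch operations
# and sends them in chunks. `client` is assumed to respond to
# `batch(*operations)`, as the @neo client does in the answer above.
class BatchLoader
  attr_reader :results

  def initialize(client, batch_size = 500)
    @client = client
    @batch_size = batch_size
    @queue = []
    @results = []
  end

  # Queue one node; flush once the queue reaches the batch size.
  def add_node(properties)
    @queue << [:create_node, properties]
    flush if @queue.size >= @batch_size
  end

  # Send everything queued in a single batch call, then clear the queue.
  def flush
    return if @queue.empty?
    @results.concat(@client.batch(*@queue))
    @queue = []
  end
end

# Stand-in for the real client so the sketch runs without a Neo4j server.
# The shape of the returned hashes is made up for illustration.
class FakeClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def batch(*operations)
    @calls << operations.size
    operations.map { |_op, props| { "body" => { "data" => props } } }
  end
end

csv_text = "id,name\n1,Alice\n2,Bob\n3,Carol\n"
client = FakeClient.new
loader = BatchLoader.new(client, 2)

CSV.parse(csv_text, headers: true) do |row|
  loader.add_node("name" => row["name"], "id" => row["id"].to_i)
end
loader.flush # pick up the leftovers
```

With the real library, the idea would be to pass the @neo client in place of FakeClient and keep the default batch size, so that each HTTP request carries many nodes instead of one.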
An example of data loading via the rest api can be seen here => https://github.com/maxdemarzi/neo_crunch/blob/master/neo_crunch.rb
An example of using a queue for writes can be seen here => http://maxdemarzi.com/2013/09/05/scaling-writes/
It sounds as if you are running these requests in parallel, or are not reusing HTTP connections.
Did you try creating a single client with @neo = Neography::Rest.new
and then calling @neo.create_node({...}) on it?
I think that one reuses the HTTP connections.
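A minimal sketch of that suggestion, assuming the client is created once and handed to the loading loop. The names load_people and CountingClient are hypothetical; the stand-in client only counts calls so the sketch runs without a server, and whether the real client reuses connections is the guess made above, not something this code demonstrates.

```ruby
# Hypothetical loader: the client is created once by the caller and reused
# for every row, rather than creating something new per node.
# `client` is assumed to respond to `create_node(properties)`.
def load_people(client, rows)
  rows.each_with_object({}) do |(id, name), people|
    people[id] = client.create_node("name" => name, "id" => id)
  end
end

# Stand-in for the real client; it records how many requests it received.
class CountingClient
  attr_reader :requests

  def initialize
    @requests = 0
  end

  def create_node(properties)
    @requests += 1
    { "data" => properties }
  end
end

rows = [[1, "Alice"], [2, "Bob"]]
one_client = CountingClient.new
people = load_people(one_client, rows)
```

In the real script the call would presumably look like load_people(Neography::Rest.new, rows), with the rows read sequentially from the CSV file.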