
Updating a property of all the nodes in neo4j db resulted in out of memory

Tags:

neo4j

cypher

My graph database has 3.5 million nodes and is 1.6 GB in size. I am trying to update a property on all the nodes via neo4j-shell with the following query:

Match (p:Person) set p.regId= toInt(p.regId) ;

Before doing this I added an index on Person for the property regId. During execution, the following error was thrown:

java.lang.OutOfMemoryError: GC overhead limit exceeded

ger asked Jul 01 '15 10:07



2 Answers

All changes performed by a single Cypher statement are executed in the same transaction. A transaction builds up in memory and gets persisted when you close it.

I guess your transaction grows too large here, which results in the memory error.

The usual strategy to deal with this is to use LIMIT in the Cypher statement so that each run touches a bounded number of nodes, return the number of changes made, and rerun the statement until the returned value is 0.

In your case:

Match (p:Person) 
where p.regId <> toInt(p.regId)
with p limit 10000
set p.regId= toInt(p.regId) 
return count(p)
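
The "run it until the return value is 0" loop can be sketched in Python. This is only an illustration, not part of the original answer: `run_in_batches` and `make_fake_batch` are hypothetical names, and `run_batch` stands for any callable that executes the batched statement above once and returns its `count(p)`. Here a stand-in counter takes the place of the database so the loop can be tried without Neo4j.

```python
from typing import Callable


def run_in_batches(run_batch: Callable[[], int], max_rounds: int = 10_000) -> int:
    """Call run_batch until it reports 0 changed nodes; return the total changed."""
    total = 0
    for _ in range(max_rounds):
        changed = run_batch()  # one bounded transaction per call
        if changed == 0:
            return total
        total += changed
    # Safety net: stop if the statement never reaches 0 (e.g. a WHERE clause
    # that keeps matching nodes that were already updated).
    raise RuntimeError("batch limit reached before the update finished")


def make_fake_batch(pending: int, limit: int = 10_000) -> Callable[[], int]:
    """Stand-in for the database: 'updates' at most `limit` pending nodes per call."""
    state = {"pending": pending}

    def run_batch() -> int:
        done = min(limit, state["pending"])
        state["pending"] -= done
        return done

    return run_batch


if __name__ == "__main__":
    print(run_in_batches(make_fake_batch(25_000)))
```

In real use, `run_batch` would wrap whatever client call sends the statement to the server and reads back the single count; the cap on rounds is a design choice to avoid looping forever if the filter never becomes empty.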
Stefan Armbruster answered Sep 28 '22 12:09


Here is a description of what's causing the error. Basically, you're short on memory, and garbage collection isn't finding you any extra free memory.

The Neo4j performance tuning guide has a lot of guidance on how to tweak memory.

The first thing to try is to give your JVM more memory. For the shell, set something like JAVA_OPTS=-Xmx1024m before starting it; this raises the maximum heap size the JVM can use.

FrobberOfBits answered Sep 28 '22 12:09