Fuseki GC overhead limit exceeded during data import

I'm trying to import LinkedMDB (6.1m triples) into my local version of jena-fuseki at startup:

/path/to/fuseki-server --file=/path/to/linkedmdb.nt /ds

and that runs for a minute, then dies with the following error:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at com.hp.hpl.jena.graph.Node$3.construct(Node.java:318)
    at com.hp.hpl.jena.graph.Node.create(Node.java:344)
    at com.hp.hpl.jena.graph.NodeFactory.createURI(NodeFactory.java:48)
    at org.apache.jena.riot.system.RiotLib.createIRIorBNode(RiotLib.java:80)
    at org.apache.jena.riot.system.ParserProfileBase.createURI(ParserProfileBase.java:107)
    at org.apache.jena.riot.system.ParserProfileBase.create(ParserProfileBase.java:156)
    at org.apache.jena.riot.lang.LangNTriples.tokenAsNode(LangNTriples.java:97)
    at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:90)
    at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
    at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTFactoryImpl$1.read(RDFParserRegistry.java:142)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:818)
    at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:679)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:211)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:104)
    at org.apache.jena.fuseki.FusekiCmd.processModulesAndArgs(FusekiCmd.java:251)
    at arq.cmdline.CmdArgModule.process(CmdArgModule.java:51)
    at arq.cmdline.CmdMain.mainMethod(CmdMain.java:100)
    at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
    at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
    at org.apache.jena.fuseki.FusekiCmd.main(FusekiCmd.java:141)

Is there a way to raise the memory limit, or to import the data in a less memory-intensive way?

For comparison's sake, when I used a source file with 1 million triples, it imported in less than 10 seconds.

Kristian asked Jan 17 '14 22:01



2 Answers

Increase the JVM heap size, for example: java -Xmx2048M -jar fuseki-sys.jar ......

Alternatively, open the fuseki-server script in an editor. You'll find the line JVM_ARGS=${JVM_ARGS:--Xmx1200M}; change it to JVM_ARGS=${JVM_ARGS:--Xmx2048M}.
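
The quoted line uses the shell's default-value expansion, so the 1200M figure only applies when JVM_ARGS is unset or empty. A minimal sketch of that behaviour follows; the function name effective_jvm_args is made up for illustration and is not part of the Fuseki script:

```shell
# Same expansion as the line quoted from the fuseki-server script:
# use $JVM_ARGS if it is set and non-empty, otherwise fall back to -Xmx1200M.
# (effective_jvm_args is a hypothetical helper, only here to show the expansion.)
effective_jvm_args() {
    echo "${JVM_ARGS:--Xmx1200M}"
}

unset JVM_ARGS
effective_jvm_args        # falls back to the script default

JVM_ARGS=-Xmx2048M
effective_jvm_args        # the caller's value wins

# So the heap can also be raised per invocation, without editing the script:
# JVM_ARGS=-Xmx2048M /path/to/fuseki-server --file=/path/to/linkedmdb.nt /ds
```

Setting the variable on the command line keeps the script untouched, which is handy if the script gets replaced on upgrade.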

Yazan Jaber answered Sep 28 '22 08:09


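Set the JVM_ARGS environment variable when using the fuseki-server script; the script only falls back to its built-in heap setting when JVM_ARGS is not set.

Also note that --file=... reads the whole file into memory, and this file may simply be too big to handle that way. If so, load the data into a TDB database and run Fuseki against that database instead.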
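
A rough sketch of the TDB route, assuming Jena's tdbloader command is on the PATH and using a made-up database directory /path/to/tdb-db. The function names and the TDBLOADER and FUSEKI_SERVER variables are illustrative only; the commands underneath are tdbloader --loc=DIR FILE and fuseki-server --loc=DIR /ds:

```shell
# Hypothetical locations; adjust to your installation.
TDBLOADER=${TDBLOADER:-tdbloader}
FUSEKI_SERVER=${FUSEKI_SERVER:-/path/to/fuseki-server}

# Bulk-load an N-Triples file into an on-disk TDB database.
# $1 = TDB directory, $2 = data file
load_tdb() {
    mkdir -p "$1" && "$TDBLOADER" --loc="$1" "$2"
}

# Serve an existing TDB database under the given dataset name.
# $1 = TDB directory, $2 = dataset name, e.g. /ds
serve_tdb() {
    "$FUSEKI_SERVER" --loc="$1" "$2"
}

# Example (uncomment to run):
# load_tdb /path/to/tdb-db /path/to/linkedmdb.nt && serve_tdb /path/to/tdb-db /ds
```

The load is a one-off step; after that, the server starts against the on-disk database rather than parsing the whole file into memory on every startup.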

AndyS answered Sep 28 '22 09:09