I'm trying to import LinkedMDB (6.1m triples) into my local version of jena-fuseki at startup:
/path/to/fuseki-server --file=/path/to/linkedmdb.nt /ds
and that runs for a minute, then dies with the following error:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.hp.hpl.jena.graph.Node$3.construct(Node.java:318)
at com.hp.hpl.jena.graph.Node.create(Node.java:344)
at com.hp.hpl.jena.graph.NodeFactory.createURI(NodeFactory.java:48)
at org.apache.jena.riot.system.RiotLib.createIRIorBNode(RiotLib.java:80)
at org.apache.jena.riot.system.ParserProfileBase.createURI(ParserProfileBase.java:107)
at org.apache.jena.riot.system.ParserProfileBase.create(ParserProfileBase.java:156)
at org.apache.jena.riot.lang.LangNTriples.tokenAsNode(LangNTriples.java:97)
at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:90)
at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTFactoryImpl$1.read(RDFParserRegistry.java:142)
at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:818)
at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:679)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:211)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:104)
at org.apache.jena.fuseki.FusekiCmd.processModulesAndArgs(FusekiCmd.java:251)
at arq.cmdline.CmdArgModule.process(CmdArgModule.java:51)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:100)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
at org.apache.jena.fuseki.FusekiCmd.main(FusekiCmd.java:141)
Is there a way that I can bump up the memory limit or import the data in a less intensive way?
For comparison's sake, when I used a 1 million triple source file, it imported in less than 10 seconds.
From the root of the Eclipse folder, open eclipse.ini and change the default maximum heap size on the last line from -Xmx256m to -Xmx1024m. NOTE: If there is a lot of memory available on the machine, you can also try using -Xmx2048m as the maximum heap size.
The "GC overhead limit exceeded" error is thrown by the JVM when it encounters a problem with resource utilization. More specifically, it occurs when the JVM has spent too much time performing garbage collection and was able to reclaim only very little heap space.
Increase the heap memory: java -Xmx2048M -jar fuseki-sys.jar ......
Or open the fuseki-server script with an editor; you'll find the line JVM_ARGS=${JVM_ARGS:--Xmx1200M}
Modify it to JVM_ARGS=${JVM_ARGS:--Xmx2048M}
Set JVM_ARGS when using the fuseki-server script.
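The JVM_ARGS=${JVM_ARGS:--Xmx1200M} line quoted above uses the shell's default-value expansion, so a value already set in the environment should take precedence without editing the script. A minimal sketch of that behaviour (the resolve_jvm_args function name and the -Xmx4G value are illustrative assumptions, not part of Fuseki):

```shell
# Mirrors the expansion used in the fuseki-server script:
# keep JVM_ARGS if it is set and non-empty, otherwise fall back to the default.
resolve_jvm_args() {
  printf '%s\n' "${JVM_ARGS:--Xmx1200M}"
}

# No JVM_ARGS in the environment: the script default applies.
unset JVM_ARGS
default_args=$(resolve_jvm_args)

# JVM_ARGS supplied by the caller: the caller's value wins.
override_args=$(JVM_ARGS=-Xmx4G resolve_jvm_args)

printf 'default:  %s\n' "$default_args"
printf 'override: %s\n' "$override_args"
```

Assuming your copy of the script contains that line, an invocation such as `JVM_ARGS=-Xmx4G /path/to/fuseki-server --file=/path/to/linkedmdb.nt /ds` would start the server with a 4 GB maximum heap. Pick a size that fits the memory available on your machine.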
Also note that --file=... reads the whole file into memory. Your data may be too big to handle that way. If so, load it into TDB and use a TDB database with Fuseki.
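As a rough sketch of the TDB route: build the database on disk with the tdbloader command that ships with Jena, then point Fuseki at that directory with --loc instead of --file. The /path/to/tdb directory is a hypothetical location, and the sketch assumes tdbloader is on your PATH; adjust both to your installation.

```shell
# Hypothetical directory for the on-disk TDB database.
DB_DIR=/path/to/tdb

# Bulk-load the N-Triples file into TDB (tdbloader is part of the Jena distribution).
tdbloader --loc="$DB_DIR" /path/to/linkedmdb.nt

# Serve the persistent dataset instead of reading the file into memory at startup.
/path/to/fuseki-server --loc="$DB_DIR" /ds
```

Because the data lives on disk, the load only has to be done once, and later restarts of the server can reuse the same directory.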