I am using IKVM to port some Java libraries into my C# project. The library API (StanfordNLP) requires that a model file be loaded to initialize the statistical models used by the NLP functions. Loading the file from the file system has worked well for weeks, but I would now like to add the file as an embedded resource in a DLL rather than retrieving it from the file system.
The problem is that the Java API is not finding the .NET embedded resource.
Here is a snippet of code that works when retrieving a file from the file system:
public class SNLPModel
{
    public LexicalizedParser LP;
    public SNLPModel()
    {
        // Using a relative file path in the target build directory
        LP = LexicalizedParser.loadModel("models-stanford\\englishPCFG.ser.gz");
    }
}
But when I make the "englishPCFG.ser.gz" file an embedded resource in Visual Studio (VS2012) and change the code to match:
public class SNLPModel
{
    public LexicalizedParser LP;
    public SNLPModel()
    {
        // Using this line of code to verify that the file is being loaded as
        // an embedded resource. Running in debug, I have verified that it is, and
        // noted its complete name.
        string[] s = System.Reflection.Assembly.GetExecutingAssembly()
            .GetManifestResourceNames();
        java.io.InputStream modelFile = java.lang.ClassLoader
            .getSystemResourceAsStream
            ("FeatureExtraction.StanfordNLP_Models.englishPCFG.ser.gz");
        java.io.ObjectInputStream x = new java.io.ObjectInputStream(modelFile);
        LP = LexicalizedParser.loadModel(x);
    }
}
the InputStream object, modelFile, is always null. I have tried various forms of the resource string, replacing the first two dots (".") with a forward slash ("/"), a backslash ("\"), and a double backslash ("\\"). I am beginning to suspect that java.io cannot access the .NET resource. It would not be surprising if the Java API did not recognize .NET resources, but I thought IKVM might provide a bridge. I have seen a single reference to something called IKVM.Internals.VirtualFileSystem (http://old.nabble.com/Manual-by-name-embedded-resource-lookup--td31162421.html), but I haven't found any IKVM DLLs that actually contain the class.
Any help would be much appreciated. I am using:
C#, .NET 4.5
Visual Studio 2012
Latest Stanford NLP Java libraries
IKVM 7.0.4335.0
There is no automatic support for this in IKVM, but it is really easy to do it yourself:
var asm = Assembly.GetExecutingAssembly();
var stream = asm.GetManifestResourceStream("FeatureExtraction.StanfordNLP_Models.englishPCFG.ser.gz");
var inp = new ikvm.io.InputStreamWrapper(stream);
var x = new java.io.ObjectInputStream(inp);
The hint to the right solution is in the extension of the file that you embed.
If it is a plain serialized Java object (like english-left3words-distsim.tagger), you can load it like this:
let model = "../english-left3words-distsim.tagger"
use fs = new FileStream(model, FileMode.Open)
use isw = new ikvm.io.InputStreamWrapper(fs)
let tagger = edu.stanford.nlp.tagger.maxent.MaxentTagger(isw)
But if your model has the extension .gz, the file is gzipped, and you have to wrap the input stream in java.util.zip.GZIPInputStream before deserialization:
let model = "../englishRNN.ser.gz"
use fs = new FileStream(model, FileMode.Open)
use isw = new ikvm.io.InputStreamWrapper(fs)
use ois =
    if model.EndsWith(".gz") then
        let gzs = new java.util.zip.GZIPInputStream(isw)
        new java.io.ObjectInputStream(gzs)
    else
        new java.io.ObjectInputStream(isw)
let lp = edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(ois)
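The EndsWith(".gz") check relies on the file name. With an embedded resource the name comes from the manifest, so an alternative is to look at the first two bytes of the stream: gzip data starts with 0x1F 0x8B, while a plain Java serialization stream starts with 0xAC 0xED. Below is a minimal sketch in plain .NET, with no IKVM dependency. The class name GzipSniffer and the method LooksGzipped are hypothetical names made up for this illustration, and the sketch assumes the stream is seekable (which I believe holds for a FileStream and for the stream returned by GetManifestResourceStream, but verify in your setup).

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of IKVM or Stanford NLP.
public static class GzipSniffer
{
    // Returns true when the stream starts with the gzip magic bytes 0x1F 0x8B.
    // Assumes a seekable stream; the original position is restored before returning.
    public static bool LooksGzipped(Stream stream)
    {
        if (stream == null) throw new ArgumentNullException("stream");
        if (!stream.CanSeek) throw new ArgumentException("Stream must be seekable.", "stream");

        long start = stream.Position;
        int first = stream.ReadByte();   // -1 if the stream is empty
        int second = stream.ReadByte();
        stream.Position = start;         // rewind so the caller can read from the beginning

        return first == 0x1F && second == 0x8B;
    }
}
```

If LooksGzipped returns true for the resource stream, wrap the ikvm.io.InputStreamWrapper in a java.util.zip.GZIPInputStream before creating the java.io.ObjectInputStream, as in the snippet above; otherwise pass the wrapper to ObjectInputStream directly.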