
Java API for running UIMA Ruta scripts

Tags: java, uima, ruta

I am new to UIMA Ruta. I wrote some annotators in the Ruta scripting language and can run them within the Eclipse IDE. Now I want to write a Java API that runs the scripts automatically on the input provided.

I am using the same example project provided in the UIMA documentation.

So far I have been able to do this:

    import java.io.File;

    import org.apache.uima.UIMAFramework;
    import org.apache.uima.analysis_engine.AnalysisEngine;
    import org.apache.uima.cas.CAS;
    import org.apache.uima.resource.ResourceSpecifier;
    import org.apache.uima.util.XMLInputSource;

    // printUsageMessage() and processFile() are helper methods of
    // ExampleApplication from uimaj-examples and are not shown here.
    public static void main(String[] args) {
        try {
            File taeDescriptor = null;
            File inputDir = null;

            // Read and validate command line arguments
            boolean validArgs = false;
            if (args.length == 2) {
                taeDescriptor = new File(args[0]);
                inputDir = new File(args[1]);

                validArgs = taeDescriptor.exists()
                        && !taeDescriptor.isDirectory()
                        && inputDir.isDirectory();
            }
            if (!validArgs) {
                printUsageMessage();
            } else {
                // get Resource Specifier from XML file
                XMLInputSource in = new XMLInputSource(taeDescriptor);
                ResourceSpecifier specifier = UIMAFramework.getXMLParser()
                        .parseResourceSpecifier(in);

                // for debugging, output the Resource Specifier
                // System.out.println(specifier);

                // create Analysis Engine
                AnalysisEngine ae = UIMAFramework
                        .produceAnalysisEngine(specifier);

                // create a CAS
                CAS cas = ae.newCAS();

                // get all files in the input directory
                File[] files = inputDir.listFiles();
                if (files == null) {
                    System.out.println("No files to process");
                } else {
                    // process every regular file, skipping subdirectories
                    for (File file : files) {
                        if (!file.isDirectory()) {
                            processFile(file, ae, cas);
                        }
                    }
                }
                ae.destroy();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Running the snippet above with the default BasicEngine.xml and an input text file produces the following stack trace:

    org.apache.uima.resource.ResourceInitializationException: Annotator class "org.apache.uima.ruta.engine.RutaEngine" was not found. (Descriptor: file:/D:/uimaOutput/ruta-2.1.0/example-projects/ExampleProject/descriptor/BasicEngine.xml)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:209)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:158)
        at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
        at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
        at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
        at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:369)
        at org.apache.uima.examples.ExampleApplication.main(ExampleApplication.java:81)
    Caused by: java.lang.ClassNotFoundException: org.apache.uima.ruta.engine.RutaEngine
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:186)
        at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:206)
        ... 6 more

I am stuck. Please help.

EDIT:

The Java code above comes from uimaj-examples in the UIMA documentation. After I added the Ruta jars and some general jars to the classpath, it ran fine.

Anshul asked Apr 22 '14 11:04


1 Answer

The problem was already solved in the comments on the question. I just want to extend the answer with some pointers.

The library ruta-core.jar and its dependencies (antlr-runtime, uima, uimafit, ...) need to be on the classpath of the application. The Eclipse plugin ruta-ep-engine.jar bundles these dependencies apart from UIMA itself. For projects built with Maven:

    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>ruta-core</artifactId>
      <version>2.2.0</version>
    </dependency>
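
To confirm that the engine class is actually visible at run time before producing the analysis engine, a small check along these lines may help. This is only a sketch using plain JDK calls: the class name `ClasspathCheck` and the method `isOnClasspath` are made up for illustration, and the engine class name is the one reported in the stack trace of the question.

```java
import java.io.File;

public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class itself, or something it links against, is not available
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the ResourceInitializationException in the question
        String engineClass = "org.apache.uima.ruta.engine.RutaEngine";
        if (isOnClasspath(engineClass)) {
            System.out.println(engineClass + " found");
        } else {
            System.out.println(engineClass + " missing; classpath entries:");
            String classpath = System.getProperty("java.class.path");
            for (String entry : classpath.split(File.pathSeparator)) {
                System.out.println("  " + entry);
            }
        }
    }
}
```

If the class is reported missing, the printed classpath entries show which jars the JVM was actually started with, which is usually enough to spot an absent ruta-core jar.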

The documentation contains examples of how to call UIMA Ruta scripts from within Java:

https://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.ae.basic
https://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.integration

For developers that want to create a command line interface, this class might be interesting: https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-ep-ide-ui/src/main/java/org/apache/uima/ruta/ide/launching/RutaLauncher.java

If you are already in a UIMA environment (a CAS instance is available), the method Ruta.apply(CAS cas, String script) can be used to apply rules to that CAS.

For developers that prefer to use uimaFIT: https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/UimafitTest.java

Peter Kluegl answered Oct 20 '22 18:10