Does UIMA provide only a wrapper, or is it like Stanford CoreNLP and GATE?

Stanford CoreNLP and GATE provide various NLP operations such as NER and POS tagging. Some NLP operations, such as a tokenizer and the Snowball stemmer, are also available as UIMA components. So, is UIMA comparable to Stanford CoreNLP and GATE, or is it meant to wrap these kinds of APIs for use in a pipeline?

Gaurav asked Jun 12 '14 14:06



1 Answer

The core UIMA framework does not provide specific NLP tools. It provides the means to build and run analytics workflows from UIMA-compliant components. Because the data to be analyzed can grow quite large in real-world applications, UIMA focuses on scalability, offering distributed runtime environments such as UIMA-AS and UIMA-DUCC. However, UIMA is useful not only at large scale, but also for embedding analytics into applications and, in a scientific context, for building language processing experiments.
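
To make that division of labour concrete, here is a minimal Python sketch of the general idea: a "framework" that only defines a shared document structure and runs components in order, with no NLP logic of its own. This is not the UIMA API (UIMA itself is Java-based); `Document`, `Annotation`, `run_pipeline` and the two toy components are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    kind: str   # e.g. "Token"
    begin: int  # character offset where the span starts
    end: int    # character offset just past the span


@dataclass
class Document:
    """Shared container passed between components (hypothetical, not a UIMA class)."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)


Component = Callable[[Document], None]


def run_pipeline(doc: Document, components: List[Component]) -> Document:
    """The 'framework' part: it only sequences components and knows no NLP."""
    for component in components:
        component(doc)
    return doc


def whitespace_tokenizer(doc: Document) -> None:
    """Toy component: adds a Token annotation for each whitespace-separated span."""
    pos = 0
    for word in doc.text.split():
        begin = doc.text.index(word, pos)
        end = begin + len(word)
        doc.annotations.append(Annotation("Token", begin, end))
        pos = end


def capitalized_token_tagger(doc: Document) -> None:
    """Toy component: marks capitalized tokens, relying on the tokenizer's output."""
    for ann in list(doc.annotations):
        if ann.kind == "Token" and doc.text[ann.begin].isupper():
            doc.annotations.append(Annotation("Capitalized", ann.begin, ann.end))


doc = run_pipeline(
    Document("UIMA runs components"),
    [whitespace_tokenizer, capitalized_token_tagger],
)
```

The point of the sketch is that `run_pipeline` stays the same no matter which components are plugged in; all language-specific behaviour lives in the components and in the annotations they exchange.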

There are several collections of UIMA components that provide NLP tools, often wrapping third-party solutions such as OpenNLP, Stanford CoreNLP, etc.:

  • ClearTK - framework for developing statistical NLP components, also includes wrappers for some third-party tools
  • cTAKES - information extraction from electronic medical record clinical free-text
  • DKPro Core - collection of UIMA components for NLP, wrapping many third-party tools
  • UIMA Addons - small set of components provided by the UIMA team itself
  • U-Compare - integrated text mining/natural language processing system

These are some of the major collections at the time of writing. You may find additional sources of UIMA components if you search for them.

The core UIMA framework is comparable to GATE Embedded, minus any processing resources that GATE provides out of the box. The UIMA Ruta workbench could be said to be distantly related to the GATE Developer workbench, or more specifically to JAPE.

UIMA does not compare well to Stanford CoreNLP because UIMA does not focus on offering specific NLP components, while CoreNLP does.

NLP tools like CoreNLP tend to be wrapped as UIMA components for use within UIMA pipelines.
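
As a rough illustration of what "wrapping" means here, the following Python sketch shows the adapter pattern in isolation: a component reads what earlier pipeline steps produced, calls an external tool in that tool's own format, and writes the results back into the shared document. Everything in it is hypothetical: `third_party_tag` is a stand-in for an external tool, and `TaggerComponent` and the dictionary-based document are invented for this example and are not UIMA or CoreNLP APIs.

```python
from typing import Dict, List, Tuple


def third_party_tag(words: List[str]) -> List[Tuple[str, str]]:
    """Stand-in for an external tagger with its own input/output format (hypothetical)."""
    return [(w, "NUM" if w.isdigit() else "WORD") for w in words]


class TaggerComponent:
    """Adapter: translates between the pipeline's shared document and the tool's API."""

    def process(self, doc: Dict) -> None:
        # 1. Read what earlier components produced (token offsets).
        tokens = doc["tokens"]
        words = [doc["text"][begin:end] for begin, end in tokens]
        # 2. Call the wrapped tool in its native format.
        tagged = third_party_tag(words)
        # 3. Write the results back as offset-based annotations.
        doc["tags"] = [
            (begin, end, tag) for (begin, end), (_, tag) in zip(tokens, tagged)
        ]


doc = {"text": "room 42", "tokens": [(0, 4), (5, 7)]}
TaggerComponent().process(doc)
```

The wrapped tool never sees the pipeline's data model, and the rest of the pipeline never sees the tool's native format; the adapter is the only place where the two meet.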

Frameworks like GATE are typically not wrapped as UIMA components, but specific NLP tools offered as GATE plugins might be wrapped.

Disclosure: I work on the Apache UIMA project and on the DKPro Core project.

rec answered Dec 03 '22 02:12
