How to use HeidelTime temporal tagger inside a Java project?

I would like to automatically identify dates in a stream of documents, and for this I would like to use the open-source HeidelTime project, available here (https://code.google.com/p/heideltime/). I have installed the HeidelTime kit (not the standalone version), and now I am wondering how to reference it and call it from my Java project. I have already added a dependency on HeidelTime to my pom.xml:

    <dependency>
        <groupId>de.unihd.dbs</groupId>
        <artifactId>heideltime</artifactId>
        <version>1.7</version>
    </dependency>

However, I am not sure how to call the classes from that project in my own project. I am using Maven for both. Could anyone who has used it before give me a suggestion or some advice? Many thanks!

Crista23 asked Oct 19 '22 21:10


2 Answers

heideltime-kit is itself a Maven project, so you can add it as a dependency of your own project. (In NetBeans, make sure the heideltime-kit project is open first, then right-click Dependencies --> Add Dependency --> Open Projects --> HeidelTime.)

Then move the config.props file into your project's src/main/resources folder. Set the path to treetagger within config.props.

As far as using the classes goes, you'll want to create an instance of HeidelTimeStandalone (see de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.java), passing POSTagger.TREETAGGER as the posTagger parameter and a hardcoded path to your src/main/resources/config.props file as the configPath parameter. For example:

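    import de.unihd.dbs.heideltime.standalone.*;
    import de.unihd.dbs.uima.annotator.heideltime.resources.Language;

    // The final argument (true) turns on interval tagging.
    HeidelTimeStandalone heidelTime = new HeidelTimeStandalone(Language.ENGLISH,
                                          DocumentType.COLLOQUIAL,
                                          OutputType.TIMEML,
                                          "path/to/config.props",
                                          POSTagger.TREETAGGER, true);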

Then to use HeidelTime to process text, you can simply call the process function:

    // date is the document creation time, passed as a java.util.Date
    String result = heidelTime.process(text, date);
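
With OutputType.TIMEML, the result string is the input text with the recognized expressions wrapped in TIMEX3 tags. If you only need the tagged values, one option is to read them out with the JDK's own XML parser. This is a minimal sketch, not part of HeidelTime: the class name TimexExtractor and its method are made up for illustration, the sample string is hand-written rather than captured from a real run, and it assumes the output is well-formed XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not a HeidelTime class): pulls TIMEX3 elements out of a TimeML string.
final class TimexExtractor {

    /** Returns one "type|value|text" entry per TIMEX3 element, in document order. */
    static List<String> extractTimexes(String timeMl) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The document may declare an external TimeML.dtd; resolve it to an empty
            // source so that parsing does not depend on that file being present.
            builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(timeMl)));

            List<String> timexes = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagName("TIMEX3");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element timex = (Element) nodes.item(i);
                timexes.add(timex.getAttribute("type") + "|"
                        + timex.getAttribute("value") + "|"
                        + timex.getTextContent());
            }
            return timexes;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse TimeML output", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample in the general shape of TimeML output (an assumption).
        String sample = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE TimeML SYSTEM \"TimeML.dtd\">\n"
                + "<TimeML>\n"
                + "The meeting is on <TIMEX3 tid=\"t1\" type=\"DATE\" value=\"2014-10-19\">"
                + "October 19, 2014</TIMEX3>.\n"
                + "</TimeML>";
        for (String timex : extractTimexes(sample)) {
            System.out.println(timex);
        }
    }
}
```

In your own code you would pass the string returned by heidelTime.process(text, date) to extractTimexes instead of the sample.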
jgloves answered Oct 23 '22 21:10


Adding to the reply from jgloves: you might want to parse the HeidelTime result string into a Java object representation. The following code loads the UIMA XMI output into a JCas and iterates over the resulting Timex3 annotations.

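    // Run HeidelTime and get the result as a UIMA XMI document
    HeidelTimeStandalone time = new HeidelTimeStandalone(Language.GERMAN, DocumentType.SCIENTIFIC,
            OutputType.XMI, "config.props", POSTagger.STANFORDPOSTAGGER);
    String xmiRepresentation = time.process(document, documentCreationTime);

    // Load the XMI into a JCas (jcasFactory is assumed to be created elsewhere)
    JCas cas = jcasFactory.createJCas();
    XmiCasDeserializer.deserialize(
            new ByteArrayInputStream(xmiRepresentation.getBytes(StandardCharsets.UTF_8)), cas.getCas());

    // Print every Timex3 annotation found in the document
    for (FSIterator<Annotation> it = cas.getAnnotationIndex(Timex3.type).iterator(); it.hasNext(); ) {
        System.out.println(it.next());
    }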
user3776894 answered Oct 23 '22 20:10