
TreeTagger in R

I have downloaded TreeTagger v3.2 for Windows and configured it per install.txt. I am trying to use it in R with the koRpus package. I have set the kRp.env as follows:

set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", 
   preset="en", treetagger="manual", format="file", 
    TT.tknz=TRUE, encoding="UTF-8" )

My data to be tagged is in a file, and I am trying to tag it with treetag("myfile.txt"), but it throws this error:

Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE, : 'data' must be of a vector type, was 'NULL'

In addition: Warning message: running command 'C:\windows\system32\cmd.exe /c C:\TreeTagger\bin\tag-english.bat C:\Users\vivsingh\Desktop\NLP\tree_tag_ex.txt' had status 255

The standalone TreeTagger works on my Windows machine. Any idea how to make it work from R?

vivsingh asked Apr 10 '26 10:04



2 Answers

I had the exact same error and warning while trying lemmatization on an R word vector, following the Bernhard Learns blog, on Windows 7 with R 3.4.1 (x64). The issue also appeared with the textstem package, even though TreeTagger ran properly in a cmd window.

I combined several answers I found on this post; here are the steps and code that work for me:

Go to the rJava folder in the R win-library (~\Documents\R\win-library\3.4\rJava\jri\x64\jri.dll) and copy jri.dll into the parent folder (~\Documents\R\win-library\3.4\rJava\jri), replacing the jri.dll that is there (thanks kravi!).

Close and restart R.

library(koRpus)
library(dplyr)  # provides tbl_df()

set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")
lemma_tagged <- treetag(lemma_unique$word_clean, treetagger="manual", format="obj", TT.tknz=FALSE, lang="en", TT.options=list(path="c:/TreeTagger", preset="en"))
# the TT.res slot holds the tagged tokens
lemma_tagged_tbl <- tbl_df(lemma_tagged@TT.res)
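The jri.dll copy from the first step can also be done from R instead of by hand. This is only a rough sketch: the helper name use_x64_jri is made up for illustration, it assumes rJava is installed with a jri\x64 subfolder, and the .bak backup is an extra precaution that is not part of the steps above.

```r
# Hypothetical helper: overwrite <jri_dir>/jri.dll with <jri_dir>/x64/jri.dll.
use_x64_jri <- function(jri_dir) {
  src <- file.path(jri_dir, "x64", "jri.dll")
  dst <- file.path(jri_dir, "jri.dll")
  if (!nzchar(jri_dir) || !file.exists(src)) {
    message("x64 jri.dll not found under: ", jri_dir)
    return(invisible(FALSE))
  }
  # Keep a backup of the existing DLL before replacing it (assumption, optional)
  if (file.exists(dst)) file.copy(dst, paste0(dst, ".bak"), overwrite = TRUE)
  invisible(file.copy(src, dst, overwrite = TRUE))
}

# system.file() returns "" when rJava is not installed
use_x64_jri(system.file("jri", package = "rJava"))
```

R still needs to be restarted afterwards so the replaced DLL is picked up.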

Hope it helps.

Xochitl C. answered Apr 12 '26 00:04



I am posting this answer to keep a record. I faced the same issue, caused by an incorrectly specified location of jri.dll on a 64-bit processor with Windows 8.1. If we call set.kRp.env(TT.cmd="manual", lang="en", TT.options=list(path="/path/to/tree-tagger-windows-x.x/TreeTagger", preset="en")) and follow either of the following two steps, the error is resolved:

  1. If only the 64-bit version of R is installed, specify the proper paths for these variables:

    LD_LIBRARY_PATH = /path/to/rJava/jri
    JAVA_HOME = /path/to/jdk1.x.x
    java.library.path = /path/to/rJava/jri/jri.dll
    CLASSPATH = /path/to/rJava/jri

  2. If both the 32-bit and 64-bit versions of R are already installed, copy jri.dll from /path/to/rJava/jri/x64/jri.dll and replace the one at /path/to/rJava/jri/jri.dll. Then set the paths of the four variables mentioned above.
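For a quick trial, the four variables can be set for the current R session with Sys.setenv() before loading rJava or koRpus, rather than in the Windows system settings. This is a sketch only: the /path/to/... values are the placeholders from the list above and must be replaced, and java.library.path is set as an environment variable here simply to mirror that list. Whether JRI reads each of them is not checked by this code.

```r
# Session-only settings; they are lost when R is closed.
# Replace every /path/to/... placeholder with the real location.
Sys.setenv(
  LD_LIBRARY_PATH   = "/path/to/rJava/jri",
  JAVA_HOME         = "/path/to/jdk1.x.x",
  java.library.path = "/path/to/rJava/jri/jri.dll",
  CLASSPATH         = "/path/to/rJava/jri"
)

# Show what the session now sees for the four variables
jri_vars <- Sys.getenv(c("LD_LIBRARY_PATH", "JAVA_HOME",
                         "java.library.path", "CLASSPATH"))
print(jri_vars)
```

To make the values permanent, set the same names in the Windows environment variable dialog instead.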

kravi answered Apr 12 '26 01:04



