
UIMA Ruta: rule for matching a combination of chars and numbers

Tags:

ruta

I've just started with Ruta and would like to write a rule that matches any combination of letters, digits, and dots (.).

(The Java regex for it would be ([a-z0-9.]+) )

For example:

abcd.03ef0.3abc

03a.bcd.03eeff903a.bc

user3778893 asked Feb 01 '26 17:02


1 Answer

Something like the following:

(SW | NUM | PERIOD)+{-> MyType};

Or, if uppercase chars should also be included:

(W | NUM | PERIOD)+{-> MyType};

If no whitespace may occur within a match, change the filtering settings before applying the rule:

Document{-> RETAINTYPE(SPACE,BREAK,MARKUP)};

The rule is tried at every token, so it also matches the shorter suffixes of a longer match. To avoid these overlapping matches, you can use MARKONCE instead of the implicit action, add a (negated) condition -PARTOF(MyType), or change the matching strategy with GREEDYANCHORING.
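
Put together, the pieces above could look roughly like the following script. This is only a sketch: the DECLARE line is an assumption (the rules above do not show where MyType is defined), and it has not been run against a particular Ruta version.

```
// Assumption: MyType is not declared elsewhere (e.g. in a type system).
DECLARE MyType;

// Make whitespace, line breaks and markup visible to the following rules,
// so that a match cannot silently span across them.
Document{-> RETAINTYPE(SPACE, BREAK, MARKUP)};

// Lowercase words, numbers and periods in any combination.
// MARKONCE is used in place of the implicit action to avoid
// annotating the shorter, overlapping matches as well.
(SW | NUM | PERIOD)+{-> MARKONCE(MyType)};

// Alternative: keep the implicit action and add the negated condition.
// (SW | NUM | PERIOD)+{-PARTOF(MyType) -> MyType};
```

Replace SW with W in the rule if uppercase chars should be covered too.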

Peter Kluegl answered Feb 03 '26 08:02