Java library for text/string processing similar to unix/linux utilities

I'm a java programmer. I use bash scripts a lot for text processing.

Utilities like grep, sed, awk, tr, wc and find, along with piping between commands, give such a powerful combination.

However, bash programming lacks portability, testability and the more elegant programming constructs that exist in Java. Bash scripts are also harder to integrate into our other Java products.

I was wondering if anyone knows of any Java text processing libraries out there which might offer what I'm looking for.

It would be so cool to be able to write:

Text.createFromFile("blah.txt").grep("-v","ERROR.*").sed("s/ERROR/blah/g").awk("print $1").writeTo("output.txt");

This might be pie-in-the-sky stuff, but I thought I'd put the question out there anyway.
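For comparison, the pipeline above can be roughly approximated with java.util.stream and java.util.regex alone. This is only a minimal sketch under stated assumptions: the class name TextPipeline, its method names and the sample lines are made up for illustration and do not belong to any existing library, and the sed step uses a different pattern (WARN) because the ERROR lines are already gone after the grep -v step.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class: plain-Java stand-ins for three unix filters.
public class TextPipeline {

    // Like `grep -v regex`: keep only lines in which the regex is NOT found.
    static List<String> grepInvert(List<String> lines, String regex) {
        Pattern p = Pattern.compile(regex);
        return lines.stream()
                .filter(line -> !p.matcher(line).find())
                .collect(Collectors.toList());
    }

    // Like `sed 's/regex/replacement/g'`: replace every match on each line.
    static List<String> sedReplaceAll(List<String> lines, String regex, String replacement) {
        Pattern p = Pattern.compile(regex);
        return lines.stream()
                .map(line -> p.matcher(line).replaceAll(replacement))
                .collect(Collectors.toList());
    }

    // Like `awk '{print $1}'`: first whitespace-separated field of each line.
    static List<String> firstField(List<String> lines) {
        return lines.stream()
                .map(line -> line.trim().split("\\s+")[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample input.
        List<String> input = Arrays.asList("ok started", "ERROR disk full", "WARN low memory");

        List<String> result = firstField(
                sedReplaceAll(grepInvert(input, "ERROR.*"), "WARN", "blah"));

        result.forEach(System.out::println); // prints "ok" then "blah"
    }
}
```

Reading the input with Files.readAllLines and writing the result with Files.write would cover the createFromFile and writeTo ends of the chain. What this sketch does not give is the fluent, chainable style of the wished-for API.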

Ben asked Jul 13 '11 00:07


2 Answers

Unix4j implements some basic unix commands, mainly focussing on text-processing (with support for piping between commands): http://www.unix4j.org

Example (Ben's example, but without awk as this is not currently supported):

Unix4j.fromStrings("1:here is no error", "2:ERRORS everywhere", "3:another ERROR", "4:nothing").toFile("blah.txt");
Unix4j.fromFile("blah.txt").grep(Grep.Options.v, "ERROR.*").sed("s/ERROR/blah/g").toFile("output.txt");     
Unix4j.fromFile("output.txt").toStdOut();       

>>>
1:here is no error
4:nothing

Note:

  • the author of the question is involved in the unix4j project
marco answered Nov 01 '22 07:11


Believe it or not, I have used embedded Ant for many of those tasks.


Update

Ant has Java APIs that allow it to be called from Java projects. This is embedded mode. The reference here is to Ant API 1.6.1; the distribution should include the docs as well.

To use it, you create a new task object, set the appropriate parameters just as you would in build.xml, but via the Java API, and then execute the task.

Something like:

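import java.io.File;
import org.apache.tools.ant.taskdefs.optional.ReplaceRegExp;

ReplaceRegExp regexp = new ReplaceRegExp();
regexp.setMatch("bla");                 // regular expression to search for
regexp.setReplace("blah");              // replacement text (placeholder value); the task needs one
regexp.setFile(new File("inputFile"));  // file to rewrite in place
regexp.execute();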

You may need to set up some other stuff as well.

I'm not sure if it solves your problem, but Ant ships tasks for many jobs like this. Just search through the docs.
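If pulling in Ant feels too heavy for a single substitution, the same kind of in-place regex replacement can be sketched with the standard library. This is not Ant's implementation, just an illustration: the class name InPlaceReplace and the method replaceInFile are made up, the file is assumed to be UTF-8 and small enough to read into memory, and every match is replaced (an assumption about the wanted behaviour).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical helper: rewrite a file in place, replacing every regex match.
public class InPlaceReplace {

    static void replaceInFile(Path file, String regex, String replacement) throws IOException {
        // Read the whole file as UTF-8 text (assumes it fits in memory).
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String updated = Pattern.compile(regex).matcher(content).replaceAll(replacement);
        // Overwrite the original file with the substituted text.
        Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary file so nothing real is touched.
        Path tmp = Files.createTempFile("replace-demo", ".txt");
        Files.write(tmp, "one ERROR, two ERROR".getBytes(StandardCharsets.UTF_8));

        replaceInFile(tmp, "ERROR", "blah");

        // prints "one blah, two blah"
        System.out.println(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
        Files.delete(tmp);
    }
}
```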

Alex Gitelman answered Nov 01 '22 09:11