
How to remove ANSI control chars (VT100) from a Java String

I am working on automation and using JSch to connect to remote boxes and automate some tasks.

I am having problems parsing the command results because they sometimes contain ANSI control characters.

I've already seen this answer and this other one, but neither points to a library that does this. I don't want to reinvent the wheel if a solution already exists, and I don't feel confident about those answers.

Right now I am trying this, but I am not sure it is complete enough:

reply = reply.replaceAll("\\[..;..[m]|\\[.{0,2}[m]|\\(Page \\d+\\)|\u001B\\[[K]|\u001B|\u000F", "");

How to remove ANSI control chars (VT100) from a Java String?

Leo asked Aug 07 '14 18:08


1 Answer

Most ANSI/VT100 control sequences have the form ESC [, optionally followed by one or more numbers separated by ;, followed by a final character that is neither a digit nor ;. So something like

reply = reply.replaceAll("\u001B\\[[\\d;]*[^\\d;]","");

or

reply = reply.replaceAll("\\e\\[[\\d;]*[^\\d;]","");  // \e matches escape character

should catch most of them, I think. There may be other cases that you could add individually. (I have not tested this.)
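To make the suggestion above easier to try, here is a minimal sketch that wraps the same regex in a small class. The names AnsiStripper and strip are made up for illustration and the sample string is invented. It only handles sequences of the ESC [ ... form, so a bare ESC or the \u000F character from the question would still need separate handling.

```java
import java.util.regex.Pattern;

public class AnsiStripper {
    // ESC, then '[', then any run of digits and semicolons,
    // then one final character that is neither a digit nor ';'.
    // Compiled once so repeated calls do not recompile the regex.
    private static final Pattern CSI =
            Pattern.compile("\u001B\\[[\\d;]*[^\\d;]");

    // Hypothetical helper: returns the input with matching sequences removed.
    static String strip(String s) {
        return CSI.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Invented sample: bold red "ERROR", a reset, then an erase-line (ESC [ K).
        String reply = "\u001B[1;31mERROR\u001B[0m disk full\u001B[K";
        System.out.println(strip(reply)); // prints "ERROR disk full"
    }
}
```

Precompiling the Pattern is just a convenience if the call runs for every command result; reply.replaceAll(...) with the same regex string behaves the same way.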

Some of the alternatives in the regex you posted start with \\[, rather than the escape character, which may mean that you could be deleting some text you're not supposed to delete, or deleting part of a control sequence but leaving the ESC character in.
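The first risk can be seen with a small sketch. It takes one alternative from the posted regex on its own (the \\[.{0,2}[m] part) and compares it with the ESC-anchored pattern on ordinary text containing a bracket. The class name, method names, and sample text are invented for illustration.

```java
public class LooseAlternativeDemo {
    // One alternative from the question's regex; it does not require ESC.
    static final String LOOSE = "\\[.{0,2}[m]";
    // The ESC-anchored pattern suggested in this answer.
    static final String ANCHORED = "\u001B\\[[\\d;]*[^\\d;]";

    static String stripLoose(String s) {
        return s.replaceAll(LOOSE, "");
    }

    static String stripAnchored(String s) {
        return s.replaceAll(ANCHORED, "");
    }

    public static void main(String[] args) {
        // Ordinary output with no control sequence in it at all.
        String plain = "length [nm] measured";
        System.out.println(stripLoose(plain));    // prints "length ] measured"
        System.out.println(stripAnchored(plain)); // prints "length [nm] measured"
    }
}
```

Because the loose alternative matches "[" followed by up to two characters and an "m", it eats "[nm" out of normal text, while the anchored pattern leaves the line alone.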

ajb answered Sep 17 '22 00:09