
Java regex - erase characters followed by \b (backspace)

Tags: java, regex

I have a string built from the user's keystrokes, so it may contain '\b' (backspace) characters.

I want to clean the string so that it contains neither the '\b' characters nor the characters they are meant to erase. For instance, the string:

String str = "\bHellow\b world!!!\b\b\b.";

Should be printed as:

Hello world.

I have tried a few things with replaceAll, and what I have now is:

System.out.println(str.replaceAll("^\b+|.\b+", ""));

Which prints:

Hello world!!.

A single '\b' is handled fine, but a run of several is not: the whole run erases only one preceding character.

So, can I solve it with Java's regex?

EDIT:

I have seen this answer, but it does not seem to apply to Java's replaceAll.
Maybe I'm missing something about the verbatim string...

Elist asked May 11 '15 16:05



2 Answers

It can't be done in one pass unless there is a practical limit on the number of consecutive backspaces (which there isn't), and there is a guarantee (which there isn't) that there are no "extra" backspaces for which there is no preceding character to delete.

This does the job (it's only 2 small lines):

while (str.contains("\b"))
    str = str.replaceAll("^\b+|[^\b]\b", "");

This handles the edge case of input like "x\b\by", where the second backspace has nothing left to delete: once the first one consumes the x, the extra backspace sits at the start of the string and is trimmed on the next pass, leaving just "y".
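
For anyone who wants to try the loop above, here is a minimal runnable sketch. The class name BackspaceRegexDemo and the helper method clean are made up for illustration; the loop and the pattern are the ones from the answer.

```java
public class BackspaceRegexDemo {

    // Hypothetical helper wrapping the loop from the answer.
    // Each pass strips a run of leading backspaces and every
    // "non-backspace character followed by one backspace" pair,
    // so every pass removes at least one backspace and the loop ends.
    static String clean(String str) {
        while (str.contains("\b")) {
            str = str.replaceAll("^\b+|[^\b]\b", "");
        }
        return str;
    }

    public static void main(String[] args) {
        // The string from the question
        System.out.println(clean("\bHellow\b world!!!\b\b\b.")); // Hello world.
        // The edge case: more backspaces than characters to erase
        System.out.println(clean("x\b\by"));                     // y
    }
}
```

Note that "\b" here is a Java string escape, so the regex engine receives an actual backspace character rather than the word-boundary token \b (which would be written "\\b" in Java source).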

Bohemian answered Sep 21 '22 03:09


This looks like a job for Stack!

// requires: import java.util.Stack;
Stack<Character> stack = new Stack<Character>();

// for-each character in the string
for (int i = 0; i < str.length(); i++) {
    char c = str.charAt(i);

    // push if it's not a backspace
    if (c != '\b') {
        stack.push(c);
    // else pop if possible
    } else if (!stack.empty()) {
        stack.pop();
    }
}

// convert stack to string
StringBuilder builder = new StringBuilder(stack.size());

for (Character c : stack) {
    builder.append(c);
}

// print it
System.out.println(builder.toString());

Regex, while nice, isn't well suited to every task. This approach is not as concise as Bohemian's, but it is more efficient. Using a stack is O(n) in every case, while a regex approach like Bohemian's is O(n^2) in the worst case.
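
The same single-pass idea can also be sketched with a StringBuilder standing in for the stack, which avoids boxing each char and the final copy loop. This is a variant, not the code from the answer above, and the class name BackspaceStackDemo and method clean are assumptions made for illustration.

```java
public class BackspaceStackDemo {

    // Hypothetical helper: one pass over the input, using the
    // StringBuilder's tail as the top of the stack.
    static String clean(String str) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c != '\b') {
                out.append(c);                   // "push"
            } else if (out.length() > 0) {
                out.setLength(out.length() - 1); // "pop" the last char
            }
            // a backspace with nothing before it is simply dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("\bHellow\b world!!!\b\b\b.")); // Hello world.
        System.out.println(clean("x\b\by"));                     // y
    }
}
```

Like the Stack version, this works on individual char values, so a backspace after a character outside the Basic Multilingual Plane would remove only half of its surrogate pair.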

Luke Willis answered Sep 23 '22 03:09