Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python regex to match VT100 escape sequences

I'm writing a Python program that logs terminal interaction (similar to the script program), and I'd like to filter out the VT100 escape sequences before writing to disk. I'd like to use a function like this:

def strip_escapes(buf):
    escape_regex = re.compile(???) # <--- this is what I'm looking for
    return escape_regex.sub('', buf)

What should go in escape_regex?

like image 323
Lorin Hochstein Avatar asked Oct 22 '11 04:10

Lorin Hochstein


People also ask

How to escape in RegEx Python?

Python also uses backslash ( \ ) for escape sequences (i.e., you need to write \\ for \ , \\d for \d ), but it supports raw string in the form of r'...' , which ignore the interpretation of escape sequences - great for writing regex.

How can I remove the ANSI escape sequences from a string in Python?

You can use regexes to remove the ANSI escape sequences from a string in Python. Simply substitute the escape sequences with an empty string using re. sub(). The regex you can use for removing ANSI escape sequences is: '(\x9B|\x1B\[)[0-?]

What is d RegEx?

The RegExp \D Metacharacter in JavaScript is used to search non digit characters i.e all the characters except digits. It is same as [^0-9]. Syntax: /\D/

What is RegEx function in Python?

A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern. RegEx can be used to check if a string contains the specified search pattern.


1 Answers

The combined expression for escape sequences can be something generic like this:

(\x1b\[|\x9b)[^@-_]*[@-_]|\x1b[@-_]

Should be used with re.I

This incorporates:

  1. Two-byte sequences, i.e. \x1b followed by a character in the range of @ until _.
  2. One-byte CSI, i.e. \x9b as opposed to \x1b + "[".

However, this will not work for sequences that define key mappings or otherwise included strings wrapped in quotes.

like image 68
Ja͢ck Avatar answered Sep 24 '22 15:09

Ja͢ck