 

How to strip color codes used by mIRC users?

Tags: python, irc

I'm writing an IRC bot in Python using irclib and I'm trying to log the messages on certain channels.
The issue is that some mIRC users and some bots write using color codes.
Any idea how I could strip those parts and leave only the plain ASCII text of the message?

daniels asked Jun 09 '09 14:06




4 Answers

Regular expressions are your cleanest bet in my opinion. If you haven't used them before, this is a good resource. For the full details on Python's regex library, go here.

import re
# Raw string, so the backslash escapes reach the regex engine unchanged.
regex = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?", re.UNICODE)

The regex searches for ^C (\x03 in ASCII; you can confirm this by evaluating chr(3) in the Python interpreter), then optionally one or two [0-9] characters, optionally followed by a comma and another one or two [0-9] characters.

(?: ... ) says not to store what was matched inside the parentheses (we don't need to backreference it), ? means to match 0 or 1 times, and {n,m} means to match n to m repetitions of the preceding item. Finally, \d means to match [0-9].
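The same pattern can be spelled out piece by piece. This is only a sketch: it uses re.VERBOSE so that each part carries its own comment, and MIRC_COLOR is a name chosen here for illustration, not something irclib provides.

```python
import re

# Same pattern as above, written with re.VERBOSE so each part can be commented.
# MIRC_COLOR is a hypothetical name used only in this sketch.
MIRC_COLOR = re.compile(
    r"""
    \x03              # the colour control character (^C)
    (?:               # optional, non-capturing: the colour numbers
        \d{1,2}       #   foreground: one or two digits
        (?:           #   optional, non-capturing: the background
            ,\d{1,2}  #     a comma, then one or two digits
        )?
    )?
    """,
    re.VERBOSE,
)

print(MIRC_COLOR.sub("", "\x034,1red on black\x03 plain"))  # red on black plain
```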

The rest can be decoded using the links I refer to above.

>>> regex.sub("", "blabla \x035,12to be colored text and background\x03 blabla")
'blabla to be colored text and background blabla'

chaos' solution is similar, but it may consume more than two digits, and it will not remove any loose ^C characters that may be hanging about (such as the one that closes the colour command).
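Both points can be checked quickly. The following is a small sketch with made-up messages; the variable name color is chosen here for illustration.

```python
import re

color = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")

# At most two digits are consumed, so digits that belong to the message survive.
year_message = "\x0304" + "2009 was a good year"
print(repr(color.sub("", year_message)))  # '2009 was a good year'

# A lone ^C (used to end colouring) is removed as well.
print(repr(color.sub("", "\x0312blue\x03 plain")))  # 'blue plain'
```

The {1,2} bound is what keeps the pattern from swallowing a number that happens to follow the colour code directly.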

Smerity answered Sep 30 '22 09:09


The second-rated and following suggestions are defective: they look for digits after some other control character rather than after the color code character (\x03), so the color digits are left in the text.
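The reason is that alternation (|) binds more loosely than concatenation, so an optional digit group belongs only to the alternative it directly follows. A minimal sketch with two hypothetical patterns (misplaced and attached are names invented for this illustration):

```python
import re

# Hypothetical patterns for illustration. In `misplaced` the digit group
# follows \x0f; in `attached` it follows the colour character \x03.
misplaced = re.compile(r"\x02|\x03|\x0f(?:\d{1,2}(?:,\d{1,2})?)?")
attached = re.compile(r"\x02|\x0f|\x03(?:\d{1,2}(?:,\d{1,2})?)?")

msg = "\x034,1warning\x0f done"

print(repr(misplaced.sub("", msg)))  # '4,1warning done'
print(repr(attached.sub("", msg)))   # 'warning done'
```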

I have combined and improved the earlier posts so that the regex:

  • removes the reverse character
  • removes color codes without leaving digits in the text.

Solution:

import re
regex = re.compile(r"\x1f|\x02|\x12|\x0f|\x16|\x03(?:\d{1,2}(?:,\d{1,2})?)?", re.UNICODE)
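For logging, the pattern could be wrapped in a small helper and applied to each message before it is written out. This is a sketch only: strip_formatting and FORMATTING are hypothetical names, and how the helper is hooked into irclib's event handlers is left out because it depends on the bot.

```python
import re

# Control characters covered by the pattern above, plus the colour sequence.
FORMATTING = re.compile(
    r"\x1f|\x02|\x12|\x0f|\x16|\x03(?:\d{1,2}(?:,\d{1,2})?)?"
)


def strip_formatting(message):
    """Return message with mIRC formatting codes removed (hypothetical helper)."""
    return FORMATTING.sub("", message)


line = "\x02bold\x02 \x1funderlined\x1f \x0304,01coloured\x0f plain"
print(strip_formatting(line))  # bold underlined coloured plain
```

Compiling the pattern once at module level avoids rebuilding it for every logged line.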

frederik answered Sep 30 '22 09:09


As I found this question useful, I figured I'd contribute.

I added a couple of things to the regex:

# Note: the optional digit group follows only \x0f here, not \x03.
regex = re.compile(r"\x1f|\x02|\x03|\x16|\x0f(?:\d{1,2}(?:,\d{1,2})?)?", re.UNICODE)

\x16 removes the "reverse" character. \x0f removes the reset character, which turns off all formatting, bold included.

Xorlev answered Sep 30 '22 09:09


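AutoDl-irssi had a very good one written in Perl. Here it is in Python:

import re

def stripMircColorCodes(line):
    # Remove colour codes that carry both foreground and background numbers.
    line = re.sub(r"\x03\d\d?,\d\d?", "", line)
    # Remove colour codes that carry a foreground number only.
    line = re.sub(r"\x03\d\d?", "", line)
    # Remove any remaining control characters, including a bare ^C.
    line = re.sub(r"[\x01-\x1F]", "", line)
    return line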

sparks answered Sep 30 '22 08:09