 

Displaying Control character [SOH] as blank space or else in Notepad++

Tags: notepad++

I am debugging/monitoring a log file that makes extensive use of the control character [SOH].
This makes the logs barely readable (at least to me, in NP++), but it has to be this way, as the character serves a purpose in the protocol I am monitoring.

How can I display that character in a friendlier way in NP++?

EDIT: Replacement is not an option, as I just want to tail the file, not edit it.

Mehdi LAMRANI asked Jan 10 '12 14:01





1 Answer

Introduction

Notepad++ uses Scintilla as its editor component. Scintilla has a message, SCI_SETCONTROLCHARSYMBOL(int symbol), that sets the character used to display control characters. The Scintilla documentation describes it as follows:

SCI_SETCONTROLCHARSYMBOL(int symbol)

SCI_GETCONTROLCHARSYMBOL

By default, Scintilla displays control characters (characters with codes less than 32) in a rounded rectangle as ASCII mnemonics: "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL", "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI", "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB", "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US". These mnemonics come from the early days of signaling, though some are still used (LF = Line Feed, BS = Back Space, CR = Carriage Return, for example).

You can choose to replace these mnemonics by a nominated symbol with an ASCII code in the range 32 to 255. If you set a symbol value less than 32, all control characters are displayed as mnemonics. The symbol you set is rendered in the font of the style set for the character. You can read back the current symbol with the SCI_GETCONTROLCHARSYMBOL message. The default symbol value is 0.
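The rule quoted above can be modelled in a few lines of Python. This is only an illustrative sketch, not Scintilla's code: the names `MNEMONICS` and `render` are made up for the example, square brackets stand in for the rounded rectangle, and the special handling an editor gives to tabs and line ends is ignored.

```python
# Mnemonics for ASCII codes 0..31, in the order listed in the Scintilla docs.
MNEMONICS = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]


def render(text: str, control_char_symbol: int = 0) -> str:
    """Hypothetical model of the display rule described in the docs.

    A symbol value below 32 (the default is 0) shows each control
    character as its mnemonic; any other value shows that single
    character in place of every control character.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code >= 32:
            out.append(ch)  # ordinary character, shown as-is
        elif control_char_symbol < 32:
            out.append(f"[{MNEMONICS[code]}]")  # default: mnemonic
        else:
            out.append(chr(control_char_symbol))  # nominated symbol
    return "".join(out)


sample = "a=1\x01b=2"  # made-up log fragment with one SOH
print(render(sample))      # a=1[SOH]b=2
print(render(sample, 32))  # a=1 b=2
```

With the symbol set to 32 (space), the SOH separators read as plain gaps, which is the effect the macro below aims for.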

There's probably a "right" way to do this, but I'm going to give you a very hack-y way of accomplishing this.

Technique

Edit the file %APPDATA%\Notepad++\shortcuts.xml using anything EXCEPT Notepad++.

Add the following to the <Macros> section of the file to manually add a macro:

<Macro name="RemoveControl" Ctrl="no" Alt="no" Shift="no" Key="0">
        <Action type="0" message="2388" wParam="32" lParam="0" sParam="" />
</Macro>

Note that you can assign a keyboard shortcut with the Ctrl, Alt, Shift and Key attributes. The wParam sets the character that will be displayed instead of the spelled-out codes. In this case, code 32 is a space in the ASCII standard. 2388 is the numeric value of the SCI_SETCONTROLCHARSYMBOL message constant.

Save the file
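
If you would rather script the edit than do it by hand, something like the following Python sketch could add the same macro. It is an assumption-laden example: the function name `add_macro` is made up, the sample document assumes a root element containing a `<Macros>` child, and re-serializing with ElementTree drops XML comments and may change whitespace, so back up the real shortcuts.xml first and run this only while Notepad++ is closed.

```python
import xml.etree.ElementTree as ET


def add_macro(xml_text: str, name: str = "RemoveControl", symbol: int = 32) -> str:
    """Return xml_text with the control-character macro added (hypothetical helper).

    The macro sends message 2388 (SCI_SETCONTROLCHARSYMBOL) with
    wParam set to `symbol`, mirroring the hand-written entry above.
    """
    root = ET.fromstring(xml_text)
    macros = root.find("Macros")
    if macros is None:
        # Assumption: create the section if the file has none yet.
        macros = ET.SubElement(root, "Macros")

    # Leave the text untouched if a macro with this name already exists.
    if any(m.get("name") == name for m in macros.findall("Macro")):
        return xml_text

    macro = ET.SubElement(
        macros,
        "Macro",
        {"name": name, "Ctrl": "no", "Alt": "no", "Shift": "no", "Key": "0"},
    )
    ET.SubElement(
        macro,
        "Action",
        {"type": "0", "message": "2388", "wParam": str(symbol),
         "lParam": "0", "sParam": ""},
    )
    return ET.tostring(root, encoding="unicode")


# Made-up minimal document standing in for the real shortcuts.xml.
sample = "<NotepadPlus><Macros /></NotepadPlus>"
print(add_macro(sample))
```

To apply it to the real file you would read %APPDATA%\Notepad++\shortcuts.xml into a string, pass it through `add_macro`, and write the result back.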

Use

Now you can change the behavior of Notepad++ at runtime. To use this, do the following:

  1. Open Notepad++. Simply open the editor. If you open a file directly (i.e. via the "Edit with Notepad++" context menu) you will get weird behavior.
  2. Activate the macro from the menu (or with your shortcut). If there's a way to automate running a macro on startup, it would be nice to add it here.
  3. Open your file. Nothing new here.

Notes

  1. The positions with the control character will still be inverted (white text on black background by default).
  2. If you activate the macro when you have a document open it will not take effect right away. You'll have to do something to force the window to redraw.
  3. A look at the Scintilla.h file may open up other options which could be similarly exploited.
Adam Hawkes answered Sep 27 '22 17:09
