
How to detect hidden characters in string (for example zero width space) during debugging

Tags: string, c#

Is there an easy way to detect, during debugging, that a string contains a hidden character (for example, a zero width space)?

Example: During debugging I'm comparing two different strings and they look equal to my eyes. Of course they differ in some hidden characters. How do I find the difference?

I used the string.ToCharArray() method in the "Immediate window" of Visual Studio, but there must be a more convenient way.

cartas asked Apr 03 '12 12:04



1 Answer

You can use this in the Immediate window (the zero width space is U+200B, which is 8203 in decimal, so the C# escape is \u200B):

str.Contains("\u200B")

Or put it in the Watch window, so you only have to click the refresh button next to the watched value to see the result instead of re-entering it in the Immediate window (although you can always press Up and then Enter to repeat the last command).
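If you don't know in advance which character to look for, a small helper that makes every unusual character visible may be more comfortable. The sketch below is only an illustration: the `DebugText` class and its `Escape` method are hypothetical names, not part of .NET or Visual Studio, and the choice to treat everything outside printable ASCII as "unusual" is an assumption you may want to loosen.

```csharp
using System;
using System.Text;

public static class DebugText
{
    // Hypothetical helper: keeps printable ASCII as-is and renders every
    // other UTF-16 code unit as a \uXXXX escape so it shows up in the debugger.
    public static string Escape(string str)
    {
        var sb = new StringBuilder();
        foreach (char c in str)
        {
            if (c >= ' ' && c <= '~')
                sb.Append(c);
            else
                sb.Append("\\u").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

With this compiled into the project being debugged, evaluating `DebugText.Escape(str)` in the Watch or Immediate window shows a zero width space as the literal text \u200B in the middle of the string.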

To check for ANY hidden character, you can keep a static array of all hidden characters and check:

HIDDENS.Any(c => str.Contains(c.ToString()))

Preferably, store the hidden characters as one-character strings instead, and then do:

HIDDENS.Any(str.Contains)
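The answer does not say what goes into HIDDENS, so here is a minimal sketch of one possibility. The list is an assumption and deliberately not exhaustive, and the `HiddenCharCheck` class and `ContainsHidden` method are hypothetical names. The search uses an explicit ordinal comparison, because culture-sensitive searches can treat zero width characters as ignorable.

```csharp
using System;
using System.Linq;

public static class HiddenCharCheck
{
    // Assumed, non-exhaustive list of invisible characters,
    // stored as one-character strings. Extend it for your own data.
    public static readonly string[] HIDDENS =
    {
        "\u200B", // zero width space
        "\u200C", // zero width non-joiner
        "\u200D", // zero width joiner
        "\u2060", // word joiner
        "\uFEFF", // zero width no-break space (byte order mark)
        "\u00AD", // soft hyphen
    };

    // True if str contains at least one entry of HIDDENS.
    public static bool ContainsHidden(string str)
    {
        return HIDDENS.Any(h => str.IndexOf(h, StringComparison.Ordinal) >= 0);
    }
}
```

In the Watch window this would be evaluated as `HiddenCharCheck.ContainsHidden(str)`.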

OR you could be really sophisticated and measure how wide each character renders, flagging any that come out with zero width:

// Requires: using System.Drawing; using System.Linq;
private static readonly Bitmap BMP = new Bitmap(1000, 1000);
private static readonly Graphics GRAPHICS = Graphics.FromImage(BMP);
private static readonly Font FONT = new Font("Arial", 20);
private static readonly RectangleF RECT = new RectangleF(0, 0, 1000, 1000);

// Returns true if any character of text renders with zero width in FONT.
public static bool CheckInvisibleChars(string text)
{
    var stringFormat1 = new StringFormat(StringFormatFlags.MeasureTrailingSpaces);
    // One single-character range per position, covering the whole string.
    // Note: GDI+ accepts at most 32 measurable ranges per call, so strings
    // longer than 32 characters have to be checked in chunks.
    stringFormat1.SetMeasurableCharacterRanges(
        Enumerable.Range(0, text.Length).Select(i => new CharacterRange(i, 1)).ToArray());

    return GRAPHICS.MeasureCharacterRanges(text, FONT, RECT, stringFormat1).Any(
        reg => reg.GetBounds(GRAPHICS).Width.Equals(0f));
}

From here it should also be easy to return information about each hidden character, etc.
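As one way to get that per-character information, the sketch below swaps the rendering-based measurement for a lookup of each character's Unicode category, which needs no Graphics object. It is an alternative technique, not the method above: the `HiddenCharReport` class and `Describe` method are hypothetical names, and treating the Format and Control categories as "hidden" is an assumption (it also flags tabs and line breaks, and it only looks at individual UTF-16 code units).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class HiddenCharReport
{
    // Lists position, code point and category of every character whose
    // Unicode category is Format (e.g. U+200B) or Control (e.g. tab).
    public static List<string> Describe(string text)
    {
        var findings = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            UnicodeCategory cat = char.GetUnicodeCategory(text, i);
            if (cat == UnicodeCategory.Format || cat == UnicodeCategory.Control)
            {
                findings.Add($"index {i}: U+{(int)text[i]:X4} ({cat})");
            }
        }
        return findings;
    }
}
```

For a string such as "ab\u200Bc", this should yield a single entry pointing at index 2 and U+200B.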

SimpleVar answered Sep 21 '22 01:09