How can I identify invisible characters in python strings?

Tags:

python

string

SHORT VERSION

I am retrieving a database value that contains a short but complete HTML structure. I want to strip away all of the HTML tags and end up with a single value. The HTML surrounding the relevant value is always the same, so I just need to figure out what kind of line breaks, tabs or whitespace the string contains, so that I can match and remove it.

Is there a place I can paste the String online, or another way I can check the actual content of the String, so that I'll be able to remove it?

LONG VERSION, and what I've already tried:

The string is retrieved from an HP Quality Center database. When it is printed in the console of the automated test execution, the invisible characters show up as two whitespaces. When pasted into Word, Eclipse or the QC script editor, they are shown as a line break.

I've tried to replace the whitespaces with \n, double whitespace and ¶. Nothing works.

I am translating this script from a working VBScript. The problematic invisible characters are defined as vbcrlf and VBCRLF there. For some reason they use lower case in the replace string before the relevant parameter value, and upper case in the string that comes after my relevant substring. They are defined as variables, and are not inside the string itself: <html>"&vbcrlf&"<body>"&vbcrlf&"<div...

This website suggests that I should use \n https://answers.yahoo.com/question/index?qid=20070506205148AAmr92N, as they write:

vbCrLf = "\n" # Carriage return-linefeed combination

I am a little confused by the inconsistency of the upper/lower case use here though...

EDIT:

After googling "Carriage return-linefeed combination", I learned that it can be written as \r\n, from this question: Order of carriage return and new line feed.

But I spent an awfully long time finding it, and it doesn't answer my question of how I can better identify exactly what kind of invisible characters a string contains. I'll leave the question open.

jumps4fun asked Jul 10 '15 12:07


1 Answer

To view the contents of a string (including its "hidden" values) you can always do:

print( [data] )
# or
print( repr(data) )
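
As a rough sketch of what that looks like in practice, assume the database value resembles the hypothetical `data` below (the real Quality Center markup will differ, and Python 3 is assumed). `repr` prints escape sequences instead of rendering them, so a VBScript `vbCrLf` shows up as `\r\n` and can then be replaced explicitly:

```python
# Hypothetical sample value; the real string from the database will differ.
data = "<html>\r\n<body>\r\n<div>42</div>\r\n</body>\r\n</html>"

# repr() shows escape sequences such as \r and \n instead of rendering them.
visible = repr(data)
print(visible)

# Once the separator is known to be \r\n, it can be removed explicitly.
cleaned = data.replace("\r\n", "")
print(cleaned)
```

Replacing only `\n` would leave the `\r` characters behind, which may be why a plain `\n` replacement appears to do nothing useful.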

If you're on a system like the one you described in the comments, you can also write the value to a log file:

with open('/var/log/debug.log', 'w') as fh:
    fh.write( str( [data] ) )

This will only give you a general idea of what your data looks like, but if that solves your problem, great. If you need further assistance, edit your question or submit a new one :)
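
If a general idea is not enough, one possible approach is to list every whitespace or non-printable character together with its position and code point. This is only a sketch, assuming Python 3; `describe_invisible` and `sample` are hypothetical names made up for illustration, not part of any library:

```python
import unicodedata


def describe_invisible(text):
    """Return (index, code point, name) for each whitespace or non-printable character."""
    found = []
    for index, char in enumerate(text):
        if char.isspace() or not char.isprintable():
            # Control characters such as \r and \n have no Unicode name,
            # so a placeholder is passed as the default.
            name = unicodedata.name(char, "<unnamed control character>")
            found.append((index, "U+%04X" % ord(char), name))
    return found


# Hypothetical sample containing CR, LF, a tab and a no-break space.
sample = "a\r\nb\tc\u00a0d"
for entry in describe_invisible(sample):
    print(entry)
```

Here U+000D followed by U+000A is the carriage return and line feed pair that VBScript calls vbCrLf.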

Torxed answered Nov 14 '22 02:11