I'm trying to print a recursive data structure in Perl for debugging purposes. Hash of hashes of arrays of hashes, that kind of thing ...
Some of its basic data elements are printable strings, so I'm printing those. Unfortunately, some of the basic data elements are binary (think content from image files). They screw up my debug output with gibberish.
How would I detect which is which, so I can avoid printing the binary as if it was a string?
(I am aware of Data::Dumper. My question is not about whether or not I should replicate that functionality, but about how to distinguish between text and binary strings.)
perlrecharclass defines these POSIX character classes:
[[:graph:]] (\p{XPosixGraph}): Any printable character, excluding a space. Any character that is graphical, that is, visible. This class consists of all alphanumeric characters and all punctuation characters.
[[:print:]] (\p{XPosixPrint}): Any printable character, including a space. All printable characters, which is the set of all graphical characters plus those whitespace characters which are not also controls.
So you could match any character that does not have the Unicode print property (note the capital P, which negates the property), e.g.:
/\P{XPosixPrint}/
I suspect what you really want is to detect control characters, which screw up the terminal (note lower-case p):
/\p{XPosixCntrl}/
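As a rough sketch of how the control-character test could be used while walking the data structure: the helper name printable_or_placeholder and the placeholder format are made up for illustration, and exempting tab, line feed and carriage return is an assumption about what you consider harmless in debug output.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value unchanged if it looks like text,
# otherwise a short placeholder that is safe to print.
sub printable_or_placeholder {
    my ($value) = @_;

    # \p{XPosixCntrl} matches any control character. Tab, line feed and
    # carriage return are controls too, so the negative lookahead exempts
    # them (an assumption; tighten or loosen this to taste).
    if ($value =~ /(?![\t\n\r])\p{XPosixCntrl}/) {
        return sprintf '<binary, length %d>', length $value;
    }
    return $value;
}

print printable_or_placeholder("hello\tworld\n");
print printable_or_placeholder("\xFF\xD8\x00\x10"), "\n";
```

Note that this treats a string as binary as soon as it contains a single stray control character, which is probably what you want for debug output but may be too strict elsewhere.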
Something like this will get you started:
$string_is_unprintable = $string =~ /[^\t\n\x20-\x7e]/;
Depending on your locale and terminal settings, you might also tolerate characters with ordinal values above 127 (0x7f).
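One way to make that tolerance switchable is sketched below. The function name is_unprintable and its $allow_high flag are hypothetical, and the sketch assumes that "tolerating" high ordinals simply means not flagging them; whether your terminal renders them sensibly is a separate question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string contains a character outside
# the accepted set, 0 otherwise.
sub is_unprintable {
    my ($string, $allow_high) = @_;

    my $bad = $allow_high
        # Lenient: reject only ASCII controls (except tab and newline)
        # and DEL; anything above 0x7f is accepted.
        ? qr/[\x00-\x08\x0b-\x1f\x7f]/
        # Strict: accept only tab, newline and printable ASCII.
        : qr/[^\t\n\x20-\x7e]/;

    return $string =~ $bad ? 1 : 0;
}

print is_unprintable("caf\xe9"),    "\n";   # strict check
print is_unprintable("caf\xe9", 1), "\n";   # lenient check
```

Be aware that the lenient mode will also let through most bytes of compressed or image data that happen to be above 0x7f; such data is usually still caught because it almost always contains low control bytes as well.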