
How to identify a string as being a byte literal?

In Python 3, if I have a string such that:

print(some_str)

yields something like this:

b'This is the content of my string.\r\n'

I know it's a byte literal.

Is there a function that can be used to determine if that string is in byte literal format (versus having, say, the Unicode 'u' prefix) without first interpreting? Or is there another best practice for handling this? I have a situation wherein getting a byte literal string needs to be dealt with differently than if it's in Unicode. In theory, something like this:

if is_byte_literal(some_str):
    pass  # handle byte literal case
else:
    pass  # handle unicode case
Nathaniel Ford asked Sep 29 '16 19:09




2 Answers

The easiest and, arguably, best way to do this would be by utilizing the built-in isinstance with the bytes type:

some_str = b'hello world'
if isinstance(some_str, bytes):
    print('bytes')
elif isinstance(some_str, str):
    print('str')
else:
    pass  # handle any other type here

Since a byte literal is always an instance of bytes, isinstance(some_str, bytes) evaluates to True for it.
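As a rough sketch, the isinstance check can be wrapped in the is_byte_literal helper the question asks for. The to_text function and its UTF-8 default are assumptions made for illustration only; decode with whatever encoding the data actually uses.

```python
def is_byte_literal(value):
    """Return True if value is a bytes object (what a b'...' literal creates)."""
    return isinstance(value, bytes)


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as str, decoding it first if it is bytes.

    The UTF-8 default is an assumption, not something the question specifies.
    """
    if is_byte_literal(value):
        return value.decode(encoding)
    return value


raw = b'This is the content of my string.\r\n'
text = 'This is the content of my string.\r\n'

# Both inputs end up as the same str once the bytes case is decoded.
print(to_text(raw) == to_text(text))
```

Normalizing to str at the boundary like this means the rest of the code only has to deal with one type.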

Dimitris Fasarakis Hilliard answered Oct 23 '22 12:10



Just to complement the other answer, the built-in type also gives you this information. You can use it with is and the corresponding type to check accordingly.

For example, in Python 3:

a = 'foo'
print(type(a) is str)   # prints `True`
a = b'foo'
print(type(a) is bytes) # prints `True` as well
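One difference worth keeping in mind: type(x) is bytes matches only exact bytes objects, while isinstance also accepts subclasses. A small sketch, where TaggedBytes is a made-up subclass used purely for illustration:

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass, defined only for this example."""


value = TaggedBytes(b'foo')

print(isinstance(value, bytes))  # True: subclasses count as bytes
print(type(value) is bytes)      # False: the exact type is TaggedBytes
```

If subclasses should be treated as bytes too, isinstance is usually the safer check.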
怀春춘 answered Oct 23 '22 13:10
