How to make a fixed-size byte variable in Python

Let's say I have a string variable (Unicode, if it matters) which is less than 100 bytes. I want to create another variable that is exactly 100 bytes in size, includes this string, and is padded with zeros or whatever. How would I do it in Python 3?

203

asked Jun 17 '14 18:06

Mikael S.

4 Answers

For assembling packets to go over the network, or for assembling byte-perfect binary files, I suggest using the struct module.

struct — Interpret bytes as packed binary data

Just for the string, you might not need struct, but as soon as you start also packing binary values, struct will make your life much easier.
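As a minimal sketch of the struct approach: the `s` format code with a count gives a fixed-width byte field, padding short input with null bytes. The record layout in the second call (a 16-bit length prefix followed by the field) is a made-up example, not something from the question.

```python
import struct

text = "具有"
payload = text.encode("utf-8")  # 6 bytes in UTF-8

# "100s" is a fixed 100-byte field: shorter input is padded with null bytes,
# and longer input is truncated to fit.
field = struct.pack("100s", payload)

# Hypothetical record layout: big-endian unsigned 16-bit length, then the field.
record = struct.pack(">H100s", len(payload), payload)

# Unpacking returns the padded field; strip the trailing nulls to recover the text.
(raw,) = struct.unpack("100s", field)
recovered = raw.rstrip(b"\x00").decode("utf-8")
```

Note that truncation happens silently, so check `len(payload)` first if losing data matters.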

Depending on your needs, you might be better off with an off-the-shelf network serialization library, such as Protocol Buffers; or you might even just use JSON for the wire format.

Protocol Buffer Basics: Python
PyMOTW - JavaScript Object Notation Serializer

190

answered Sep 29 '22 06:09

steveha

Something like this should work:

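st = "具有"
by = bytes(st, "utf-8")  # encode first: the padding is counted in bytes, not characters
by += b"0" * (100 - len(by))  # pads with the ASCII digit "0" (0x30); use b"\x00" for null bytes
print(by)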
# b'\xe5\x85\xb7\xe6\x9c\x890000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'

Obligatory addendum, since your original post seems to conflate the length of a string with the length of its encoded byte representation: Python unicode explanation
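To illustrate the difference, here is a small sketch comparing the character count of the string above with the byte count of two of its encodings:

```python
st = "具有"
encoded = st.encode("utf-8")

print(len(st))                      # characters (code points): 2
print(len(encoded))                 # bytes in UTF-8: 6
print(len(st.encode("utf-16-le")))  # bytes in UTF-16 (little-endian, no BOM): 4
```

The padding length therefore has to be computed from the encoded bytes, not from the string.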

answered Sep 29 '22 07:09

Daenyth

To pad with null bytes, you can do what the stdlib base64 module does: bytes(n) produces n null bytes, so you can append exactly as many as are missing.

some_data = b'foosdsfkl\x05'
null_padded = some_data + bytes(100 - len(some_data))
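The same idea can be wrapped in a small helper. This is only a sketch: the name `pad_to_size` and its parameters are hypothetical, and it uses `bytes.ljust` rather than concatenation. It also makes the too-long case explicit instead of leaving it to chance.

```python
def pad_to_size(data: bytes, size: int = 100, fill: bytes = b"\x00") -> bytes:
    """Return data right-padded with the single byte `fill` to exactly `size` bytes."""
    if len(data) > size:
        raise ValueError(f"data is {len(data)} bytes, which exceeds {size}")
    return data.ljust(size, fill)


padded = pad_to_size("具有".encode("utf-8"))
```

Raising on over-long input is one possible choice; truncating would be another, depending on what the receiving side expects.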

answered Sep 29 '22 06:09

Chris Wesseling

Here's a roundabout way of doing it:

>>> import sys
>>> a = "a"
>>> sys.getsizeof(a)
22
>>> a = "aa"
>>> sys.getsizeof(a)
23
>>> a = "aaa"
>>> sys.getsizeof(a)
24

So, following this pattern (21 bytes of overhead on this interpreter plus one byte per character), an ASCII string whose in-memory size is 100 bytes needs to be 79 characters long:

>>> a = "".join(["a" for i in range(79)])
>>> len(a)
79
>>> sys.getsizeof(a)
100

The approach above is a fairly simple way of "calibrating" strings to find the length that gives a target size. You could write a function that pads a string out to the appropriate memory size, which would also account for other encodings.

import sys

def padder(strng):
    TARGETSIZE = 100
    padChar = "0"

    # sys.getsizeof reports the in-memory size of the str object, header included
    curSize = sys.getsizeof(strng)

    if curSize >= TARGETSIZE:
        # Already at or over the target; handle over-long strings here if you need to
        return strng

    # Prepend one pad character per missing byte (assumes each adds one byte, as for ASCII strings)
    return padChar * (TARGETSIZE - curSize) + strng

answered Sep 29 '22 05:09

wnnmaw

Tags: python, string, python-3.x, byte