What's the best way to split a string into fixed length chunks and work with them in Python?

Tags:

python

I am reading in a line from a text file using:

   file = urllib2.urlopen("http://192.168.100.17/test.txt").read().splitlines()

and outputting it to an LCD display, which is 16 characters wide, via a telnetlib.write command. If a line is longer than 16 characters, I want to break it into 16-character sections and push each section out after a certain delay (e.g. 10 seconds). Once a line is complete, the code should move on to the next line of the input file and continue.

I've tried searching for various solutions and reading up on itertools etc., but my understanding of Python just isn't sufficient to get anything to work without doing it in a very long-winded way, using a tangled mess of if/then/else statements that's probably going to tie me in knots!

What's the best way for me to do what I want?

228

asked Sep 17 '13 16:09

LostRob

2 Answers

One solution would be to use this function:

    def chunkstring(string, length):
        return (string[0+i:length+i] for i in range(0, len(string), length))

This function returns a generator, built with a generator expression. Each item the generator yields is a slice of the string that starts at a multiple of the chunk length and ends one chunk length later.

You can iterate over the generator like a list, tuple or string (for i in chunkstring(s, n):), or convert it into a list (for instance) with list(generator). Generators are more memory-efficient than lists because they generate their elements as they are needed rather than all at once; however, they lack certain features such as indexing.
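To make the difference between a generator and a list concrete, here is a small sketch. The sample string and the variable names are only illustrative. It consumes the generator step by step with next(), shows that indexing a generator raises TypeError, and then converts it to a list when indexing is needed.

```python
def chunkstring(string, length):
    # Same idea as the function above: yield slices of at most `length` characters.
    return (string[i:i + length] for i in range(0, len(string), length))

chunks = chunkstring("abcdefghij", 4)
first = next(chunks)   # take only the first chunk
rest = list(chunks)    # consume whatever is left

# A generator cannot be indexed directly.
try:
    chunkstring("abcdefghij", 4)[0]
    indexable = True
except TypeError:
    indexable = False

# Convert to a list when you need indexing.
as_list = list(chunkstring("abcdefghij", 4))
second = as_list[1]
```

Note that a generator can only be consumed once; call chunkstring again if you need to loop over the chunks a second time.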

This generator also contains any smaller chunk at the end:

    >>> list(chunkstring("abcdefghijklmnopqrstuvwxyz", 5))
    ['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']

Example usage:

    text = """This is the first line.
               This is the second line.
               The line below is true.
               The line above is false.
               A short line.
               A very very very very very very very very very long line.
               A self-referential line.
               The last line.
            """

    lines = (i.strip() for i in text.splitlines())

    for line in lines:
        for chunk in chunkstring(line, 16):
            print(chunk)
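For the delay between chunks that the question asks about, a minimal sketch might look like the following. The names show_lines, send and sleep are hypothetical: send stands in for whatever writes to the display (for instance the telnet write call from the question), and sleep is a parameter only so the pause can be swapped out when trying the code.

```python
import time

def chunkstring(string, length):
    # Yield slices of at most `length` characters.
    return (string[i:i + length] for i in range(0, len(string), length))

def show_lines(lines, send, width=16, delay=10, sleep=time.sleep):
    """Pass each line to `send` in pieces of at most `width` characters,
    pausing `delay` seconds after every piece."""
    for line in lines:
        for chunk in chunkstring(line.strip(), width):
            send(chunk)    # hypothetical output callable, e.g. a display write
            sleep(delay)   # wait before pushing the next piece

# Try it with a list standing in for the display and no delay.
sent = []
long_line = "A line that is longer than sixteen characters."
show_lines(["A short line.", long_line], sent.append, delay=0)
```

Because an empty line produces no chunks, blank lines in the input are skipped rather than sent as empty strings.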

115

answered Oct 02 '22 14:10

rlms

My favorite way to solve this problem is with the re module.

    import re

    def chunkstring(string, length):
        return re.findall('.{%d}' % length, string)

One caveat here is that re.findall will not return a chunk that is less than the length value, so any remainder is skipped.

However, if you're parsing fixed-width data, this is a great way to do it.

For example, if I want to parse a block of text that I know is made up of 32-byte records (like a header section), I find this very readable and see no need to generalize it into a separate function (as in chunkstring):

    for header in re.findall('.{32}', header_data):
        ProcessHeader(header)
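If the remainder does matter, one possible variation is to change the quantifier from exactly n to between 1 and n. This is a sketch and not part of the approach above; the name chunkstring_keep_tail is made up for illustration, and re.DOTALL is added on the assumption that newlines should be treated like any other character.

```python
import re

def chunkstring_keep_tail(string, length):
    # {1,n} is greedy, so every chunk is full length except possibly the last.
    # re.DOTALL lets "." match newline characters as well.
    return re.findall('.{1,%d}' % length, string, flags=re.DOTALL)

chunks = chunkstring_keep_tail("abcdefghijklmnopqrstuvwxyz", 5)
```

Here chunks ends with the one-character leftover 'z', which the exact-length pattern would have dropped.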

answered Oct 02 '22 13:10

carl.anderson
