
What's the best way to split a string into fixed length chunks and work with them in Python?

Tags:

python

I am reading in a line from a text file using:

   file = urllib2.urlopen("http://192.168.100.17/test.txt").read().splitlines() 

and outputting it to an LCD display, which is 16 characters wide, in a telnetlib.write command. If the line read is longer than 16 characters, I want to break it into 16-character sections and push each section out after a certain delay (e.g. 10 seconds). Once that is complete, the code should move on to the next line of the input file and continue.

I've tried searching for various solutions and reading up on itertools etc., but my understanding of Python just isn't sufficient to get anything to work without doing it in a very long-winded way, using a tangled mess of if/then/else statements that's probably going to tie me in knots!

What's the best way for me to do what I want?

LostRob asked Sep 17 '13 16:09



2 Answers

One solution would be to use this function:

    def chunkstring(string, length):
        return (string[0+i:length+i] for i in range(0, len(string), length))

This function returns a generator, built with a generator expression. Each item is a slice of the string that starts at 0 plus a multiple of the chunk length and ends at the chunk length plus that same multiple.
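To make the slice bounds concrete, here is a small sketch that lists the (start, stop) pairs such a generator would slice with. The helper name chunk_bounds is made up for illustration and is not part of the answer's code.

```python
def chunk_bounds(total, length):
    """Return the (start, stop) pairs used to slice a string of `total` characters."""
    # The last stop may run past the end of the string; slicing clamps it
    # silently, which is why the final chunk can be shorter than `length`.
    return [(i, i + length) for i in range(0, total, length)]


alphabet = "abcdefghijklmnopqrstuvwxyz"
bounds = chunk_bounds(len(alphabet), 5)
last_start, last_stop = bounds[-1]
print(bounds)
print(alphabet[last_start:last_stop])
```

The final pair reaches beyond the 26 characters available, yet the slice simply returns what is left.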

You can iterate over the generator as you would over a list, tuple or string (for i in chunkstring(s, n):), or convert it into a list, for instance, with list(generator). Generators are more memory efficient than lists because they generate their elements as they are needed rather than all at once; however, they lack certain features such as indexing.
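A short sketch of those trade-offs, assuming the same chunkstring definition (repeated here so the snippet runs on its own): a generator cannot be indexed, and it can only be consumed once.

```python
def chunkstring(string, length):
    return (string[i:i + length] for i in range(0, len(string), length))


gen = chunkstring("abcdefghij", 4)

# Generators do not support indexing; this raises TypeError
try:
    gen[0]
    indexable = True
except TypeError:
    indexable = False

first = next(gen)      # pull a single chunk on demand
rest = list(gen)       # consume whatever is left
leftover = list(gen)   # the generator is now exhausted

print(indexable, first, rest, leftover)
```

If you need random access or several passes over the chunks, converting to a list once is probably the simpler choice.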

The generator also yields any shorter chunk left over at the end:

    >>> list(chunkstring("abcdefghijklmnopqrstuvwxyz", 5))
    ['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']

Example usage:

    text = """This is the first line.
               This is the second line.
               The line below is true.
               The line above is false.
               A short line.
               A very very very very very very very very very long line.
               A self-referential line.
               The last line.
            """

    lines = (i.strip() for i in text.splitlines())

    for line in lines:
        for chunk in chunkstring(line, 16):
            print(chunk)
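Applied to the scenario in the question, the same loop could push each chunk to the display and pause between chunks. This is only a sketch: show_lines and its send parameter are hypothetical names, with send standing in for whatever actually writes to the LCD (for example a wrapper around the telnet write call), and the delay is set to 0 here so the demo finishes immediately.

```python
import time


def chunkstring(string, length):
    return (string[i:i + length] for i in range(0, len(string), length))


def show_lines(lines, send, width=16, delay=10):
    """Pass each line to `send` in pieces of at most `width` characters."""
    for line in lines:
        for chunk in chunkstring(line.strip(), width):
            send(chunk)        # hypothetical output callable
            time.sleep(delay)  # wait before showing the next piece


# Demo: collect the chunks in a list instead of writing to a device
sent = []
show_lines(["A short line.", "A very very very very long line."], sent.append, delay=0)
print(sent)
```

Note that an empty line produces no chunks at all, so it would be skipped rather than clearing the display.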
rlms answered Oct 02 '22 14:10


My favorite way to solve this problem is with the re module.

    import re

    def chunkstring(string, length):
        return re.findall('.{%d}' % length, string)

One caveat here is that re.findall will not return a chunk that is less than the length value, so any remainder is skipped.
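If the remainder matters, one possible variation is to use a bounded quantifier so the last, shorter chunk is kept. This is an adaptation rather than the code from this answer, and chunkstring_keep_tail is a name chosen for illustration.

```python
import re


def chunkstring_keep_tail(string, length):
    # {1,n} is greedy, so every chunk is full-size except possibly the last.
    # re.DOTALL lets '.' match newline characters as well.
    return re.findall('.{1,%d}' % length, string, flags=re.DOTALL)


alphabet = "abcdefghijklmnopqrstuvwxyz"
exact_only = re.findall('.{5}', alphabet)          # drops the trailing 'z'
with_tail = chunkstring_keep_tail(alphabet, 5)     # keeps it
print(exact_only)
print(with_tail)
```

Without re.DOTALL, the '.' pattern does not match newlines, so chunks would never span a line break.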

However, if you're parsing fixed-width data, this is a great way to do it.

For example, if I want to parse a block of text that I know is made up of 32-character chunks (like a header section), I find this very readable and see no need to generalize it into a separate function (as in chunkstring):

    for header in re.findall('.{32}', header_data):
        ProcessHeader(header)
carl.anderson answered Oct 02 '22 13:10