
Most Pythonic Way to Split an Array by Repeating Elements

Tags: python

I have a list of items that I want to split on a delimiter. All delimiters should be removed, and the list should be split only where the delimiter occurs twice in a row. For example, if the delimiter is 'X', then the following list:

['a', 'b', 'X', 'X', 'c', 'd', 'X', 'X', 'f', 'X', 'g']

Would turn into:

[['a', 'b'], ['c', 'd'], ['f', 'g']]

Notice that the last group is not split, since the 'X' between 'f' and 'g' occurs only once.

I've written some ugly code that does this, but I'm sure there is something nicer. Extra points if the length of the delimiter run is configurable (i.e. split the list only after seeing N consecutive delimiters).

speedplane asked Dec 16 '11 05:12





2 Answers

I don't think there's going to be a nice, elegant solution to this (I'd love to be proven wrong of course) so I would suggest something straightforward:

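def nSplit(lst, delim, count=2):
    output = [[]]
    delimCount = 0  # length of the current run of consecutive delimiters
    for item in lst:
        if item == delim:
            # Delimiters are never kept; just count the run.
            delimCount += 1
        elif delimCount >= count:
            # The run was long enough: start a new sublist with this item.
            output.append([item])
            delimCount = 0
        else:
            # No run, or a run that was too short: stay in the current sublist.
            output[-1].append(item)
            delimCount = 0
    return output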

 

>>> nSplit(['a', 'b', 'X', 'X', 'c', 'd', 'X', 'X', 'f', 'X', 'g'], 'X', 2)
[['a', 'b'], ['c', 'd'], ['f', 'g']]
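
To get a feel for the count parameter and the edge cases, here is a small sketch. It restates the function above under the hypothetical name n_split so the block runs on its own, and the sample lists are made up for illustration:

```python
def n_split(lst, delim, count=2):
    """Same logic as nSplit above, restated so this block is self-contained."""
    output = [[]]
    delim_count = 0  # length of the current run of consecutive delimiters
    for item in lst:
        if item == delim:
            delim_count += 1
        elif delim_count >= count:
            output.append([item])
            delim_count = 0
        else:
            output[-1].append(item)
            delim_count = 0
    return output


items = ['a', 'X', 'X', 'b', 'X', 'X', 'X', 'c']

# With count=2, both the run of two and the run of three cause a split.
print(n_split(items, 'X', 2))             # [['a'], ['b'], ['c']]

# With count=3, only the run of three is long enough.
print(n_split(items, 'X', 3))             # [['a', 'b'], ['c']]

# Edge cases: delimiter runs at the very start or very end of the list.
print(n_split(['X', 'X', 'a'], 'X', 2))   # [[], ['a']]
print(n_split(['a', 'X', 'X'], 'X', 2))   # [['a']]
```

Note that a leading run of delimiters leaves an empty first sublist, while a trailing run is simply dropped. Whether that is what you want depends on your data.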
cobbal answered Oct 14 '22 02:10



Here's a way to do it with itertools.groupby():

import itertools

class MultiDelimiterKeyCallable(object):
    def __init__(self, delimiter, num_wanted=1):
        self.delimiter = delimiter
        self.num_wanted = num_wanted

        self.num_found = 0

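    def __call__(self, value):
        # Return True only for the delimiter that completes a run of
        # num_wanted delimiters; every other item gets False.
        if value == self.delimiter:
            self.num_found += 1
            if self.num_found >= self.num_wanted:
                self.num_found = 0
                return True
        else:
            self.num_found = 0
        return False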

def split_multi_delimiter(items, delimiter, num_wanted):
    keyfunc = MultiDelimiterKeyCallable(delimiter, num_wanted)

    return (list(item
                 for item in group
                 if item != delimiter)
            for key, group in itertools.groupby(items, keyfunc)
            if not key)

items = ['a', 'b', 'X', 'X', 'c', 'd', 'X', 'X', 'f', 'X', 'g']

print(list(split_multi_delimiter(items, "X", 2)))

I must say that cobbal's solution is much simpler for the same results.
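
As an aside, groupby() can also be used without a stateful key object, by grouping on whether each item is the delimiter and then checking the length of each delimiter run. This is only a sketch of that alternative, not the approach shown above; the name split_on_runs is made up for illustration:

```python
from itertools import groupby


def split_on_runs(items, delimiter, num_wanted=2):
    """Split items wherever at least num_wanted delimiters occur in a row.

    Shorter delimiter runs are dropped without causing a split.
    """
    result = [[]]
    # groupby() yields alternating runs of delimiters and non-delimiters.
    for is_delim, run in groupby(items, key=lambda item: item == delimiter):
        run = list(run)
        if not is_delim:
            # Ordinary items extend the current sublist.
            result[-1].extend(run)
        elif len(run) >= num_wanted:
            # A long enough delimiter run starts a new sublist.
            result.append([])
        # Otherwise the run is too short: discard it and keep going.
    return result


items = ['a', 'b', 'X', 'X', 'c', 'd', 'X', 'X', 'f', 'X', 'g']
print(split_on_runs(items, 'X', 2))  # [['a', 'b'], ['c', 'd'], ['f', 'g']]
```

One difference in behaviour to be aware of: in this sketch a long enough run of delimiters at the very end of the list leaves an empty trailing sublist, which cobbal's function does not produce.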

Michael Hoffman answered Oct 14 '22 01:10
