Python - Split array into multiple arrays

Tags: python, arrays

I have an array containing file names like the ones below:

['001_1.png', '001_2.png', '001_3.png', '002_1.png','002_2.png', '003_1.png', '003_2.png', '003_3.png', '003_4.png', ....]

I want to quickly group these files into multiple arrays like this:

[['001_1.png', '001_2.png', '001_3.png'], ['002_1.png', '002_2.png'], ['003_1.png', '003_2.png', '003_3.png', '003_4.png'], ...]

Could anyone tell me how to do this in a few lines of Python?

Asked by eric2323223 on May 04 '18 at 07:05


2 Answers

If your data is already sorted by the file name, you can use itertools.groupby:

files = ['001_1.png', '001_2.png', '001_3.png', '002_1.png','002_2.png',
        '003_1.png', '003_2.png', '003_3.png']

import itertools

keyfunc = lambda filename: filename[:3]

# this creates an iterator that yields `(group, filenames)` tuples,
# but `filenames` is another iterator
grouper = itertools.groupby(files, keyfunc)

# to get the result as a nested list, we iterate over the grouper,
# discard the group keys and turn the `filenames` iterators into lists
result = [list(filenames) for _, filenames in grouper]

print(result)
# [['001_1.png', '001_2.png', '001_3.png'],
#  ['002_1.png', '002_2.png'],
#  ['003_1.png', '003_2.png', '003_3.png']]
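
The "already sorted" condition matters because groupby only merges adjacent items that share a key. A minimal sketch of what happens otherwise, using a made-up `unsorted_files` list that is not part of the question:

```python
import itertools

# Hypothetical unsorted input: the two '001' files are not adjacent.
unsorted_files = ['001_1.png', '002_1.png', '001_2.png']

# groupby starts a new group every time the key changes,
# so the '001' prefix ends up in two separate groups.
groups = [list(names) for _, names in
          itertools.groupby(unsorted_files, key=lambda name: name[:3])]

print(groups)
# [['001_1.png'], ['002_1.png'], ['001_2.png']]
```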

Otherwise, you can base your code on this recipe, which is more efficient than sorting the list and then using groupby: it makes a single pass over the list instead of paying for an O(n log n) sort.

  • Input: Your input is a flat list, so use a regular ol' loop to iterate over it:

    for filename in files:
    
  • Group identifier: The files are grouped by the first 3 characters of the file name:

    group = filename[:3]
    
  • Output: The output should be a nested list rather than a dict, which can be done with

    result = list(groupdict.values())
    

Putting it together:

files = ['001_1.png', '001_2.png', '001_3.png', '002_1.png','002_2.png',
        '003_1.png', '003_2.png', '003_3.png']

import collections

groupdict = collections.defaultdict(list)
for filename in files:
    group = filename[:3]
    groupdict[group].append(filename)

result = list(groupdict.values())

print(result)
# [['001_1.png', '001_2.png', '001_3.png'],
#  ['002_1.png', '002_2.png'],
#  ['003_1.png', '003_2.png', '003_3.png']]

Read the recipe answer for more details.
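
If the prefix is not always exactly 3 characters long, the same loop can be wrapped in a small function that takes everything before the first underscore as the group identifier. This is only a sketch: `group_by_prefix` and its `sep` parameter are hypothetical names, and the underscore rule is an assumption based on the file names in the question.

```python
import collections

def group_by_prefix(filenames, sep='_'):
    """Group names by the text before the first `sep` (hypothetical helper)."""
    groups = collections.defaultdict(list)
    for name in filenames:
        # partition() returns the whole name as the prefix if `sep` is absent
        prefix, _, _ = name.partition(sep)
        groups[prefix].append(name)
    # dicts keep insertion order (Python 3.7+), so groups appear
    # in the order their prefix was first seen
    return list(groups.values())

files = ['001_1.png', '0002_1.png', '001_2.png', '0002_2.png']
print(group_by_prefix(files))
# [['001_1.png', '001_2.png'], ['0002_1.png', '0002_2.png']]
```

Note that the input does not have to be sorted here, and a name without an underscore simply ends up in a group of its own.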

Answered by Aran-Fey on Sep 20 '22 at 03:09


Something like this should work:

import itertools


mylist = [...]
[list(v) for k,v in itertools.groupby(mylist, key=lambda x: x[:3])]

If the input list isn't sorted, then use something like this:

import itertools


mylist = [...]
keyfunc = lambda x:x[:3]
mylist = sorted(mylist, key=keyfunc)
[list(v) for k,v in itertools.groupby(mylist, key=keyfunc)]
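
One detail of the sort-then-group approach is easy to miss: sorted() is stable, so sorting by the prefix alone keeps the files inside each group in their original relative order. A small sketch with a made-up unsorted list (the sample data is an assumption, not from the question):

```python
import itertools

# Hypothetical unsorted input.
mylist = ['002_2.png', '001_1.png', '002_1.png', '001_2.png']
keyfunc = lambda x: x[:3]

# sorted() is stable: items with equal keys keep their original relative order.
ordered = sorted(mylist, key=keyfunc)
result = [list(v) for _, v in itertools.groupby(ordered, key=keyfunc)]

print(result)
# [['001_1.png', '001_2.png'], ['002_2.png', '002_1.png']]
```

If the files inside each group should be ordered as well, sorting by the full name with sorted(mylist) should do, since the prefix is the leading part of each string.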

Answered by oxyum on Sep 22 '22 at 03:09