Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

process a sequence in a java.util.stream manner in python

Does someone know how I would write a sequence processing in a java stream API manner in Python ? The idea is to write the operations in the order they will happen:

myList.stream()
    .filter(condition)
    .map(action1)
    .map(action2)
    .collect(Collectors.toList());

Now in python I could do

[action2(action1(item)) for item in my_list if condition(item)]

But that is the opposite order.

How could I have something in the correct order ? Obviously I could use variables but then I would have to find a name for each partial result.

like image 624
Adrien H Avatar asked Mar 12 '26 19:03

Adrien H


1 Answers

There's a library that already does exactly what you are looking for, i.e. lazy evaluation and the order of operations is the same with how it's written, as well as many more other good stuff like multiprocess or multithreading Map/Reduce. It's named pyxtension, it's prod ready, covered by unit-tests and maintained on PyPi, and it's released under MIT license - so you can use it for free in any commercial project. Your code would be rewritten in this form:

from pyxtension.streams import stream

stream(myList)
    .filter(condition)
    .map(action1)
    .map(action2)
    .toList()

and

stream(myList)
    .filter(condition)
    .mpmap(action1)   # this is for multi-processing map
    .fastmap(action2) # this is multi-threaded map
    .toList()

Note that the last statement toList() does exact what you expect - it collects data as it would happen in a Spark RDD.

like image 149
asu Avatar answered Mar 14 '26 07:03

asu



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!