In R (thanks to magrittr) you can now perform operations with a more functional piping syntax via %>%. This means that instead of coding this:
    > as.Date("2014-01-01")
    > as.character(sqrt(12)^2)
You could also do this:
    > "2014-01-01" %>% as.Date
    > 12 %>% sqrt %>% .^2 %>% as.character
To me this is more readable, and it extends to use cases beyond the data frame. Does the Python language have support for something similar?
%>% is called the forward pipe operator in R. It forwards a value, or the result of an expression, into the next function call or expression, which gives you a mechanism for chaining commands. It is defined by the package magrittr (CRAN) and is heavily used by dplyr (CRAN).
Pipe is a Python library that enables you to use pipes in Python. A pipe ( | ) passes the results of one method to another method. I like Pipe because it makes my code look cleaner when applying multiple methods to a Python iterable. Since Pipe only provides a few methods, it is also very easy to learn Pipe.
The pipe operator, written as %>% , has been a longstanding feature of the magrittr package for R. It takes the output of one function and passes it into another function as an argument. This allows us to link a sequence of analysis steps.
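To make the "output of one function becomes the argument of the next" idea concrete in Python, here is a minimal sketch using only the standard library. The helper name `pipe` is made up for illustration; it is not a built-in and not part of any package mentioned here.

```python
from functools import reduce
from math import sqrt


def pipe(value, *functions):
    """Feed `value` through `functions` from left to right (hypothetical helper)."""
    # reduce calls each function on the result of the previous one.
    return reduce(lambda acc, func: func(acc), functions, value)


# Rough counterpart of the R chain: 12 %>% sqrt %>% .^2 %>% as.character
result = pipe(12, sqrt, lambda x: x ** 2, str)
print(result)
```

Each step has to be a one-argument callable, so extra arguments need a lambda or `functools.partial`.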
Pipe is a beautiful package that takes Python's ability to handle data to the next level. It takes a SQL-like declarative approach to manipulate elements in a collection. It could filter, transform, sort, remove duplicates, perform group by operations, and a lot more without needing to write a gazillion lines of code.
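The `|` syntax can be supported in plain Python through operator overloading. The following is a rough sketch of that mechanism only, not the Pipe library's actual implementation; the names `Pipeable`, `keep_if` and `map_each` are hypothetical.

```python
class Pipeable:
    """Wrap a function so that `value | Pipeable(func)` calls `func(value)`."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, value):
        # Python falls back to __ror__ when the left operand does not
        # know how to handle `|` with this object on the right.
        return self.func(value)


def keep_if(predicate):
    # Hypothetical filter step: keeps the items for which predicate is true.
    return Pipeable(lambda items: [x for x in items if predicate(x)])


def map_each(transform):
    # Hypothetical transform step: applies transform to every item.
    return Pipeable(lambda items: [transform(x) for x in items])


evens_squared = [1, 2, 3, 4] | keep_if(lambda x: x % 2 == 0) | map_each(lambda x: x * x)
print(evens_squared)  # [4, 16]
```

The cost of this approach is that every function used in a chain has to be wrapped in such an object first.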
Pandas has supported pipes through the .pipe method since version 0.16.2.
Example:
    import pandas as pd
    from sklearn.datasets import load_iris

    x = load_iris()
    x = pd.DataFrame(x.data, columns=x.feature_names)

    def remove_units(df):
        # Strip the " (cm)" suffix from every column name.
        df.columns = pd.Index(map(lambda x: x.replace(" (cm)", ""), df.columns))
        return df

    def length_times_width(df):
        # Adds the product columns to df in place; nothing is returned.
        df['sepal length*width'] = df['sepal length'] * df['sepal width']
        df['petal length*width'] = df['petal length'] * df['petal width']

    x.pipe(remove_units).pipe(length_times_width)
    x
NB: The Pandas version retains Python's reference semantics. That's why length_times_width doesn't need a return value; it modifies x in place.
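If you would rather not rely on in-place modification, each step can return a new DataFrame instead. This is a small sketch of that variant, assuming pandas is installed. The sample values are made up, and the function names `strip_units` and `add_product` are hypothetical.

```python
import pandas as pd

# Made-up sample values, only for illustration.
df = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9],
    "sepal width (cm)": [3.5, 3.0],
})


def strip_units(frame):
    # rename returns a new DataFrame and leaves `frame` untouched.
    return frame.rename(columns=lambda name: name.replace(" (cm)", ""))


def add_product(frame, left, right, name):
    # assign also returns a new DataFrame with the extra column.
    return frame.assign(**{name: frame[left] * frame[right]})


# Extra positional and keyword arguments given to .pipe are passed on to the function.
result = (
    df.pipe(strip_units)
      .pipe(add_product, "sepal length", "sepal width", name="sepal length*width")
)
print(result.columns.tolist())
print(df.columns.tolist())  # the original frame keeps its old column names
```

Because every function returns its result, any step can sit in the middle of a chain, not only at the end.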
One possible way of doing this is by using a module called macropy. Macropy allows you to apply transformations to the code that you have written. Thus a | b can be transformed to b(a). This has a number of advantages and disadvantages.
In comparison to the solution mentioned by Sylvain Leroux, the main advantage is that you do not need to create infix objects for the functions you are interested in using; you just mark the areas of code where you intend to use the transformation. Secondly, since the transformation is applied at compile time rather than at runtime, the transformed code suffers no runtime overhead: all the work is done when the byte code is first produced from the source code.
The main disadvantage is that macropy has to be activated in a particular way for it to work (shown later). The price of the faster runtime is that parsing the source code is more computationally complex, so the program will take longer to start. Finally, it adds a syntactic style that programmers who are not familiar with macropy may find harder to understand.
run.py
    import macropy.activate
    # Activates macropy, modules using macropy cannot be imported before this statement
    # in the program.
    import target
    # import the module using macropy
target.py
    from fpipe import macros, fpipe
    from macropy.quick_lambda import macros, f
    # The `from module import macros, ...` must be used for macropy to know which
    # macros it should apply to your code.
    # Here two macros have been imported `fpipe`, which does what you want
    # and `f` which provides a quicker way to write lambdas.

    from math import sqrt

    # Using the fpipe macro in a single expression.
    # The code between the square braces is interpreted as - str(sqrt(12))
    print fpipe[12 | sqrt | str] # prints 3.46410161514

    # using a decorator
    # All code within the function is examined for `x | y` constructs.
    x = 1 # global variable
    @fpipe
    def sum_range_then_square():
        "expected value (1 + 2 + 3)**2 -> 36"
        y = 4 # local variable
        return range(x, y) | sum | f[_**2]
        # `f[_**2]` is macropy syntax for -- `lambda x: x**2`, which would also work here

    print sum_range_then_square() # prints 36

    # using a with block.
    # same as a decorator, but for limited blocks.
    with fpipe:
        print range(4) | sum # prints 6
        print 'a b c' | f[_.split()] # prints ['a', 'b', 'c']
And finally the module that does the hard work. I've called it fpipe for functional pipe, as it's emulating shell syntax for passing output from one process to another.
fpipe.py
    from macropy.core.macros import *
    from macropy.core.quotes import macros, q, ast

    macros = Macros()

    @macros.decorator
    @macros.block
    @macros.expr
    def fpipe(tree, **kw):

        @Walker
        def pipe_search(tree, stop, **kw):
            """Search code for bitwise or operators and transform `a | b` to `b(a)`."""
            if isinstance(tree, BinOp) and isinstance(tree.op, BitOr):
                operand = tree.left
                function = tree.right
                newtree = q[ast[function](ast[operand])]
                return newtree

        return pipe_search.recurse(tree)
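For readers without macropy, the same `a | b` to `b(a)` rewrite can be sketched with the standard library's `ast` module. This is a swapped-in technique, not how macropy itself works: the rewrite runs when `eval_piped` is called rather than at import time. The names `PipeRewriter` and `eval_piped` are hypothetical.

```python
import ast


class PipeRewriter(ast.NodeTransformer):
    """Rewrite every `a | b` in a syntax tree to `b(a)`."""

    def visit_BinOp(self, node):
        # Transform the children first so that a chain such as
        # `a | b | c` ends up as `c(b(a))`.
        self.generic_visit(node)
        if isinstance(node.op, ast.BitOr):
            call = ast.Call(func=node.right, args=[node.left], keywords=[])
            return ast.copy_location(call, node)
        return node


def eval_piped(source, namespace=None):
    """Evaluate a single expression after rewriting its `|` operators."""
    tree = ast.parse(source, mode="eval")
    tree = ast.fix_missing_locations(PipeRewriter().visit(tree))
    return eval(compile(tree, "<piped>", "eval"), namespace or {})


print(eval_piped("range(4) | sum"))  # 6
print(eval_piped("range(1, 4) | sum | square", {"square": lambda x: x ** 2}))  # 36
```

As with the macro, every `|` in the rewritten expression is treated as a pipe, so a genuine bitwise or can no longer be used inside it.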