I guess it's an academic question, but the second result does not make sense to me. Shouldn't it be as thoroughly empty as the first? What is the rationale for this behavior?
from itertools import product
one_empty = [ [1,2], [] ]
all_empty = []
print [ t for t in product(*one_empty) ] # []
print [ t for t in product(*all_empty) ] # [()]
Updates
Thanks for all of the answers -- very informative.
Wikipedia's discussion of the Nullary Cartesian Product provides a definitive statement:
The Cartesian product of no sets ... is the singleton set containing the empty tuple.
And here is some code you can use to work through the insightful answer from sth:
from itertools import product
def tproduct(*xss):
return ( sum(rs, ()) for rs in product(*xss) )
def tup(x):
return (x,)
xs = [ [1, 2], [3, 4, 5] ]
ys = [ ['a', 'b'], ['c', 'd', 'e'] ]
txs = [ map(tup, x) for x in xs ] # [[(1,), (2,)], [(3,), (4,), (5,)]]
tys = [ map(tup, y) for y in ys ] # [[('a',), ('b',)], [('c',), ('d',), ('e',)]]
a = [ p for p in tproduct( *(txs + tys) ) ]
b = [ p for p in tproduct( tproduct(*txs), tproduct(*tys) ) ]
assert a == b
From a mathematical point of view the product over no elements should yield the neutral element of the operation product, whatever that is.
For example on integers the neutral element of multiplication is 1, since 1 ⋅ a = a for all integers a. So an empty product of integers should be 1. When implementing a python function that returns the product of a list of numbers, this happens naturally:
def iproduct(lst):
result = 1
for i in lst:
result *= i
return result
For the correct result to be calculated with this algorithm, result
needs to be initialized with 1
. This leads to a return value of 1
when the function is called on an empty list.
This return value is also very reasonable for the purpose of the function. With a good product function it shouldn't matter if you first concat two lists and then build the product of the elements, or if you first build the product of both individual lists and then multiply the results:
iproduct(xs + ys) == iproduct(xs) * iproduct(ys)
If xs
or ys
is empty that only works if iproduct([]) == 1
.
Now the more complicated product()
on iterators. Here also, from a mathematical point of view, product([])
should return the neutral element of that operation, whatever that is. It is not []
since product([], xs) == []
, while for the neutral elements product([], xs) == xs
should hold. It turns out, though, that [()]
also isn't a neutral element:
>>> list(product([()], [1,2,3]))
[((), 1), ((), 2), ((), 3)]
In fact, product()
is not really a very nice mathematical product at all, since this above equation doesn't hold:
product(*(xs + ys)) != product(product(*xs), product(*ys))
Each application of product generates an additional layer of tuples and there is no way around that, so there can't even be a real neutral element. [()]
comes pretty close though, it doesn't add or remove any elements, it just adds an empty tuple to each.
[()]
would in fact be the neutral element of this slightly adapted product function that only operates on lists of tuples, but doesn't add additional tuple layers on each application:
def tproduct(*xss):
# the parameters have to be lists of tuples
return (sum(rs, ()) for rs in product(*xss))
For this function the above product equation holds:
def tup(x): return (x,)
txs = [map(tup, x) for x in xs]
tys = [map(tup, y) for y in ys]
tproduct(*(txs + tys)) == tproduct(tproduct(*txs), tproduct(*tys))
With the additional preprocessing step of packing the input lists into tuples, tproduct()
gives the same result as product()
, but behaves nicer from a mathematical point of view. Also its neutral element is [()]
,
So [()]
makes some sense as the neutral element of this kind of list multiplication. Even if it doesn't exactly fit product()
it is a good choice for this function since it for example allows to define tproduct()
without the need to introduce a special case for empty input.
As @sth already indicated, this behaviour is correct from a mathematical viewpoint. All you really need to convince yourself of is that list(itertools.product())
should have exactly one element, since once you know that it's clear what that element should be: it's got to be (for consistency) a tuple of length 0, and there's only one of those.
But the number of elements of itertools.product(l1, l2, l3, ...)
should just be the product of the lengths of l1
, l2
, l3
, ... . So the number of elements of itertools.product()
should be the size of the empty product, and there's no shortage of internet sources that should persuade you that the empty product is 1.
I just wanted to point out that this is the correct practical definition as well as the correct mathematical one; that is, it's the definition that's most likely to 'just work' in boundary cases. For an example, suppose that you want to generate all strings of length n
consisting of decimal digits, with the first digit nonzero. You might do something like:
import itertools
def decimal_strings(n):
"""Generate all digit strings of length n that don't start with 0."""
for lead_digit in '123456789':
for tail in itertools.product('0123456789', repeat=n-1):
yield lead_digit + ''.join(tail)
What should this produce when n = 1
? Well, in that case, you end up calling itertools.product
with an empty product (repeat = 0
). If it returned nothing, then the body of the inner for
loop above would never be executed, so decimal_strings(1)
would be an empty iterator; almost certainly not what you want. But since itertools.product('0123456789', repeat=0)
returns a single tuple, you get the expected result:
>>> list(decimal_strings(1))
['1', '2', '3', '4', '5', '6', '7', '8', '9']
(When n = 0
, of course, this function correctly raises a ValueError.)
So in short, the definition is mathematically sound, and more often that not it's also what you want. It's definitely not a Python bug!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With