The below code is from Jake VanderPlas's book "Python Data Science Handbook":
import numpy as np
import matplotlib.pyplot as plt

rand = np.random.RandomState(42)
X = rand.rand(10, 2)
# squared distance between every pair of points, via broadcasting
dist_sq = np.sum((X[:, np.newaxis, :] - X[np.newaxis, :, :]) ** 2, axis=-1)
K = 2
nearest_partition = np.argpartition(dist_sq, K + 1, axis=1)
plt.scatter(X[:, 0], X[:, 1], s=100)
# draw lines from each point to its two nearest neighbors
for i in range(X.shape[0]):
    for j in nearest_partition[i, :K+1]:
        # plot a line from X[i] to X[j]
        # use some zip magic to make it happen:
        plt.plot(*zip(X[j], X[i]), color='black')
A few questions:
I ran print(*zip(X[j], X[i])) and it seems that the x and y coordinates of each point are unpacked and zipped together, so that there are two tuples. Why is that the right logic here? Also, what is the right way to understand how the plot function interprets each pair of tuples? I've never seen a pair of tuples passed to the plot function this way.
Consider two points, a and b.
a = [1,2]
b = [3,4]
When we zip them we get:
print(list(zip(a, b))) # [(1, 3), (2, 4)]
We can see that the first elements of a and b are paired together, and likewise the second elements. This is just how zip works; I suspect this makes sense to you. If a and b are (x, y) points, then we've just grouped the x's together and the y's together.
Now consider the signature of plt.plot(x, y, ...). It expects the first argument to be all the x's and the second argument to be all the y's. Well, the zip just grouped those for us! We can use the * operator to spread the two tuples over the first two arguments. Notice that these are equivalent operations:
p = list(zip(a, b))
plt.plot(*p)
plt.plot(p[0], p[1])
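To check that equivalence without opening a plot window, here is a minimal sketch. fake_plot is a hypothetical stand-in, not part of matplotlib; it only records the x and y arguments that plt.plot would receive.

```python
def fake_plot(x, y, **kwargs):
    """Hypothetical stand-in for plt.plot: returns the x and y it was given."""
    return tuple(x), tuple(y)

a = [1, 2]  # point a: x=1, y=2
b = [3, 4]  # point b: x=3, y=4

p = list(zip(a, b))               # groups the x's together and the y's together
unpacked = fake_plot(*p)          # star-unpacking spreads p over x and y
explicit = fake_plot(p[0], p[1])  # the same call written out by hand

print(unpacked)  # ((1, 3), (2, 4))
```

Both calls hand over x = (1, 3) and y = (2, 4), which is the pair of endpoints plot needs to draw one line segment from a to b.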
Side note: to expand to more points we just add the extra points into the zip:
a = [1, 2]
b = [3, 4]
c = [5, 6]
print(list(zip(a, b, c))) # [(1, 3, 5), (2, 4, 6)]
plt.plot(*zip(a, b, c)) # plots the 3 points
* inside a function call unpacks a list (or other iterable) into separate positional arguments, the *args kind of argument.
zip with several lists iterates through them pairing up elements:
In [1]: list(zip([1,2,3],[4,5,6]))
Out[1]: [(1, 4), (2, 5), (3, 6)]
If we define a list:
In [2]: alist = [[1,2,3],[4,5,6]]
In [3]: list(zip(alist))
Out[3]: [([1, 2, 3],), ([4, 5, 6],)]
That zip didn't do much. But if we star it:
In [4]: list(zip(*alist))
Out[4]: [(1, 4), (2, 5), (3, 6)]
Check the zip docs - see the *args:
In [5]: zip?
Init signature: zip(self, /, *args, **kwargs)
Docstring:
zip(*iterables) --> A zip object yielding tuples until an input is exhausted.
>>> list(zip('abcdefg', range(3), range(4)))
[('a', 0, 0), ('b', 1, 1), ('c', 2, 2)]
The zip object yields n-length tuples, where n is the number of iterables
passed as positional arguments to zip(). The i-th element in every tuple
comes from the i-th iterable argument to zip(). This continues until the
shortest argument is exhausted.
Type: type
Subclasses:
* could also be used with a function like def foo(arg1, arg2, arg3):...
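As a small illustration of that last point, the sketch below defines a three-parameter foo (the body is made up for the example) and calls it once with star-unpacking and once with the arguments written out.

```python
def foo(arg1, arg2, arg3):
    # Illustrative body only: join the three arguments into one string.
    return f"{arg1}-{arg2}-{arg3}"

args = [1, 2, 3]

result_star = foo(*args)                       # each list element fills one parameter
result_plain = foo(args[0], args[1], args[2])  # same call, spelled out

print(result_star)  # 1-2-3
```

The list must have exactly as many elements as foo has required parameters; otherwise Python raises a TypeError.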
In plt.plot(*zip(X[j], X[i]), color='black'), plot has a signature like plot(x, y, **kwargs). It might look the same as
plt.plot(X[j], X[i], color='black')
but it is not: without the zip, plot treats X[j] as the x values and X[i] as the y values. A small test function shows the difference in what gets passed:
def foo(x, y):
    print(x, y)
In [11]: X = np.arange(10).reshape(5,2)
In [12]: foo(X[1],X[0])
[2 3] [0 1]
In [13]: foo(*zip(X[1],X[0]))
(2, 0) (3, 1)
list(zip(*alist)) is a list version of a matrix transpose.
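The transpose reading can be checked in plain Python. This sketch compares the zip(*rows) idiom with an index-based transpose; the variable names are just for illustration.

```python
rows = [[1, 2, 3], [4, 5, 6]]

# zip(*rows) yields one tuple per column; convert each tuple back to a list
transposed = [list(col) for col in zip(*rows)]

# the same transpose written with explicit indices, for comparison
by_index = [[row[c] for row in rows] for c in range(len(rows[0]))]

print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```

In the plotting code, X[j] and X[i] play the role of the two rows, so the "columns" that come out are the x pair and the y pair.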