import numpy as np
import pandas as pd
Consider the NumPy array a:
a = np.array([None, None], dtype=object)
print(a)
[None None]
And the DataFrame dfa built from it:
dfa = pd.DataFrame(a)
print(dfa)
      0
0  None
1  None
Now consider the NumPy array b:
b = np.empty_like(a)
print(b)
[None None]
It appears to be the same as a:
(a == b).all()
True
dfb = pd.DataFrame(b) # Fine so far
print(dfb.values)
[[None]
 [None]]
However
print(dfb) # BOOM!!!
NumPy's empty_like() function creates a new array with the same shape and type as a given array, without initializing its entries. The shape and data type of the prototype define these same attributes of the returned array. The optional dtype argument overrides the data type of the result, and the optional order argument overrides its memory layout.
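As a minimal sketch of the behaviour described above (the array name proto and its contents are made up for illustration), the following shows the default result alongside the dtype and order overrides:

```python
import numpy as np

# Hypothetical prototype array: 2x3, float64, C-ordered.
proto = np.arange(6, dtype=np.float64).reshape(2, 3)

# Same shape and dtype as the prototype; the contents are uninitialized.
same = np.empty_like(proto)

# dtype= overrides the data type of the result.
as_int = np.empty_like(proto, dtype=np.int32)

# order= overrides the memory layout of the result ("F" = column-major).
as_fortran = np.empty_like(proto, order="F")

print(same.shape, same.dtype)
print(as_int.dtype)
print(as_fortran.flags["F_CONTIGUOUS"])
```

Because the entries are not initialized, the values in same should not be relied on until they have been assigned.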
NumPy is the more memory-efficient of the two. As a rough rule of thumb, pandas tends to perform better when the number of rows is around 500K or more, while NumPy tends to perform better at around 50K rows or fewer. Indexing a pandas Series is also much slower than indexing a NumPy array.
pandas provides a number of C- or Cython-optimized functions that can be faster than the equivalent NumPy function (e.g. reading data from text files). For mathematical operations such as a dot product or a mean, however, pandas DataFrames are generally slower than NumPy arrays.
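One common way to act on this, sketched here with a made-up two-column DataFrame, is to pull the underlying data out as plain arrays before doing the numeric work. This is an illustration of the pattern rather than a benchmark:

```python
import numpy as np
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})

# Extract the columns as plain ndarrays.
x = df["x"].to_numpy()
y = df["y"].to_numpy()

dot = np.dot(x, y)   # 1*4 + 2*5 + 3*6
mean_x = x.mean()

print(dot, mean_x)
```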
pandas expands on NumPy by providing easy-to-use data analysis methods that operate on the DataFrame and Series classes, which are built on NumPy's powerful ndarray class.
As reported here, this is a bug, which is fixed in the master branch of pandas / the upcoming 0.19.0 release.
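On a pandas version without the fix, one possible way to sidestep the problem is to avoid the uninitialized contents of empty_like() and fill the object array explicitly. This is a sketch under the assumption that the uninitialized entries are what triggers the failure, which the report above does not spell out:

```python
import numpy as np
import pandas as pd

a = np.array([None, None], dtype=object)

# Assumption: explicitly filling with None avoids depending on whatever
# np.empty_like leaves in an object array.
b = np.full(a.shape, None, dtype=object)

dfb = pd.DataFrame(b)
text = repr(dfb)
print(text)
```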