import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df['one'])
Output:
a    1.0
b    2.0
c    3.0
d    NaN
Name: one, dtype: float64
Here the values are stored as float64.
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3], index=['a', 'b', 'c'])}
df = pd.DataFrame(d)
print(df['one'])
Output:
a    1
b    2
c    3
Name: one, dtype: int64
But now the values are stored as int64.
The only difference is that in the first example the column contains a NaN.
What rule determines the dtype in these two examples?
Thanks!
NaN is a float, and NumPy's integer dtypes cannot represent it, so when index alignment introduces a NaN into a column, pandas upcasts the whole column from int64 to float64.
This is easy to check:
>>> import numpy as np
>>> type(np.nan)
<class 'float'>
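To see the rule in action, here is a minimal sketch that reproduces the upcast outside the DataFrame constructor by reindexing an integer Series. The variable names (`short`, `aligned`, `nullable`) are just for illustration, and the last step assumes a pandas version that ships the nullable "Int64" extension dtype, which is one possible way to keep integers alongside missing values.

```python
import pandas as pd

# An integer Series with no missing labels keeps its int64 dtype.
short = pd.Series([1, 2, 3], index=["a", "b", "c"])
print(short.dtype)

# Aligning to a longer index leaves 'd' without a value; pandas fills it
# with NaN, and because NaN is a float the whole Series becomes float64.
aligned = short.reindex(["a", "b", "c", "d"])
print(aligned.dtype)

# The DataFrame constructor performs the same alignment per column.
df = pd.DataFrame({
    "one": short,
    "two": pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"]),
})
print(df.dtypes)

# Assumption: the nullable "Int64" extension dtype is available.
# It stores the missing entry as <NA> instead of forcing floats.
nullable = aligned.astype("Int64")
print(nullable)
```

Only column 'one' is upcast in the DataFrame, since column 'two' has a value for every label and never receives a NaN.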
I would recommend this interesting read