Pandas DataFrame, default data type for 1, 2, 3, and NaN values

Question

import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df['one'])

Output:

a    1.0
b    2.0
c    3.0
d    NaN
Name: one, dtype: float64

Here the values are stored as float64.

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3], index=['a', 'b', 'c'])}
df = pd.DataFrame(d)
print(df['one'])

Output:

a    1
b    2
c    3
Name: one, dtype: int64

But now the values are stored as int64.

The only difference is that in the first example the column contains a NaN.

What rule does pandas use to choose the data type in the examples above?

Thanks!

rafaelc · Accepted Answer

NaN is a float, and NumPy's int64 dtype has no way to represent a missing value, so pandas stores the whole column as float64 and the ints become floats too.

This is easy to check:

>>> import numpy as np
>>> type(np.nan)
float
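To see the same thing on the DataFrame from the question, here is a minimal sketch. The last step is an assumption on my part and not something the question asked for: it relies on the nullable 'Int64' extension dtype (capital I), which as far as I know is available from pandas 0.24 onward, and it only works here because every non-missing value is a whole number.

```python
import pandas as pd

# Same data as in the question: 'one' has no entry for index 'd'.
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# The columns are aligned on the union of both indexes, so 'one' gets a NaN
# at 'd' and is stored as float64; 'two' has no gap and stays int64.
one_dtype = df['one'].dtype
two_dtype = df['two'].dtype
print(one_dtype, two_dtype)

# Assumption: pandas >= 0.24. The nullable 'Int64' dtype can hold integers
# together with a missing-value marker, so the column no longer has to be float.
one_nullable = df['one'].astype('Int64')
print(one_nullable.dtype)
```

If the floats are fine for your use case, nothing needs to change; the conversion is only worth it when you need the column to keep behaving like integers.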

I would recommend this interesting read

Tags: python, pandas, dataframe

searain