I have this DataFrame:
Python 3.9.0 (v3.9.0:9cf6752276, Oct 5 2020, 11:29:23)
[Clang 6.0 (clang-600.0.57)] on darwin
>>> import pandas as pd
>>> import datetime as datetime
>>> pd.__version__
'1.3.5'
>>> dates = [datetime.datetime(2012, 2, 3) , datetime.datetime(2012, 2, 4)]
>>> x = pd.DataFrame({'Time': dates, 'Selected': [0, 0], 'Nr': [123.4, 25.2]})
>>> x.set_index('Time', inplace=True)
>>> x
            Selected     Nr
Time
2012-02-03         0  123.4
2012-02-04         0   25.2
In the example below, an integer value from an integer column comes back as a float, and I do not see the reason for this conversion. In both cases I assume I am picking the value from the 'Selected' column in the first row. What is going on?
>>> x['Selected'].iloc[0]
0
>>> x.iloc[0]['Selected']
0.0
>>> x['Selected'].dtype
dtype('int64')
x.iloc[0] selects a single "row". A new Series object is actually created. When pandas decides on the dtype of that row (a pd.Series), it has to pick one dtype for all of the values, so it uses a floating-point type, since that would not lose information from the "Nr" column.
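As a quick check of the explanation above, here is a minimal sketch that rebuilds the DataFrame from the question and compares the dtype of the extracted row with the dtype of the column. The variable names `row`, `row_dtype`, and `col_dtype` are only illustrative.

```python
import datetime

import pandas as pd

# Rebuild the DataFrame from the question.
dates = [datetime.datetime(2012, 2, 3), datetime.datetime(2012, 2, 4)]
x = pd.DataFrame({'Time': dates, 'Selected': [0, 0], 'Nr': [123.4, 25.2]})
x = x.set_index('Time')

# Selecting a row builds a new Series with one value per column,
# so the int64 and float64 columns must share a single dtype.
row = x.iloc[0]
row_dtype = row.dtype

# Selecting the column first keeps the column's own dtype.
col_dtype = x['Selected'].dtype

print(row_dtype)  # float64
print(col_dtype)  # int64
```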
On the other hand, x['Selected'].iloc[0] first selects a column, which will always preserve the dtype.
pandas is fundamentally "column oriented". You can think of a DataFrame as a dictionary of columns. (It isn't one. I believe it used to have essentially that under the hood, but it now uses a more complex "block manager" approach. These are internal implementation details.)
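If the goal is a single value with its original dtype, one option is to address the row and the column in a single indexing call, so that no intermediate row Series is built. The sketch below shows three ways to do that on the DataFrame from the question. The names `by_label`, `by_position`, and `by_at` are only illustrative, and this is one possible approach rather than the only one.

```python
import datetime

import pandas as pd

# Rebuild the DataFrame from the question.
dates = [datetime.datetime(2012, 2, 3), datetime.datetime(2012, 2, 4)]
x = pd.DataFrame({'Time': dates, 'Selected': [0, 0], 'Nr': [123.4, 25.2]})
x = x.set_index('Time')

first_label = x.index[0]

# Row and column given together: the scalar is read from the column,
# so it is not upcast to float.
by_label = x.loc[first_label, 'Selected']
by_position = x.iloc[0, x.columns.get_loc('Selected')]
by_at = x.at[first_label, 'Selected']

# Chained indexing through the row goes via a float64 Series instead.
via_row = x.iloc[0]['Selected']

print(type(by_label), type(by_position), type(by_at))
print(type(via_row))
```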