
How to change dtype of one column in DataFrame?

I want to change the dtype of one DataFrame column (from datetime64 to object).

First, I create the DataFrame:

Python 2.6.8 (unknown, Jan 26 2013, 14:35:25) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> values = pd.Series(i for i in range(5))
>>> dates = pd.date_range('20130101',periods=5)
>>> df = pd.DataFrame({'values': values, 'dates': dates})
>>> df
/usr/local/lib/python2.6/dist-packages/pandas/core/config.py:570: DeprecationWarning: height has been deprecated.

  warnings.warn(d.msg, DeprecationWarning)
                dates  values
0 2013-01-01 00:00:00       0
1 2013-01-02 00:00:00       1
2 2013-01-03 00:00:00       2
3 2013-01-04 00:00:00       3
4 2013-01-05 00:00:00       4

It has two columns: one has the datetime64 dtype and the other has int64:

>>> df.dtypes
dates     datetime64[ns]
values             int64
dtype: object

In the pandas documentation I found how to convert a Series to another dtype with astype(). It looks like what I need:

>>> df['dates'].astype(object)
0    2013-01-01 00:00:00
1    2013-01-02 00:00:00
2    2013-01-03 00:00:00
3    2013-01-04 00:00:00
4    2013-01-05 00:00:00
Name: dates, dtype: object

But when I assign this Series back as a DataFrame column, the column has the datetime64 dtype again:

>>> df['dates'] = df['dates'].astype(object)
>>> df.dtypes
dates     datetime64[ns]
values             int64
dtype: object

Please help. How can I convert a DataFrame column to the object dtype? Thanks.

ghostishev asked Oct 18 '13 11:10



1 Answer

If you really want to change the column's dtype from datetime64[ns] to object, you can convert each value to a string, for example:

df['dates'] = df['dates'].apply(str)  # convert each Timestamp to its string form
print(df.dtypes)  # verify: dates should no longer be listed as datetime64[ns]
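For completeness, here is a minimal self-contained sketch of the same approach, rebuilding the frame from the question. Note that after this conversion the column holds plain Python strings rather than Timestamp objects, so date arithmetic no longer works on it. The variable name first is only illustrative, and the exact dtype label pandas reports for a string column may depend on the pandas version, so the check below looks at the element type instead of assuming a label.

```python
import pandas as pd

# Rebuild the frame from the question: one integer column, one datetime64 column.
df = pd.DataFrame({
    "values": pd.Series(range(5)),
    "dates": pd.date_range("20130101", periods=5),
})

# Convert each Timestamp to its string form, as suggested in the answer.
df["dates"] = df["dates"].apply(str)

# Inspect the result: the elements are strings, and the column is no longer datetime64.
first = df["dates"].iloc[0]
print(type(first).__name__)
print(pd.api.types.is_datetime64_any_dtype(df["dates"]))
print(df.dtypes)
```

If the values are needed as dates again later, pd.to_datetime(df["dates"]) should convert the strings back.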
Will answered Sep 21 '22 12:09