I have a pandas DataFrame named df. With df.dtypes
I can print on screen:
arrival_time object
departure_time object
drop_off_type int64
extra object
pickup_type int64
stop_headsign object
stop_id object
stop_sequence int64
trip_id object
dtype: object
I want to save this information so that I can compare it with other data, type-cast things elsewhere, etc. I want to save it to a local file and recover it in another program, somewhere the data itself can't go. But I can't figure out how. Here are the results of the conversions I tried:
df.dtypes.to_dict()
{'arrival_time': dtype('O'),
'departure_time': dtype('O'),
'drop_off_type': dtype('int64'),
'extra': dtype('O'),
'pickup_type': dtype('int64'),
'stop_headsign': dtype('O'),
'stop_id': dtype('O'),
'stop_sequence': dtype('int64'),
'trip_id': dtype('O')}
----
df.dtypes.to_json()
'{"arrival_time":{"alignment":4,"byteorder":"|","descr":[["","|O"]],"flags":63,"isalignedstruct":false,"isnative":true,"kind":"O","name":"object","ndim":0,"num":17,"str":"|O"},"departure_time":{"alignment":4,"byteorder":"|","descr":[["","|O"]],"flags":63,"isalignedstruct":false,"isnative":true,"kind":"O","name":"object","ndim":0,"num":17,"str":"|O"},"drop_off_type":{"alignment":4,"byteorder":"=","descr":[["","<i8"]],"flags":0,"isalignedstruct":false,"isnative":true,"kind":"i","name":"int64","ndim":0,"num":9,"str":"<i8"},"extra":{"alignment":4,"byteorder":"|","descr":[["","|O"]],"flags":63,"isalignedstruct":false,"isnative":true,"kind":"O","name":"object","ndim":0,"num":17,"str":"|O"},"pickup_type":{"alignment":4,"byteorder":"=","descr":[["","<i8"]],"flags":0,"isalignedstruct":false,"isnative":true,"kind":"i","name":"int64","ndim":0,"num":9,"str":"<i8"},"stop_headsign":{"alignment":4,"byteorder":"|","descr":[["","|O"]],"flags":63,"isalignedstruct":false,"isnative":true,"kind":"O","name":"object","ndim":0,"num":17,"str":"|O"},"stop_id":{"alignment":4,"byteorder":"|","descr":[["","|O"]],"flags":63,"isalignedstruct":false,"isnative":true,"kind":"O","name":"object","ndim":0,"num":17,"str":"|O"},"stop_sequence":{"alignment":4,"byteorder":"=","descr":[["","<i8"]],"flags":0,"isalignedstruct":false,"isnative":true,"kind":"i","name":"int64","ndim":0,"num":9,"str":"<i8"},"trip_id":{"alignment":4,"byteorder":"|","descr":[["","|O"]],"flags":63,"isalignedstruct":false,"isnative":true,"kind":"O","name":"object","ndim":0,"num":17,"str":"|O"}}'
----
json.dumps( df.dtypes.to_dict() )
...
TypeError: dtype('O') is not JSON serializable
----
list(df.dtypes)
[dtype('O'),
dtype('O'),
dtype('int64'),
dtype('O'),
dtype('int64'),
dtype('O'),
dtype('O'),
dtype('int64'),
dtype('O')]
How to save and export/archive dtype information of a pandas DataFrame?
To check the data types in a pandas DataFrame, use the dtypes attribute. It returns a Series whose index holds the DataFrame's column names and whose values are the corresponding data types.
Pandas DataFrame info() Method
The info() method prints information about the DataFrame: the number of columns, column labels, column data types, memory usage, the range index, and the number of non-null values in each column. Note: info() prints this summary and returns None rather than returning the information.
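Because info() prints rather than returns, saving its summary takes one extra step. A minimal sketch, assuming a small made-up DataFrame, that captures the printed text through the buf parameter:

```python
import io

import pandas as pd

# Hypothetical example frame, not the asker's data
df = pd.DataFrame({"A": ["x"], "B": [1.0], "C": [1], "D": [True]})

# info() writes to stdout by default and returns None;
# passing a text buffer via buf= captures the summary instead.
buffer = io.StringIO()
result = df.info(buf=buffer)
info_text = buffer.getvalue()

print(info_text)
```

The captured text is meant for people to read. For dtype information that another program will load, the dtypes Series is usually easier to work with.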
Exporting the DataFrame into a CSV file
The pandas DataFrame.to_csv() method exports the DataFrame in CSV format. If a file path is provided, the output is written to that file; otherwise the CSV text is returned as a string. The sep parameter sets a custom delimiter; the default is a comma.
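As a rough illustration of the two behaviours described above (string return value and custom delimiter), here is a sketch using a tiny made-up frame whose column names are borrowed from the question:

```python
import pandas as pd

# Hypothetical two-row frame for illustration
df = pd.DataFrame({"stop_id": ["S1", "S2"], "stop_sequence": [1, 2]})

# No path given: to_csv returns the CSV text as a string
csv_text = df.to_csv(index=False)

# sep changes the delimiter, here a tab
tsv_text = df.to_csv(index=False, sep="\t")

print(csv_text)
print(tsv_text)
```

CSV stores only text, so column dtypes are inferred again when the file is read back. That is one reason to save the dtypes separately.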
pd.DataFrame.dtypes returns a pd.Series object. This means you can manipulate it as you would any regular Series in pandas:
import pandas as pd
df = pd.DataFrame({'A': [''], 'B': [1.0], 'C': [1], 'D': [True]})
# turn the dtypes Series into a two-column frame: column name, dtype
res = df.dtypes.to_frame('dtypes').reset_index()
print(res)
index dtypes
0 A object
1 B float64
2 C int64
3 D bool
Output to csv / excel / pickle
You can then use any method you normally would to store a DataFrame, such as to_csv, to_excel, to_pickle, etc. Note that pickle is not recommended for distribution, as it is version dependent.
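For instance, the CSV route might look like the sketch below. The file name dtypes.csv and the temporary directory are assumptions made for illustration, and the dtypes are converted to plain strings before writing so that the file holds only text:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"A": [""], "B": [1.0], "C": [1], "D": [True]})

# Two-column frame: column name ("index") and dtype name as a string
res = df.dtypes.astype(str).to_frame("dtypes").reset_index()

# Hypothetical location; any writable path works
path = os.path.join(tempfile.mkdtemp(), "dtypes.csv")
res.to_csv(path, index=False)

# Elsewhere: read the table back and rebuild a name -> dtype mapping
restored = pd.read_csv(path)
dtype_map = dict(zip(restored["index"], restored["dtypes"]))

print(dtype_map)
```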
Output to json
If you wish to easily store and load the dtypes as a dictionary, a popular format is JSON. As you found, you need to convert the dtypes to str first:
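import json

# first create a dictionary mapping column name -> dtype name (as str)
d = res.set_index('index')['dtypes'].astype(str).to_dict()

# write the mapping to a local file
with open('types.json', 'w') as f:
    json.dump(d, f)

# read it back, e.g. in another program
with open('types.json', 'r') as f:
    data_types = json.load(f)

print(data_types)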
{'A': 'object', 'B': 'float64', 'C': 'int64', 'D': 'bool'}
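Once the mapping is loaded, it can be used for the comparison and type-casting mentioned in the question. The following is a sketch only: the second frame named other is invented, and the mapping is written inline as a stand-in for the contents of types.json.

```python
import json

import pandas as pd

# Stand-in for json.load() on the saved types.json
data_types = json.loads(
    '{"A": "object", "B": "float64", "C": "int64", "D": "bool"}'
)

# Hypothetical second frame whose numeric columns arrived as text
other = pd.DataFrame({"A": ["x"], "B": ["1.5"], "C": ["2"], "D": [True]})

# Compare: columns whose current dtype differs from the saved one
mismatches = {
    col: (str(other[col].dtype), wanted)
    for col, wanted in data_types.items()
    if str(other[col].dtype) != wanted
}

# Type-cast: DataFrame.astype accepts a column -> dtype mapping
typed = other.astype(data_types)

print(mismatches)
print(typed.dtypes)
```

Casting text to bool with astype treats any non-empty string as True, so boolean columns stored as text would need their own parsing step first.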