Save Pandas df containing long list as csv file

Tags: python, pandas, csv

I am trying to save a pandas dataframe as a .csv file. Currently my code looks like this:

with open('File.csv', 'a') as f:
    df.to_csv(f, header=False)

Saving works, but the lists in my dataframe are abbreviated to [first,second,...,last] and all the entries in the middle are discarded. When I look at the original dataframe, all entries are there. Is there a way to convert each list to a string that contains all of its elements (str(df) also discards the middle elements), or to save a full numpy array in a single cell of a csv table?

Thank you for your help, Viviane

asked Dec 17 '17 at 18:12 by thebear



2 Answers

I had issues saving dataframes too. I had a dataframe in which some columns held lists as their elements. When I saved the dataframe with df.to_csv and then read it back from disk with pd.read_csv, the lists and arrays came back as strings of characters: [1,2,3] was transformed to '[1,2,3]'. Switching to the HDF5 format solved the problem.
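The string conversion described above is easy to reproduce, and it can be reversed at read time if the data has to stay in CSV. The following is a minimal sketch, assuming the cells are plain Python lists of literals. The column names and sample data are made up for illustration, and ast.literal_eval is applied through the converters parameter of pd.read_csv:

```python
import ast
import io

import pandas as pd

# Hypothetical sample frame with a column of Python lists.
df_temp = pd.DataFrame({"name": ["a", "b"], "scores": [[1, 2, 3], [4, 5]]})

# Write to an in-memory buffer instead of a file on disk.
buf = io.StringIO()
df_temp.to_csv(buf, index=False)
csv_text = buf.getvalue()

# A plain read gives the list column back as strings.
naive = pd.read_csv(io.StringIO(csv_text))

# Parsing each cell with ast.literal_eval restores real lists.
parsed = pd.read_csv(io.StringIO(csv_text), converters={"scores": ast.literal_eval})
```

Here naive["scores"] holds strings while parsed["scores"] holds lists again. This only works for cells whose text is a valid Python literal, so it does not help with abbreviated numpy output.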

If your dataframe is called df_temp, you can save it in HDF5 format (this requires the PyTables package) with:

import pandas as pd

with pd.HDFStore('store.h5') as store:  # the file is closed on exit
    store['df'] = df_temp

and you can read it back with:

with pd.HDFStore('store.h5') as store:
    df_temp_read = store['df']

You can look at this answer. I should also mention that pickle did not work for me, since I lost the column names when reading from the file. Maybe I did something wrong, but apart from that, pickle can cause compatibility issues if you plan to read the file with a different Python version.
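The pickle problem mentioned above cannot be reconstructed from this post. For comparison, here is a minimal sketch of a pickle round trip with DataFrame.to_pickle and pd.read_pickle, assuming the same Python and pandas versions are used for writing and reading. The frame and file name are hypothetical:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame with a column of Python lists.
df_temp = pd.DataFrame({"name": ["a", "b"], "scores": [[1, 2, 3], [4, 5]]})

# Write and read inside a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df_temp.pkl")
    df_temp.to_pickle(path)
    df_temp_read = pd.read_pickle(path)

columns_after = list(df_temp_read.columns)
first_cell = df_temp_read["scores"][0]
```

In this sketch the column names and the list cells come back unchanged, so losing the column names is probably not inherent to pickle itself.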

answered Sep 19 '22 at 05:09 by CrossEntropy


Your code should work properly. I couldn't reproduce the described behavior.

Here is a bit more "pandaic" version:

df.to_csv('File.csv', header=False, mode='a')

PS: note the mode='a' (append) parameter.
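Because mode='a' appends to an existing file, a header row is easy to duplicate or lose. The snippet above avoids the issue by passing header=False. If a header is wanted, one common pattern is to write it only when the file does not exist yet. This is a sketch with hypothetical frames and a temporary path:

```python
import os
import tempfile

import pandas as pd

# Two hypothetical chunks that are appended one after the other.
first = pd.DataFrame({"x": [1, 2]})
second = pd.DataFrame({"x": [3, 4]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "File.csv")
    for chunk in (first, second):
        # Write the header only on the first write, when the file is created.
        chunk.to_csv(path, mode="a", index=False, header=not os.path.exists(path))
    combined = pd.read_csv(path)
```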

UPDATE:

How to get rid of the ellipsis when displaying or printing a DataFrame:

# None removes the column limit; 0 makes pandas auto-detect the terminal width
with pd.option_context("display.max_columns", None):
    print(df)
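The display options only affect how a DataFrame is printed. An ellipsis inside a saved cell may have a different cause: if the cells hold numpy arrays rather than Python lists, each array is written using its string form, and NumPy abbreviates arrays with more than 1000 elements by default. The following is a sketch under that assumption, with made-up column names and data. It converts each array to a JSON string before saving:

```python
import io
import json

import numpy as np
import pandas as pd

# Hypothetical frame whose cells are long numpy arrays.
df = pd.DataFrame({"id": [0, 1],
                   "signal": [np.arange(2000), np.arange(2000) * 2]})

# Saving as-is writes the abbreviated string form of each array.
buf = io.StringIO()
df.to_csv(buf, index=False)
truncated_csv = buf.getvalue()

# Converting each array to a JSON string keeps every element.
fixed = df.copy()
fixed["signal"] = fixed["signal"].apply(lambda a: json.dumps(a.tolist()))
buf = io.StringIO()
fixed.to_csv(buf, index=False)
full_csv = buf.getvalue()

# Reading back: parse the JSON strings into Python lists again.
restored = pd.read_csv(io.StringIO(full_csv))
restored["signal"] = restored["signal"].apply(json.loads)
```

Here truncated_csv contains "..." in place of the middle elements, while full_csv and restored keep all 2000 values per cell.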
answered Sep 23 '22 at 05:09 by MaxU - stop WAR against UA