Write comments in CSV file with pandas

Tags: python, pandas, export-to-csv

I would like to write some comments in my CSV file created with pandas. I haven't found any option for this in DataFrame.to_csv (even though read_csv can skip comments), nor in the standard csv module. I can open the file, write the comments (lines starting with #) and then pass it to to_csv. Does anybody have a better option?

455

asked Mar 24 '15 13:03

Mathieu Dubois

2 Answers

df.to_csv accepts a file object, so you can open a file in append ('a') mode, write your comments, and then pass the open file to the data frame's to_csv function.

For example:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})

    # Open the file in append mode, write the comment lines,
    # then let pandas write the data through the same file object
    f = open('foo', 'a')
    f.write('# My awesome comment\n')
    f.write('# Here is another one\n')
    df.to_csv(f)
    f.close()

The file foo then contains:

    # My awesome comment
    # Here is another one
    ,a,b
    0,1,1
    1,2,2
    2,3,3
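A variation on the example above, as a minimal sketch rather than a definitive recipe: it uses a with block and write mode ('w'), so re-running it does not keep appending to an existing file, and then reads the file back with read_csv(comment='#') to check that the comment lines are skipped. The helper name write_csv_with_comments and the temporary file path are made up for illustration; they are not part of pandas.

```python
import os
import tempfile

import pandas as pd


def write_csv_with_comments(df, path, comments):
    """Hypothetical helper: write '#'-prefixed comment lines, then the CSV data."""
    # newline='' leaves line endings to pandas, as its docs suggest for file objects
    with open(path, 'w', newline='') as f:
        for line in comments:
            f.write(f'# {line}\n')
        df.to_csv(f)


df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
path = os.path.join(tempfile.mkdtemp(), 'foo.csv')  # assumed throwaway location
write_csv_with_comments(df, path, ['My awesome comment', 'Here is another one'])

# comment='#' makes read_csv ignore the comment lines at the top of the file
round_trip = pd.read_csv(path, comment='#', index_col=0)
print(round_trip)
```

Note that read_csv treats '#' as a comment marker anywhere in a line, so this round trip assumes the data itself contains no '#' characters.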

125

answered Sep 19 '22 02:09

Vor

An alternative approach to @Vor's solution is to first write the comment to a file, and then use mode='a' with to_csv() to add the content of the data frame to the same file. According to my benchmarks (below), this takes about as long as opening the file in append mode, adding the comment and then passing the file handle to pandas (as per @Vor's answer). The similar timings make sense considering that this is what pandas is doing internally (DataFrame.to_csv() calls CSVFormatter.save(), which uses _get_handles() to open the file via open()).

On a separate note, it is convenient to work with file IO via a with statement, which ensures that opened files are closed as soon as you leave the with block. See the examples in the benchmarks below.

Read in test data

    import pandas as pd

    # Read in the iris data frame from the seaborn GitHub location
    iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')

    # Create a bigger data frame by doubling it until it has at least 100,000 rows
    # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
    while iris.shape[0] < 100000:
        iris = pd.concat([iris, iris])
    # `iris.shape` is now (153600, 5)

1. Append with the same file handle

    %%timeit -n 5 -r 5

    # Open a file in append mode to add the comment
    # Then pass the file handle to pandas
    with open('test1.csv', 'a') as f:
        f.write('# This is my comment\n')
        iris.to_csv(f)

972 ms ± 31.9 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)

2. Reopen the file with `to_csv(mode='a')`

    %%timeit -n 5 -r 5

    # Open a file in write mode to add the comment
    # Then close the file and reopen it with pandas in append mode
    with open('test2.csv', 'w') as f:
        f.write('# This is my comment\n')
    iris.to_csv('test2.csv', mode='a')

949 ms ± 19.3 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
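Beyond timing, it may be worth checking that the two approaches above write the same bytes. The following is a small sketch under a few assumptions: it uses a tiny made-up data frame instead of the enlarged iris data, fresh files in a temporary directory (so append mode starts from an empty file), and newline='' on every open() call so that line endings are not translated.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})  # small stand-in data
tmp_dir = tempfile.mkdtemp()
path1 = os.path.join(tmp_dir, 'test1.csv')
path2 = os.path.join(tmp_dir, 'test2.csv')

# Approach 1: write the comment and the data through the same file handle
with open(path1, 'a', newline='') as f:
    f.write('# This is my comment\n')
    df.to_csv(f)

# Approach 2: write the comment, close the file,
# then let pandas reopen the file itself in append mode
with open(path2, 'w', newline='') as f:
    f.write('# This is my comment\n')
df.to_csv(path2, mode='a')

# Compare the raw contents of both files
with open(path1, newline='') as f1, open(path2, newline='') as f2:
    same = f1.read() == f2.read()
print(same)
```

If the contents match, the choice between the two comes down to style: the first keeps everything inside one with block, while the second lets pandas manage the file for the data part.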

answered Sep 18 '22 02:09

joelostblom
