I am creating a <code>dataframe</code> from a CSV file. I have just started with Pandas and have gone through the docs and multiple SO posts, but I still don't get it. The CSV file has multiple columns with the same name, say <code>a</code>. After building the <code>dataframe</code>, which value does <code>df['a']</code> return? It does not return all of the values. Also, only one of those columns will hold a string; the rest will be <code>None</code>. How can I get that column?


Multiple columns with the same name in Pandas



asked Oct 11 '16 21:10

vks

1 Answer

The relevant parameter is mangle_dupe_cols.

From the docs:

mangle_dupe_cols : boolean, default True
    Duplicate columns will be specified as 'X.0'...'X.N', rather than 'X'...'X'

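By default, your duplicate 'a' columns are renamed so that each name is unique. In practice pandas keeps the first one as 'a' and names the rest 'a.1'...'a.N' (the docstring quoted above writes this as 'X.0'...'X.N').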
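To see the renaming concretely, here is a minimal sketch. The CSV text is made up for illustration, and it assumes Python 3 (hence `io.StringIO`) and the default duplicate-name handling of `read_csv`.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV with three columns that all share the header 'a'
txt = """a,a,a,b
1,2,3,4
5,6,7,8"""

df = pd.read_csv(StringIO(txt))

# Duplicate header names are made unique when the file is read
column_names = list(df.columns)
print(column_names)

# After the renaming, df['a'] refers only to the first of the original 'a' columns
first_a = df['a']
print(first_a.tolist())
```

This is why `df['a']` does not return all of the values: the other two columns now live under different names.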

If you pass mangle_dupe_cols=False, importing this CSV raises an error.

You can get all of your 'a' columns with

df.filter(like='a')
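One caveat: `like='a'` is a substring match, so it also picks up any other column whose name merely contains an "a". If that matters for your file, an anchored `regex` is one possible alternative. The sketch below assumes the default renaming scheme ('a', 'a.1', ...); the extra column `alpha` is hypothetical and only there to show the difference.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV: two 'a' columns plus an unrelated column that contains an "a"
txt = """a,a,alpha,b
1,2,3,4
5,6,7,8"""

df = pd.read_csv(StringIO(txt))

# Substring match: keeps every column whose name contains 'a'
loose = df.filter(like='a')

# Anchored pattern: keeps only 'a' and its renamed duplicates 'a.N'
strict = df.filter(regex=r'^a(\.\d+)?$')

print(list(loose.columns))
print(list(strict.columns))
```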

Demonstration:

from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
import pandas as pd

txt = """a, a, a, b, c, d
1, 2, 3, 4, 5, 6
7, 8, 9, 10, 11, 12"""

df = pd.read_csv(StringIO(txt), skipinitialspace=True)
df


df.filter(like='a')
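If, as in the question, only one of the duplicate columns holds a value in each row, one possible way to collapse them into a single column is to back-fill across the columns and keep the first. This is only a sketch under assumptions: the data is made up, and the missing entries are assumed to be read as NaN (here they are empty fields).

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV: each row has a string in exactly one of the 'a' columns
txt = """a,a,a,b
x,,,1
,y,,2
,,z,3"""

df = pd.read_csv(StringIO(txt))

# Select 'a' and its renamed duplicates 'a.1', 'a.2', ...
dupes = df.filter(regex=r'^a(\.\d+)?$')

# Back-fill along each row so the first column ends up holding
# the first non-missing value, then keep just that column
merged = dupes.bfill(axis=1).iloc[:, 0]
print(merged.tolist())
```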


answered Oct 22 '22 04:10

piRSquared

Tags:

python

pandas

csv

python-2.7
