Get a list from Pandas DataFrame column headers

Tags: python, pandas, dataframe

I want to get a list of the column headers from a Pandas DataFrame. The DataFrame will come from user input, so I won't know how many columns there will be or what they will be called.

For example, if I'm given a DataFrame like this:

>>> my_dataframe
    y  gdp  cap
0   1    2    5
1   2    3    9
2   8    7    2
3   3    4    7
4   6    7    7
5   4    8    3
6   8    2    8
7   9    9   10
8   6    6    4
9  10   10    7

I would get a list like this:

>>> header_list
['y', 'gdp', 'cap']

341

asked Oct 20 '13 21:10

natsuki_2002

2 Answers

You can get the values as a list by doing:

list(my_dataframe.columns.values)

You can also simply use (as shown in EdChum's answer):

list(my_dataframe)
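Since the DataFrame comes from user input, it may help to wrap this in a small function. The following is only a sketch: the helper name get_header_list and the sample data are made up for illustration and are not part of pandas.

```python
import pandas as pd


def get_header_list(frame: pd.DataFrame) -> list:
    """Return the column labels of `frame` as a plain list (hypothetical helper)."""
    return list(frame.columns.values)


# Stand-in for a user-supplied frame; it mirrors the first rows of the question's data.
user_frame = pd.DataFrame({"y": [1, 2, 8], "gdp": [2, 3, 7], "cap": [5, 9, 2]})

header_list = get_header_list(user_frame)
print(header_list)
```

The function makes no assumption about how many columns there are or what they are called, so it should behave the same way for any frame, including an empty one.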

163

answered Oct 13 '22 16:10

Simeon Visser

The most performant option is a built-in method on the underlying array:

my_dataframe.columns.values.tolist()

.columns returns an Index, .columns.values returns the array of labels behind that Index, and the array has a .tolist() method that returns a plain list.
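To see that chain step by step, here is a minimal sketch that prints the type at each stage. The sample frame is hypothetical, and the exact array type you see may depend on your pandas version and the dtype of the column labels, so no output is shown.

```python
import pandas as pd

# Hypothetical sample frame; any DataFrame would do.
my_dataframe = pd.DataFrame({"y": [1, 2], "gdp": [2, 3], "cap": [5, 9]})

columns = my_dataframe.columns   # an Index holding the column labels
values = columns.values          # the array backing that Index
header_list = values.tolist()    # a plain Python list of the labels

# Print the type name at each step of the chain.
for step in (columns, values, header_list):
    print(type(step).__name__)
```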

If performance is not as important to you, Index objects define a .tolist() method that you can call directly:

my_dataframe.columns.tolist()

The difference in performance shows up clearly in the timings:

%timeit df.columns.tolist()
16.7 µs ± 317 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit df.columns.values.tolist()
1.24 µs ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
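%timeit only works in IPython or Jupyter. If you want to check the numbers in a plain script, something like the following sketch with the standard library's timeit module should be roughly equivalent. The sample frame and the loop count n are arbitrary choices, and the absolute timings will vary with your machine and pandas version.

```python
import timeit

import pandas as pd

# Small hypothetical frame; timings depend on machine and pandas version.
df = pd.DataFrame({"y": [1, 2, 8], "gdp": [2, 3, 7], "cap": [5, 9, 2]})

n = 10_000  # number of calls per measurement (arbitrary)

# Total seconds for n calls of each variant.
t_index = timeit.timeit(lambda: df.columns.tolist(), number=n)
t_values = timeit.timeit(lambda: df.columns.values.tolist(), number=n)

# Convert to microseconds per call for easier comparison with %timeit output.
print(f"df.columns.tolist():        {t_index / n * 1e6:.2f} µs per call")
print(f"df.columns.values.tolist(): {t_values / n * 1e6:.2f} µs per call")
```

Both variants return the same list, so the choice only matters if this call sits in a hot loop.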

For those who hate typing, you can just call list on df, like so:

list(df)
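This works because iterating over a DataFrame yields its column labels, so anything that consumes an iterable can be used the same way. A short sketch with a made-up frame:

```python
import pandas as pd

# Hypothetical sample frame.
df = pd.DataFrame({"y": [1, 2], "gdp": [2, 3], "cap": [5, 9]})

# Iterating over the frame yields column labels, not rows.
names_from_loop = [name for name in df]

header_list = list(df)       # labels in their original order
sorted_headers = sorted(df)  # labels in alphabetical order

print(header_list)
print(sorted_headers)
```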

answered Oct 13 '22 16:10

EdChum
