Get Pandas DataFrame first column

Tags:

pandas

This question is odd, since I know HOW to do something, but I don't know WHY I can't do it another way.

Suppose a simple data frame:

import pandas as pd
a = pd.DataFrame([[0,1], [2,3]])

I can slice this data frame very easily: the first column is a[[0]], the second is a[[1]]. Simple, isn't it?

Now, let's take a more complex data frame. This is part of my code:

var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in 
             range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab

frame is also a pandas DataFrame, just like a. I can get the second column very easily as frame[[1]]. But when I try frame[[0]], I get an error:

Traceback (most recent call last):

  File "<ipython-input-55-0c56ffb47d0d>", line 1, in <module>
    frame[[0]]

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 1991, in __getitem__
    return self._getitem_array(key)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 2035, in _getitem_array
    indexer = self.ix._convert_to_indexer(key, axis=1)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1184, in _convert_to_indexer
    indexer = labels._convert_list_indexer(objarr, kind=self.name)

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\indexes\base.py", line 1112, in _convert_list_indexer
    return maybe_convert_indices(indexer, len(self))

  File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python-3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1856, in maybe_convert_indices
    raise IndexError("indices are out-of-bounds")

IndexError: indices are out-of-bounds

I can still use frame.iloc[:,0], but the problem is that I don't understand why I can't use simple slicing with [[]]. I use WinPython Spyder 3, if that helps.


asked Jan 31 '17 10:01

1 Answer

Using your code:

import pandas as pd

var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in 
             range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab

If you print frame, you get:

       Variable    1
loc_1         0    0
loc_2         1    1
loc_3         2    8
loc_4         3   27
loc_5         4   64
loc_6         5  125
...

So the cause of your problem becomes clear: you have no column labeled 0. You first create a list called var_vec, then build a DataFrame from that list while specifying both the index values and the column name (which is usually good practice). The integer column labels 0, 1, ... from your first example are only generated when you don't specify column names; they are labels, not positional indexers. frame[[1]] works only because frame[1] = spec_ab created a column whose label is 1.
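To check this yourself, you can inspect the column labels directly. The following is a minimal sketch that rebuilds both frames from the question; the second one is shortened to 5 rows purely for illustration.

```python
import pandas as pd

# First example from the question: no column names are given,
# so pandas generates the integer labels 0 and 1.
a = pd.DataFrame([[0, 1], [2, 3]])
print(list(a.columns))        # [0, 1]

# Second example, shortened to 5 rows (an assumption for brevity).
row_names = ["loc_" + str(i) for i in range(1, 6)]
frame = pd.DataFrame(list(range(5)), columns=["Variable"], index=row_names)
frame[1] = [i**3 for i in range(5)]
print(list(frame.columns))    # ['Variable', 1]

# 1 is an existing label, 0 is not.
print(1 in frame.columns)     # True
print(0 in frame.columns)     # False

# Asking for the missing label fails; the exception type
# differs between pandas versions, so both are caught here.
try:
    frame[[0]]
except (KeyError, IndexError) as exc:
    print("lookup failed:", type(exc).__name__)
```

Note that the traceback in the question comes from an old pandas release; newer releases report the missing label as a KeyError instead.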

If you want to access columns by their position, you can:

df[df.columns[0]]

What happens then is that you get the column labels of df, pick the label stored at position 0, and pass that label back to df as the key.
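As a small illustration of the df[df.columns[0]] lookup, here is a sketch using a 3-row stand-in for the frame from the question (the data and the variable names first_label, first_col and first_col_frame are made up for this example):

```python
import pandas as pd

# Small stand-in for the frame in the question (3 rows for illustration).
df = pd.DataFrame({"Variable": [0, 1, 2], 1: [0, 1, 8]},
                  index=["loc_1", "loc_2", "loc_3"])

first_label = df.columns[0]          # the label stored at position 0
first_col = df[first_label]          # single label -> Series
first_col_frame = df[[first_label]]  # list of labels -> one-column DataFrame

print(first_label)                   # Variable
print(first_col.tolist())            # [0, 1, 2]
print(first_col_frame.shape)         # (3, 1)
```

Wrapping the label in a list gives the same kind of result as a[[0]] in the first example, namely a DataFrame rather than a Series.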

Hope that helps you understand.

Edit:

Another (better) way would be:

df.iloc[:,0]

where ":" stands for all rows. (With iloc, rows are likewise selected by integer position, from 0 up to the number of rows minus 1.)
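A short sketch of how iloc behaves, again on a made-up 3-row stand-in for the frame from the question. It also shows that a scalar position returns a Series, while a list of positions returns a DataFrame:

```python
import pandas as pd

# Small stand-in for the frame in the question (3 rows for illustration).
df = pd.DataFrame({"Variable": [0, 1, 2], 1: [0, 1, 8]},
                  index=["loc_1", "loc_2", "loc_3"])

as_series = df.iloc[:, 0]     # scalar position -> Series
as_frame = df.iloc[:, [0]]    # list of positions -> one-column DataFrame

print(type(as_series).__name__)    # Series
print(type(as_frame).__name__)     # DataFrame
print(as_frame.columns.tolist())   # ['Variable']

# Here position 1 and label 1 refer to the same column,
# but only because the label happens to match the position.
print(df.iloc[:, 1].equals(df[1]))  # True
```

Because iloc always works by position, it keeps working no matter what the columns are called.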


answered Oct 10 '22 08:10

epattaro

Bobesh
