This question is odd: I know HOW to do something, but I don't know WHY I can't do it another way.
Suppose a simple data frame:
import pandas as pd
a = pd.DataFrame([[0,1], [2,3]])
I can slice this data frame very easily: the first column is a[[0]], the second is a[[1]]. Simple, isn't it?
Now, let's take a more complex data frame. This is part of my code:
var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in
range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab
The data frame frame is also a pandas DataFrame, just like a. I can get the second column very easily as frame[[1]]. But when I try frame[[0]] I get an error:
Traceback (most recent call last):
File "<ipython-input-55-0c56ffb47d0d>", line 1, in <module>
frame[[0]]
File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python- 3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 1991, in __getitem__
return self._getitem_array(key)
File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python- 3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 2035, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)
File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python- 3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1184, in _convert_to_indexer
indexer = labels._convert_list_indexer(objarr, kind=self.name)
File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python- 3.5.2.amd64\lib\site-packages\pandas\indexes\base.py", line 1112, in _convert_list_indexer
return maybe_convert_indices(indexer, len(self))
File "C:\Users\Robert\Desktop\Záloha\WinPython-64bit-3.5.2.2\python- 3.5.2.amd64\lib\site-packages\pandas\core\indexing.py", line 1856, in maybe_convert_indices
raise IndexError("indices are out-of-bounds")
IndexError: indices are out-of-bounds
I can still use frame.iloc[:,0], but the problem is that I don't understand why I can't use simple slicing with [[]]. I use WinPython Spyder 3, if that helps.
Using your code:
import pandas as pd
var_vec = [i for i in range(100)]
num_of_sites = 100
row_names = ["_".join(["loc", str(i)]) for i in
range(1,num_of_sites + 1)]
frame = pd.DataFrame(var_vec, columns = ["Variable"], index = row_names)
spec_ab = [i**3 for i in range(100)]
frame[1] = spec_ab
If you print frame, you get:
       Variable    1
loc_1         0    0
loc_2         1    1
loc_3         2    8
loc_4         3   27
loc_5         4   64
loc_6         5  125
......
So the cause of your problem becomes clear: you have no column labelled 0. You first define a list called var_vec, then build a DataFrame from it while specifying both the index values and the column name (which is usually good practice). The integer column labels 0, 1, ... from your first example are only assigned when you don't specify column names. They are labels, not positions. So a[[0]] works because a really has a column labelled 0, and frame[[1]] works only because frame[1] = spec_ab created a column labelled 1. frame[[0]] fails because the first column of frame is labelled "Variable", not 0.
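To make the label-versus-position point concrete, here is a small sketch that rebuilds both frames from the question and inspects their column labels. The helper names default_labels, frame_labels and failure are only for illustration. Which exception frame[[0]] raises depends on the pandas version (the traceback in the question shows IndexError, while newer versions raise KeyError as far as I know), so the sketch catches both.

```python
import pandas as pd

# First frame from the question: no column names given,
# so pandas assigns the default integer labels 0 and 1.
a = pd.DataFrame([[0, 1], [2, 3]])
default_labels = list(a.columns)
print(default_labels)

# Second frame from the question: the first column is named "Variable",
# and frame[1] = ... adds a column whose *label* is the integer 1.
row_names = ["_".join(["loc", str(i)]) for i in range(1, 101)]
frame = pd.DataFrame(list(range(100)), columns=["Variable"], index=row_names)
frame[1] = [i**3 for i in range(100)]
frame_labels = list(frame.columns)
print(frame_labels)

# There is a label 1 but no label 0, so frame[[1]] works and frame[[0]] fails.
print(0 in frame.columns, 1 in frame.columns)

failure = None
try:
    frame[[0]]
except (KeyError, IndexError) as exc:  # exception type varies by pandas version
    failure = type(exc).__name__
print(failure)
```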
If you want to access columns by their position, you can:
df[df.columns[0]]
What happens here is that df.columns gives you the column labels of df; you take the label at position 0 and pass it back to df to select that column.
Hope that helps you understand.
Edit:
Another (better) way would be:
df.iloc[:,0]
where ":" selects all rows and 0 is the position of the column. (Rows can be selected by position in the same way, counting from 0.)
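As a rough sketch of the position-based options, applied to the frame from the question: the variable names by_label_lookup, by_position and as_frame are made up for this example. Note that iloc with a scalar column position returns a Series, while passing a list of positions returns a one-column DataFrame, which is closer to what the [[ ]] form gives you.

```python
import pandas as pd

row_names = ["_".join(["loc", str(i)]) for i in range(1, 101)]
frame = pd.DataFrame(list(range(100)), columns=["Variable"], index=row_names)
frame[1] = [i**3 for i in range(100)]

# Look up the label at position 0, then select by that label.
by_label_lookup = frame[frame.columns[0]]

# Select directly by position: all rows, first column (a Series).
by_position = frame.iloc[:, 0]

# A list of positions keeps the result as a DataFrame.
as_frame = frame.iloc[:, [0]]

print(by_position.equals(by_label_lookup))
print(type(as_frame).__name__, as_frame.shape)
```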