I just started working with the pandas library and, despite my research, I still haven't figured this out. I want to pull the data from the column named q, but my code raises an error. How can I do that?
import pandas as pd
data = pd.read_excel('test1.xlsx')
df = pd.DataFrame(data)
print(df.loc[df['q']])
Error:
Traceback (most recent call last):
  File "c:/Users/sabca/visual studio code projects/webscraping/pandastest.py", line 11, in <module>
    print(df.loc[df['q']])
  File "C:\Users\sabca\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 879, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "C:\Users\sabca\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1099, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "C:\Users\sabca\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1037, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "C:\Users\sabca\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1254, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "C:\Users\sabca\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1298, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['qwe1', 'asdf1', 'adfs4', 'wer7', 'tyu1', 'ghfhg5'], dtype='object')] are in the [index]"
Fixing the data/df confusion
Firstly, there's really no need for the line
df = pd.DataFrame(data)
since data is already a pandas DataFrame, as returned by the pd.read_excel function.
Instead, I would suggest omitting this line and simply using the following (I will use df to refer to the pandas DataFrame returned by this function for the remainder of this answer).
df = pd.read_excel('test1.xlsx')
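To see why the extra wrapping step adds nothing, here is a minimal sketch. Since test1.xlsx is not available here, a small hand-built DataFrame stands in for the result of pd.read_excel; that stand-in and its contents are assumptions for illustration only.

```python
import pandas as pd

# Stand-in for pd.read_excel('test1.xlsx'), which already returns a DataFrame.
# The two sample values are borrowed from the traceback in the question.
data = pd.DataFrame({"q": ["qwe1", "asdf1"]})

# Wrapping an existing DataFrame in pd.DataFrame(...) gives back the same contents.
df = pd.DataFrame(data)

print(isinstance(data, pd.DataFrame))  # data is already a DataFrame
print(df.equals(data))                 # the wrapped copy holds identical data
```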
Returning a Pandas Series from column q
Assuming that q is the name of a column in your df, then:
df['q']
will return a pandas Series representing the column q.
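As a quick check of what bracket selection gives back, the sketch below uses a hand-built DataFrame in place of the spreadsheet. The q values are taken from the traceback in the question; the second column, n, is made up for illustration.

```python
import pandas as pd

# Stand-in for the spreadsheet; column 'n' is a hypothetical extra column.
df = pd.DataFrame({
    "q": ["qwe1", "asdf1", "adfs4", "wer7", "tyu1", "ghfhg5"],
    "n": [1, 2, 3, 4, 5, 6],
})

# Selecting a single column by name returns a Series named after that column.
q = df["q"]

print(type(q).__name__)
print(q.name)
print(len(q))
```

If df['q'] itself raises a KeyError, printing df.columns shows the column names pandas actually read from the sheet, which helps spot stray spaces or different capitalisation.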
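If you want to use df.loc, note that this indexing method takes a selection of rows as its first item and an optional selection of columns as its second. In your code, df.loc[df['q']] passes the values of column q as row labels, which is why pandas reports that none of them are in the index. To return all rows of the column q, you could use: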
df.loc[:, 'q']
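The difference between the failing call and the working one can be reproduced with a small sketch. The DataFrame below is a stand-in for the spreadsheet: the q values come from the traceback, and the n column is an assumption added only so the frame has more than one column.

```python
import pandas as pd

# Stand-in for the spreadsheet; column 'n' is hypothetical.
df = pd.DataFrame({
    "q": ["qwe1", "asdf1", "adfs4", "wer7", "tyu1", "ghfhg5"],
    "n": [1, 2, 3, 4, 5, 6],
})

# df.loc[df['q']] uses the values of q as row labels. The default index is
# 0..5, so none of those strings are found and pandas raises KeyError.
try:
    df.loc[df["q"]]
    raised = False
except KeyError:
    raised = True

# Row selector ':' means "all rows"; the column selector picks 'q'.
col = df.loc[:, "q"]

print(raised)
print(col.tolist())
```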
Returning a Numpy array of values from column q
To return a NumPy array containing the values stored in the q column, you could use:
df['q'].values
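A short sketch of the array conversion follows, again on a hand-built stand-in for the spreadsheet. The column is created with dtype=object to mirror the dtype shown in the traceback; that choice is an assumption about how the data was read. The sketch also shows Series.to_numpy(), which the pandas documentation recommends over .values in recent versions.

```python
import numpy as np
import pandas as pd

# Stand-in for the spreadsheet, with object dtype as shown in the traceback.
df = pd.DataFrame(
    {"q": ["qwe1", "asdf1", "adfs4", "wer7", "tyu1", "ghfhg5"]},
    dtype=object,
)

arr = df["q"].values       # attribute access, as in the answer above
arr2 = df["q"].to_numpy()  # method form recommended by the pandas docs

print(type(arr).__name__)
print(arr.tolist())
```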