I know this is a simple question, but I'm very new to Pandas. I want to compare for each row the cells to see if any of the cells in the columns are more or less than 0.00.
GOOG AAPL XOM IBM Value
2011-01-10 0.0 0.0 0.0 0.0 0.00
2011-01-13 0.0 -1500.0 0.0 4000.0 -61900.00
I know that pandas have built in the iterrows. However, with the following piece of code I'm receiving an error
for index, row in dataFrame.iterrows():
for i in range(0, len(of_columns)):
print dataFrame[index][i]
Error
return self._engine.get_loc(self._maybe_cast_indexer(key)) File "pandas\index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas\index.c:4433) File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279) File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742) File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
Intended action If the cell contains 0, do nothing (continue). If the cell contains other than zero, then give count of non zero values per row
Compare by gt
(>
), lt
(<
) or le
,
ge
,
ne
,
eq
first and then sum
True
s, there are processing like 1
:
Bad -> check all previous columns:
df['> zero'] = df.gt(0).sum(axis=1)
df['< zero'] = df.lt(0).sum(axis=1)
df['== zero'] = df.eq(0).sum(axis=1)
print (df)
GOOG AAPL XOM IBM Value > zero < zero == zero
2011-01-10 0.0 0.0 0.0 0.0 0.0 0 0 7
2011-01-13 0.0 -1500.0 0.0 4000.0 -61900.0 1 2 2
Correct - select columns for check:
cols = df.columns
df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
GOOG AAPL XOM IBM Value > zero < zero == zero
2011-01-10 0.0 0.0 0.0 0.0 0.0 0 0 5
2011-01-13 0.0 -1500.0 0.0 4000.0 -61900.0 1 2 2
Detail:
print (df.gt(0))
GOOG AAPL XOM IBM Value
2011-01-10 False False False False False
2011-01-13 False False False True False
EDIT:
To remove some columns from the 'cols' use difference
:
cols = df.columns.difference(['Value'])
print (cols)
Index(['AAPL', 'GOOG', 'IBM', 'XOM'], dtype='object')
df['> zero'] = df[cols].gt(0).sum(axis=1)
df['< zero'] = df[cols].lt(0).sum(axis=1)
df['== zero'] = df[cols].eq(0).sum(axis=1)
print (df)
GOOG AAPL XOM IBM Value > zero < zero == zero
2011-01-10 0.0 0.0 0.0 0.0 0.0 0 0 4
2011-01-13 0.0 -1500.0 0.0 4000.0 -61900.0 1 1 2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With