
np.where Not Working in my Pandas

I have an np.where problem using Pandas that is driving me crazy, and I can't seem to solve it through Google, the documentation, etc.

I'm hoping someone has insight. I'm sure it isn't complex.

I have a df where I'm checking the value in one column - and if that value is 'n/a' (as a string, not as in .isnull()), changing it to another value.

Full_Names_Test_2['MarketCap'] == 'n/a'

returns:

70      True
88     False
90      True
145     True
156     True
181     True
191     True
200     True
219     True
223    False
Name: MarketCap, dtype: bool

So that part works.

But this:

Full_Names_Test_2['NewColumn'] = np.where(Full_Names_Test_2['MarketCap'] == 'n/a', 7)

returns:

ValueError: either both or neither of x and y should be given

What is going on?

Windstorm1981 asked Oct 21 '15 18:10


1 Answer

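You need to pass the boolean mask and both values: the value to use where the mask is True, then the value to use where it is False. To replace 'n/a' with 7 and keep everything else:

np.where(Full_Names_Test_2['MarketCap'] == 'n/a', 7)
# should be
np.where(Full_Names_Test_2['MarketCap'] == 'n/a', 7, Full_Names_Test_2['MarketCap'])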

See the np.where docs.
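
To illustrate why the error appears, here is a minimal sketch on a small made-up frame (the data and the names df, is_na and na_positions are hypothetical, not the asker's). Called with only the condition, np.where returns the positions where it is True; called with the condition and two values, it picks between them element by element. Passing a single value matches neither form, which is what the ValueError is complaining about.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Full_Names_Test_2; the values are invented.
df = pd.DataFrame({"MarketCap": ["n/a", "1.2B", "n/a", "300M"]})
is_na = (df["MarketCap"] == "n/a").to_numpy()

# One-argument form: a tuple of index arrays where the condition is True.
na_positions = np.where(is_na)[0]

# Three-argument form: 7 where the condition is True, the original value elsewhere.
df["NewColumn"] = np.where(is_na, 7, df["MarketCap"].to_numpy(dtype=object))

print(na_positions.tolist())
print(df["NewColumn"].tolist())
```

The order of the two values matters: the first is used where the condition holds, the second where it does not.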

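Alternatively, use the Series where method, which keeps the values where the condition is True and replaces the rest, so the condition has to be inverted:

Full_Names_Test_2['MarketCap'].where(Full_Names_Test_2['MarketCap'] != 'n/a', 7)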
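
As a rough sketch of how the Series method behaves, assuming a made-up object-dtype Series (the data and the names s, kept and masked are hypothetical): Series.where keeps values where the condition is True and substitutes the other value elsewhere, while Series.mask does the opposite and substitutes where the condition is True.

```python
import pandas as pd

# Hypothetical data; object dtype so strings and numbers can share the column.
s = pd.Series(["n/a", "1.2B", "n/a", "300M"], name="MarketCap", dtype=object)

# where: keep values that are NOT 'n/a', put 7 everywhere else.
kept = s.where(s != "n/a", 7)

# mask: put 7 wherever the value IS 'n/a', keep everything else.
masked = s.mask(s == "n/a", 7)

print(kept.tolist())
print(masked.equals(kept))
```

Both calls return a new Series and leave the original untouched, so the result still has to be assigned back to the column.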
Andy Hayden answered Sep 28 '22 07:09