I have this code (which works): a bunch of nested conditional statements that set the value in the 'paragenesis1' row of a dataframe (myOxides['cpx']), depending on the values in various other rows of the frame.
I'm very new to Python and programming in general. I am thinking that I should write a function to perform this, but how would I then apply that function elementwise? The nested np.where calls are the only way I have found to avoid the 'truth value of a series is ambiguous' error.
Any help greatly appreciated!
myOxides['cpx'].loc['paragenesis1'] = np.where(
    ((cpxCrOx>=0.5) & (cpxAlOx<=4)),
    "GtPeridA",
    np.where(
        ((cpxCrOx>=2.25) & (cpxAlOx<=5)),
        "GtPeridB",
        np.where(
            ((cpxCrOx>=0.5) & (cpxCrOx<=2.25)) &
            ((cpxAlOx>=4) & (cpxAlOx<=6)),
            "SpLhzA",
            np.where(
                ((cpxCrOx>=0.5) &
                 (cpxCrOx<=(5.53125 - 0.546875 * cpxAlOx))) &
                ((cpxAlOx>=4) &
                 (cpxAlOx<=((cpxCrOx - 5.53125) / -0.546875))),
                "SpLhzB",
                "Eclogite, Megacryst, Cognate"))))
Or, in general form:
df.loc['a'] = np.where(
    (some_condition),
    "value",
    np.where(
        ((condition_1) & (condition_2)),
        "some_value",
        np.where(
            ((condition_3) & (condition_4)),
            "some_other_value",
            np.where(
                (condition_5),
                "another_value",
                "other_value"))))
A similar operation is available in pandas through the where() method, but the argument order differs from np.where: the Series you call it on supplies the values kept where the condition is True, and the second argument supplies the replacement used where it is False. df['col'] = (value_if_true).where(condition, value_if_false)
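As a small illustration of how pandas where() treats its arguments, here is a minimal sketch. The Series cr and the labels "high Cr" and "low Cr" are made up for this example and do not come from the question.

```python
import pandas as pd

# Hypothetical Cr2O3 values, made up for illustration.
cr = pd.Series([0.2, 1.0, 3.0])

# Start from a Series holding the label to keep where the condition is True...
high = pd.Series("high Cr", index=cr.index)

# ...and let where() substitute the fallback wherever the condition is False.
label = high.where(cr >= 0.5, "low Cr")
print(label.tolist())
```

For more than two outcomes this would need chaining, which is why a single call that takes a list of conditions tends to read better for the case in the question.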
One possible solution is to use numpy.select:
import numpy as np

# Boolean masks, checked in order; the first True mask decides the label
m1 = (cpxCrOx >= 0.5) & (cpxAlOx <= 4)
m2 = (cpxCrOx >= 2.25) & (cpxAlOx <= 5)
m3 = ((cpxCrOx >= 0.5) & (cpxCrOx <= 2.25)) & ((cpxAlOx >= 4) & (cpxAlOx <= 6))
m4 = ((cpxCrOx >= 0.5) & (cpxCrOx <= (5.53125 - 0.546875 * cpxAlOx))) & \
     ((cpxAlOx >= 4) & (cpxAlOx <= ((cpxCrOx - 5.53125) / -0.546875)))
vals = ["GtPeridA", "GtPeridB", "SpLhzA", "SpLhzB"]
default = 'Eclogite, Megacryst, Cognate'
# Entries matching none of the masks get the default label
myOxides['paragenesis1'] = np.select([m1, m2, m3, m4], vals, default=default)
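To try this without the original data, here is a self-contained sketch. It assumes a toy frame with one analysis per row and hypothetical column names Cr2O3 and Al2O3; the numbers are made up, and only the thresholds and labels come from the question. It also includes a plain if-based function applied value by value, which is roughly what the question asks about, so the two approaches can be compared.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (made up): one analysis per row, oxide wt% in columns.
samples = pd.DataFrame({
    "Cr2O3": [1.0, 3.0, 1.0, 1.0, 0.1],
    "Al2O3": [3.0, 4.5, 5.0, 7.0, 8.0],
})
cr, al = samples["Cr2O3"], samples["Al2O3"]

# Vectorised: np.select returns the label of the first mask that is True.
masks = [
    (cr >= 0.5) & (al <= 4),
    (cr >= 2.25) & (al <= 5),
    (cr >= 0.5) & (cr <= 2.25) & (al >= 4) & (al <= 6),
    (cr >= 0.5) & (cr <= 5.53125 - 0.546875 * al)
    & (al >= 4) & (al <= (cr - 5.53125) / -0.546875),
]
labels = ["GtPeridA", "GtPeridB", "SpLhzA", "SpLhzB"]
default = "Eclogite, Megacryst, Cognate"
samples["vectorised"] = np.select(masks, labels, default=default)


def classify(cr_value, al_value):
    """Scalar version of the same rules, written with ordinary if statements."""
    if cr_value >= 0.5 and al_value <= 4:
        return "GtPeridA"
    if cr_value >= 2.25 and al_value <= 5:
        return "GtPeridB"
    if 0.5 <= cr_value <= 2.25 and 4 <= al_value <= 6:
        return "SpLhzA"
    if (0.5 <= cr_value <= 5.53125 - 0.546875 * al_value
            and 4 <= al_value <= (cr_value - 5.53125) / -0.546875):
        return "SpLhzB"
    return default


# Elementwise: call the scalar function once per pair of values.
samples["elementwise"] = [classify(c, a) for c, a in zip(cr, al)]
print(samples)
```

The function works on single numbers, so `and` and chained comparisons are fine there; the 'truth value of a series is ambiguous' error only appears when such expressions are evaluated on a whole Series at once. The vectorised version is usually faster on large frames, while the function may be easier to read and test.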