pandas overwrite values in multiple columns at once based on condition of values in one column

Q: How do I replace values in a column based on multiple conditions in pandas?

You can replace the values of all or selected columns based on a condition by using the DataFrame.loc[] indexer. loc[] accesses a group of rows and columns by label(s) or by a boolean array, and it can be used both to read and to assign values in a pandas DataFrame.
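As a minimal sketch of that idea, the example below combines two conditions into one boolean mask and assigns through loc[]. The frame and its column names (`score`, `passed`, `grade`) are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical data, only for illustration.
frame = pd.DataFrame({
    "score": [35, 72, 90, 58],
    "passed": [False, True, True, False],
    "grade": ["fail", "fail", "fail", "fail"],
})

# Combine conditions with & (and) or | (or); each comparison needs its own parentheses.
mask = (frame["score"] >= 50) & frame["passed"]

# Assign only in the rows where the mask is True.
frame.loc[mask, "grade"] = "ok"
```

Rows where the mask is False keep their original values.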

Q: Can I use iloc and loc together?

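loc selects by label and iloc selects by integer position. With the default 0-based integer index, a given integer refers to the same row in both, but slices still differ: loc includes the end label, while iloc excludes the end position.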
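A small sketch of how the two indexers relate on a default integer index. The frame `demo` is hypothetical, and the last step shows one possible way to combine positional and label-based selection.

```python
import pandas as pd

# Hypothetical frame with the default 0-based RangeIndex.
demo = pd.DataFrame({"x": [10, 20, 30, 40]})

# A single integer picks the same row with either indexer.
same_row = demo.loc[1, "x"] == demo.iloc[1, 0]

# Slices differ: loc includes the end label, iloc excludes the end position.
by_label = demo.loc[0:2, "x"].tolist()
by_position = demo.iloc[0:2, 0].tolist()

# Combining them: turn positions into labels, then assign through loc.
labels = demo.index[[0, 3]]
demo.loc[labels, "x"] = 0
```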

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame(data={
    'col0': [11, 22, 1, 5],
    'col1': ['aa:a:aaa', 'a:a', 'a', 'a:aa:a:aaa'],
    'col2': ["foo", "foo", "foobar", "bar"],
    'col3': [True, False, True, False],
    'col4': ['elo', 'foo', 'bar', 'dupa']})

I want to split col1 on ":" and get the length of the resulting list. Where that length is greater than 2, I want to overwrite the values in col1, col2 and col3; where it is 2 or less, the values should stay unchanged.

Ideally in one line, and as fast as possible.

My current attempt raises a ValueError:

df[['col1', 'col2', 'col3']] = df.loc[df['col1'].str.split(":").apply(len) > 2], ("", "", False), df[['col1', 'col2', 'col3']])

EDIT: The condition is on col1.

EDIT2: Thank you for all the great and quick answers. Amazing!

EDIT3: Timing on 10^6 rows:

@ansev 3.2657s

@jezrael 0.8922s

@anky_91 1.9511s
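For anyone who wants to repeat a comparison like this, here is a rough sketch using timeit. It is not the setup behind the numbers above: it assumes the four sample values repeated to 10,000 rows, times only the mask construction (not the assignment), and the labels in `candidates` are my own.

```python
import timeit
import pandas as pd

# Assumed benchmark data: the sample values repeated to 10_000 rows (not 10^6).
col1 = pd.Series(['aa:a:aaa', 'a:a', 'a', 'a:aa:a:aaa'] * 2_500)

# The three mask-building expressions from the answers.
candidates = {
    "split + str.len": lambda: col1.str.split(":").str.len() > 2,
    "str.count + 1": lambda: col1.str.count(":").add(1).gt(2),
    "split(expand=True) + count": lambda: col1.str.split(":", expand=True).count(axis=1).gt(2),
}

# Best of three single runs per candidate.
timings = {name: min(timeit.repeat(fn, number=1, repeat=3)) for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")

# Sanity check input: all three expressions should select the same rows.
masks = [fn().tolist() for fn in candidates.values()]
```

Absolute numbers depend on the machine, the pandas version and the data, so only the relative ordering is worth reading from a run like this.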

You need Series.str.len() after splitting to determine the length of each list. You can then compare it with 2 and use .loc[] to assign the list of replacement values wherever the condition matches:

df.loc[df['col1'].str.split(":").str.len() > 2, ['col1', 'col2', 'col3']] = ["", "", False]
print(df)

   col0 col1    col2   col3  col4
0    11               False   elo
1    22  a:a     foo  False   foo
2     1    a  foobar   True   bar
3     5               False  dupa
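To make the one-liner easier to follow, here is the same logic split into steps on the question's data. The intermediate names (`parts`, `lengths`, `mask`) are introduced only for this sketch.

```python
import pandas as pd

df = pd.DataFrame(data={
    'col0': [11, 22, 1, 5],
    'col1': ['aa:a:aaa', 'a:a', 'a', 'a:aa:a:aaa'],
    'col2': ["foo", "foo", "foobar", "bar"],
    'col3': [True, False, True, False],
    'col4': ['elo', 'foo', 'bar', 'dupa']})

parts = df['col1'].str.split(":")   # each value becomes a list of parts
lengths = parts.str.len()           # number of parts per row: 3, 2, 1, 4
mask = lengths > 2                  # True only for rows 0 and 3

# One replacement value per listed column, applied to every selected row.
df.loc[mask, ['col1', 'col2', 'col3']] = ["", "", False]
```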

Use Series.str.count to count the separators, add 1 to get the number of parts, compare with Series.gt and assign the list to the selected columns:

df.loc[df['col1'].str.count(":").add(1).gt(2), ['col1','col2','col3']] = ["", "", False]
print(df)

   col0 col1    col2   col3  col4
0    11               False   elo
1    22  a:a     foo  False   foo
2     1    a  foobar   True   bar
3     5               False  dupa
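One thing to watch with this approach: Series.str.count interprets its pattern as a regular expression. ":" has no special meaning, so it works as-is, but a separator such as "." would need escaping. A small sketch with made-up values (`values` and `sep` are hypothetical):

```python
import re
import pandas as pd

# Hypothetical values using "." as the separator.
values = pd.Series(["a.b.c", "a.b", "a"])
sep = "."

by_split = values.str.split(sep).str.len()           # parts per row via split
by_count = values.str.count(re.escape(sep)).add(1)   # separators + 1, pattern escaped
unescaped = values.str.count(sep)                    # "." as a regex matches every character
```

With the pattern escaped, both ways of counting agree; without escaping, the count is simply the string length.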

Another approach is Series.str.split with expand=True, which puts each part in its own column, followed by DataFrame.count with axis=1 to count the non-missing parts in each row.

df.loc[df['col1'].str.split(":", expand=True).count(axis=1).gt(2), ['col1', 'col2', 'col3']] = ["", "", False]
print(df)

   col0 col1    col2   col3  col4
0    11               False   elo
1    22  a:a     foo  False   foo
2     1    a  foobar   True   bar
3     5               False  dupa
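A short sketch of the intermediate result may help here. The names `wide` and `n_parts` are introduced only for illustration.

```python
import pandas as pd

col1 = pd.Series(['aa:a:aaa', 'a:a', 'a', 'a:aa:a:aaa'])

# expand=True returns a DataFrame with one column per part;
# rows with fewer parts are padded with missing values.
wide = col1.str.split(":", expand=True)

# count(axis=1) counts the non-missing cells in each row.
n_parts = wide.count(axis=1)
mask = n_parts.gt(2)
```

Because the intermediate frame is as wide as the longest split, this variant builds more data than the other two approaches before reducing it to a mask.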

Tags: python, pandas, apply

Dariusz Krynicki

3 Answers

anky

jezrael

ansev
