I would like to slice a DataFrame with a Boolean index obtaining a copy, and then do stuff on that copy independently of the original DataFrame.
Judging from this answer, selecting with .loc using a Boolean array will hand me back a copy, but then, if I try to change the copy, SettingWithCopyWarning gets in the way. Would this then be the correct way:
import numpy as np
import pandas as pd
d1 = pd.DataFrame(np.random.randn(10, 5), columns=['a', 'b', 'c', 'd', 'e'])
# create a new dataframe from the sliced copy
d2 = pd.DataFrame(d1.loc[d1.a > 1, :])
# do stuff with d2, keep d1 unchanged
You need copy() with boolean indexing; a new DataFrame constructor is not necessary:
d2 = d1[d1.a > 1].copy()
Explanation of the warning:
If you modify values in d2 later, you will find that the modifications do not propagate back to the original data (d1), and pandas warns you about that. Calling copy() tells pandas explicitly that d2 is meant to be independent of d1.
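A minimal sketch of the copy() approach follows. It uses a small hand-written frame instead of the random data from the question (the values are an assumption chosen for illustration) so that the effect is easy to see:

```python
import pandas as pd

# Small deterministic frame (illustrative values, not from the question)
d1 = pd.DataFrame({"a": [0.5, 1.5, 2.0, -1.0], "b": [10, 20, 30, 40]})

# Boolean indexing followed by an explicit copy
d2 = d1[d1.a > 1].copy()

# Modify the copy in two different ways
d2["b"] = d2["b"] * 100   # replace a whole column
d2.loc[:, "a"] = 0.0      # assign through .loc

print(d1)  # d1 still holds its original values
print(d2)  # d2 holds only the selected rows, with the modified values
```

Since d2 is an explicit copy, these assignments should not raise SettingWithCopyWarning, and d1 keeps its original values.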