
Slicing a Pandas DataFrame into a new DataFrame

I would like to slice a DataFrame with a Boolean index obtaining a copy, and then do stuff on that copy independently of the original DataFrame.

Judging from this answer, selecting with .loc using a Boolean array will hand me back a copy, but then, if I try to change the copy, SettingWithCopyWarning gets in the way. Would this then be the correct way:

import numpy as np
import pandas as pd
d1 = pd.DataFrame(np.random.randn(10, 5), columns=['a', 'b', 'c', 'd', 'e'])
# create a new dataframe from the sliced copy
d2 = pd.DataFrame(d1.loc[d1.a > 1, :])
# do stuff with d2, keep d1 unchanged
Pietro Marchesi asked Jul 07 '17 09:07




1 Answer

You need to call .copy() on the result of the boolean indexing; wrapping it in a new DataFrame constructor is not necessary:

d2 = d1[d1.a > 1].copy()

Explanation of warning:

Without .copy(), if you modify values in d2 later you will find that the modifications do not propagate back to the original data (d1), and that pandas issues a SettingWithCopyWarning.
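A minimal sketch of how this could be checked, assuming a reasonably recent pandas. The names rng, d1_before and setting_with_copy are only illustrative, and row 0 is set by hand so that the filter always matches at least one row. The warning is looked up by class name rather than imported, since where (and whether) that class is exposed depends on the pandas version.

```python
import warnings

import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (illustrative choice).
rng = np.random.default_rng(0)
d1 = pd.DataFrame(rng.standard_normal((10, 5)), columns=['a', 'b', 'c', 'd', 'e'])
d1.loc[0, 'a'] = 2.0       # guarantee that at least one row satisfies a > 1
d1_before = d1.copy()      # snapshot used to verify d1 stays unchanged

# Boolean indexing followed by an explicit copy.
d2 = d1[d1.a > 1].copy()

# Record any warnings raised while modifying the copy.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    d2['b'] = 0.0

# Match by class name to avoid depending on a version-specific import path.
setting_with_copy = [w for w in caught
                     if w.category.__name__ == "SettingWithCopyWarning"]

print(len(setting_with_copy))   # number of SettingWithCopyWarning instances seen
print(d1.equals(d1_before))     # whether d1 is still identical to its snapshot
```

With the explicit .copy(), d2 is an independent object, so assigning to it should neither trigger the warning nor touch d1.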

jezrael answered Oct 02 '22 15:10
