Subsetting index from Pandas DataFrame

Tags:

pandas

dataframe

I have a DataFrame with columns [A, B, C, D, E, F, G, H].

An index has been made with columns [D, G, H]:

>>> print(dgh_columns)
Index(['D', 'G', 'H'], dtype='object')

How can I retrieve the original DataFrame without the columns D, G, H?

Is there an index subset operation?

Ideally, this would be:

df[df.index - dgh_columns]

But this doesn't seem to work.

asked Nov 07 '16 14:11

Jivan

1 Answer

I think you can use Index.difference, which returns the labels in df.columns that are not in dgh_columns:

df[df.columns.difference(dgh_columns)]

Sample:

import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[7,8,9],
                   'F':[1,3,5],
                   'G':[5,3,6],
                   'H':[7,4,3]})

print(df)
   A  B  C  D  E  F  G  H
0  1  4  7  1  7  1  5  7
1  2  5  8  3  8  3  3  4
2  3  6  9  5  9  5  6  3

dgh_columns = pd.Index(['D', 'G', 'H'])
print(df[df.columns.difference(dgh_columns)])
   A  B  C  E  F
0  1  4  7  7  1
1  2  5  8  8  3
2  3  6  9  9  5
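
Note that Index.difference returns the remaining labels in sorted order, which happens to match the original column order in the sample above. A minimal sketch of two alternatives that keep the original order, DataFrame.drop and a boolean mask built with Index.isin. The shuffled columns here are illustrative data, not from the question:

```python
import pandas as pd

# Columns deliberately out of alphabetical order (illustrative data only).
df = pd.DataFrame({'H': [7, 4, 3], 'B': [4, 5, 6], 'D': [1, 3, 5],
                   'A': [1, 2, 3], 'G': [5, 3, 6]})
dgh_columns = pd.Index(['D', 'G', 'H'])

# Index.difference sorts the remaining labels.
sorted_cols = df.columns.difference(dgh_columns)

# DataFrame.drop keeps the remaining columns in their original order.
dropped = df.drop(columns=dgh_columns)

# A boolean mask via Index.isin also preserves the original order.
masked = df.loc[:, ~df.columns.isin(dgh_columns)]

print(list(sorted_cols))      # ['A', 'B']
print(list(dropped.columns))  # ['B', 'A']
print(list(masked.columns))   # ['B', 'A']
```

If column order matters, drop or the isin mask is probably the simpler choice. If it does not, Index.difference reads closest to the set operation asked about.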

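NumPy solution with numpy.setxor1d or numpy.setdiff1d. Note that setxor1d is a symmetric difference, so it matches setdiff1d only when every label in dgh_columns is a column of df:

import numpy as np

dgh_columns = pd.Index(['D', 'G', 'H'])
print(df[np.setxor1d(df.columns, dgh_columns)])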
   A  B  C  E  F
0  1  4  7  7  1
1  2  5  8  8  3
2  3  6  9  9  5

dgh_columns = pd.Index(['D', 'G', 'H'])
print(df[np.setdiff1d(df.columns, dgh_columns)])
   A  B  C  E  F
0  1  4  7  7  1
1  2  5  8  8  3
2  3  6  9  9  5
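
The two NumPy calls agree above only because every label in dgh_columns exists in df. A small sketch of where they diverge, using a hypothetical label 'Z' that is not part of the question:

```python
import numpy as np
import pandas as pd

columns = pd.Index(['A', 'B', 'C', 'D'])
# 'Z' is a hypothetical label that is not a column, added for illustration.
to_remove = pd.Index(['D', 'Z'])

# setdiff1d: labels in `columns` that are not in `to_remove`.
diff = np.setdiff1d(columns, to_remove)

# setxor1d: labels present in exactly one of the two, so 'Z' leaks in.
xor = np.setxor1d(columns, to_remove)

print(diff.tolist())  # ['A', 'B', 'C']
print(xor.tolist())   # ['A', 'B', 'C', 'Z']
```

Selecting columns with the setxor1d result would then fail on the missing 'Z' label, so setdiff1d is likely the safer of the two when the labels to remove are not guaranteed to be columns.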

answered Jan 04 '23 00:01

jezrael
