I have a df with 32 columns
df.shape
(568285, 32)
I am trying to rearrange the columns in a specific way, and drop the first column using iloc
df = df.iloc[:,[31,[1:23],24,25,26,28,27,29,30]]
^
SyntaxError: invalid syntax
Is this the right way to do it?
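You could use the np.r_ indexer, which accepts slice notation inside its square brackets and expands it into a flat array of integer positions.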
From help(np.r_):
class RClass(AxisConcatenator)
 |  Translates slice objects to concatenation along the first axis.
 |
 |  This is a simple way to build up arrays quickly. There are two use cases.
df = df.iloc[:, np.r_[31, 1:23, 24, 25, 26, 28, 27, 29, 30]]
df
0 1 2 3 4 5 6 7 8 9 ... 40 \
A 33.0 44.0 68.0 31.0 NaN 87.0 66.0 NaN 72.0 33.0 ... 71.0
B NaN NaN 77.0 98.0 NaN 48.0 91.0 43.0 NaN 89.0 ... 38.0
C 45.0 55.0 NaN 72.0 61.0 87.0 NaN 99.0 96.0 75.0 ... 83.0
D NaN NaN NaN 58.0 NaN 97.0 64.0 49.0 52.0 45.0 ... 63.0
41 42 43 44 45 46 47 48 49
A NaN 87.0 31.0 50.0 48.0 73.0 NaN NaN 81.0
B 79.0 47.0 51.0 99.0 59.0 NaN 72.0 48.0 NaN
C 93.0 NaN 95.0 97.0 52.0 99.0 71.0 53.0 69.0
D NaN 41.0 NaN NaN 55.0 90.0 NaN NaN 92.0
out = df.iloc[:, np.r_[31, 1:23, 24, 25, 26, 28, 27, 29, 30]]
out
31 1 2 3 4 5 6 7 8 9 ... 20 \
A 99.0 44.0 68.0 31.0 NaN 87.0 66.0 NaN 72.0 33.0 ... 66.0
B 42.0 NaN 77.0 98.0 NaN 48.0 91.0 43.0 NaN 89.0 ... NaN
C 77.0 55.0 NaN 72.0 61.0 87.0 NaN 99.0 96.0 75.0 ... 76.0
D 95.0 NaN NaN 58.0 NaN 97.0 64.0 49.0 52.0 45.0 ... 71.0
21 22 24 25 26 28 27 29 30
A NaN 40.0 66.0 87.0 97.0 68.0 NaN 68.0 NaN
B 95.0 NaN 47.0 79.0 47.0 NaN 83.0 81.0 57.0
C NaN 75.0 46.0 84.0 NaN 50.0 41.0 38.0 52.0
D NaN 74.0 41.0 55.0 60.0 NaN NaN 84.0 NaN
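To spell out why the original attempt failed: slice syntax such as 1:23 is only valid inside square-bracket indexing, not inside a list literal, so [31,[1:23],...] is a SyntaxError before pandas ever sees it. Here is a minimal sketch of what np.r_ expands to, next to a plain-Python equivalent. The small frame and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame: 4 rows, 32 integer-labelled columns
df = pd.DataFrame(np.arange(4 * 32).reshape(4, 32))

# np.r_ expands slices and scalars into one flat integer array
positions = np.r_[31, 1:23, 24, 25, 26, 28, 27, 29, 30]

# Plain-Python equivalent using range unpacking inside a list
positions_py = [31, *range(1, 23), 24, 25, 26, 28, 27, 29, 30]

# Both forms select the same columns in the same order
out = df.iloc[:, positions]
```

Note that 1:23 stops at position 22, so position 23 is left out along with position 0 (the output above jumps from 22 to 24). If only the first column should be dropped, 1:24 would be the slice to use.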
Here's a custom solution using explicit indexing:
Side note: np.r_ wasn't working for me, which is why I built this solution.
import numpy as np
import pandas as pd
# Make a sample df of 1_000 rows & 100 cols
data = np.zeros(shape=(1_000,100))
df = pd.DataFrame(data)
# Create a custom function for indexing
def all_nums_in_range(*tuple_pairs, len_df):
    """
    Input pairs of tuples for index slicing
    Include `len_df` to ensure length of array matches indexed df
    """
    # Create a boolean array with values to use as an index
    num_range = np.zeros(shape=(len_df,), dtype=bool)
    # Mark every position covered by a (start, end) pair
    for (start, end) in tuple_pairs:
        num_range[start:end] = True
    return num_range
# Now apply
num_range = all_nums_in_range((0,50), (75, 80), len_df=100)
df.iloc[:, num_range]
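One thing to keep in mind with a boolean mask: it selects columns but always returns them in their original order, so it cannot move column 31 to the front the way the question asks. A minimal sketch of the difference, using a hypothetical helper (positions_from_ranges is not part of NumPy or pandas) that builds integer positions instead:

```python
import numpy as np
import pandas as pd

def positions_from_ranges(*pairs):
    """Hypothetical helper: concatenate half-open (start, end) ranges in the order given."""
    return np.concatenate([np.arange(start, end) for start, end in pairs])

df = pd.DataFrame(np.zeros((1_000, 100)))

# Boolean mask: columns come back in their original order,
# no matter which range was marked first
mask = np.zeros(100, dtype=bool)
mask[75:80] = True
mask[0:50] = True
by_mask = df.iloc[:, mask]

# Integer positions: columns come back in the order requested
by_position = df.iloc[:, positions_from_ranges((75, 80), (0, 50))]
```

Both results contain the same 55 columns. by_mask starts at column 0, while by_position starts at column 75.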