I have a DataFrame with 200 rows, indexed 0 to 199. I want to keep only the rows with certain index labels, such as 128, 133, 140, 143, 199, and delete all the others.
Previously, I dropped the rows with the index labels 128, 133, 140, 143, 199, and that worked fine. My code was
dataset_drop = dataset.drop(index = [128, 133, 140, 143, 199])
Now I am trying to do it the other way round: keep the rows with the index labels 128, 133, 140, 143, 199 and delete the others.
What I tried doing:
dropped_data = dataset.drop(index != [128, 133, 140, 143, 199])
When I do this, I get an error saying
NameError: name 'index' is not defined
Can anyone tell me what I am doing wrong?
To explain the reason for your exception: the expression
index != [128, 133, 140, 143, 199]
is evaluated as a comparison and its result is passed as a positional argument, rather than index being treated as a keyword argument. Python looks up a variable named index to compare against the list. Since no such variable is defined, you see a NameError.
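The difference between the two spellings can be reproduced without pandas. This is a minimal sketch; `show` is a hypothetical helper that is not part of the question, and it only echoes back whatever arguments it receives.

```python
def show(*args, **kwargs):
    """Hypothetical helper: return the arguments exactly as received."""
    return args, kwargs

# `index=` names a parameter, so no variable called `index` is needed.
as_keyword = show(index=[128, 133, 140, 143, 199])

# `index != [...]` is a comparison, so Python must first look up a
# variable called `index`. None exists here, hence the NameError.
try:
    show(index != [128, 133, 140, 143, 199])
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(as_keyword)
print(error_message)
```

The exception is raised while the argument is being evaluated, before `drop` (or `show` here) is ever called.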
Use Index.difference to fix your drop solution:
dataset.drop(index=dataset.index.difference([128, 133, 140, 143, 199]))
Or, more idiomatically, use loc to select the rows directly, since you already know the labels you want to keep.
dataset.loc[[128, 133, 140, 143, 199]]
# If they are integer positions rather than index labels, use iloc instead:
# dataset.iloc[[128, 133, 140, 143, 199]]
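As a rough check that the two approaches agree, here is a small sketch. The DataFrames are made-up stand-ins (the question does not show the real data): `dataset` has a single `value` column and the default 0 to 199 index, and `relabelled` is a second hypothetical frame whose labels run from 100 to 299, so that labels and positions differ.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: 200 rows labelled 0..199.
dataset = pd.DataFrame({"value": range(200)})
keep = [128, 133, 140, 143, 199]

# Drop every label that is NOT in `keep`.
via_drop = dataset.drop(index=dataset.index.difference(keep))

# Select the wanted labels directly.
via_loc = dataset.loc[keep]

print(via_loc)
print(via_drop.equals(via_loc))

# loc is label-based and iloc is position-based. They only coincide
# when the labels happen to equal the positions, as in `dataset`.
relabelled = pd.DataFrame({"value": range(200)}, index=range(100, 300))
by_label = relabelled.loc[[128]]      # the row whose label is 128
by_position = relabelled.iloc[[128]]  # the 129th row, whose label is 228
print(by_label)
print(by_position)
```

One difference worth knowing: `loc` returns rows in the order the labels are listed, while the `drop` version keeps the DataFrame's original row order. With a sorted list like the one here, the results match.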