I'm having trouble deleting rows in a text file that contain a string in one column. My code so far reads the text file and saves it as a CSV file split into separate columns, but the rows are not getting deleted.
This is what the values in that column look like:
Ship To or Bill To
------------------
3000000092-BILL_TO
3000000092-SHIP_TO
3000004000_SHIP_TO-INAC-EIM
There are 20 more columns and 50,000k plus rows. So essentially I'm trying to delete all the rows that contain the strings 'INAC' or 'EIM'.
import csv
my_file_name = "NVG.txt"
cleaned_file = "cleanNVG.csv"
remove_words = ['INAC','EIM']
with open(my_file_name, 'r', newline='') as infile, \
     open(cleaned_file, 'w',newline='') as outfile:
    writer = csv.writer(outfile)
    for line in csv.reader(infile, delimiter='|'):
        if not any(remove_word in line for remove_word in remove_words):
            writer.writerow(line)
The problem here is that the csv.reader object returns the rows of the file as lists of individual column values, so the "in" test checks whether any of the individual values in that list is equal to a remove_word. A value such as 3000004000_SHIP_TO-INAC-EIM contains 'INAC' but is not equal to it, so the row is never filtered out.
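To make the difference concrete, here is a minimal sketch comparing the two tests. The row below is made-up sample data built from a value in the question; the second column is purely hypothetical.

```python
# A row as csv.reader would return it: a list of column values.
# (Sample data; the second field is a hypothetical placeholder.)
row = ["3000004000_SHIP_TO-INAC-EIM", "some other column"]

# List membership: is any element of the list exactly equal to "INAC"?
in_list = "INAC" in row

# Substring test: does any element of the list contain "INAC"?
in_any_field = any("INAC" in field for field in row)

print(in_list, in_any_field)  # False True
```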
A quick fix would be to try
if not any(remove_word in element
           for element in line
           for remove_word in remove_words):
because the any(...) expression is true if any field in the line contains any of the remove_words, so with "not" in front of it the row is written only when no field contains one.
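Putting it together, the whole script might look like the sketch below. The helper names row_has_word and clean_file are my own, introduced only for illustration; the file names, delimiter, and word list are taken from the question.

```python
import csv


def row_has_word(row, words):
    """Return True if any field in the row contains any of the words as a substring."""
    return any(word in field for field in row for word in words)


def clean_file(src_path, dst_path, words, delimiter="|"):
    """Copy rows from a delimited text file to a CSV file, skipping matching rows."""
    with open(src_path, "r", newline="") as infile, \
         open(dst_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for row in csv.reader(infile, delimiter=delimiter):
            # Keep the row only if no field contains a word to remove.
            if not row_has_word(row, words):
                writer.writerow(row)


if __name__ == "__main__":
    # File names and words as given in the question.
    clean_file("NVG.txt", "cleanNVG.csv", ["INAC", "EIM"])
```

Note that this checks every column, so a row is also dropped if some unrelated column happens to contain 'INAC' or 'EIM'. If only the "Ship To or Bill To" column should be tested, you could check just that one field by its position in the row; the position is not given in the question, so you would need to look it up in your file.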