I have a file.dat which looks like:
id       | user_id | venue_id | latitude  | longitude | created_at
---------+---------+----------+-----------+-----------+-----------------
984301   |2041916  |5222      |           |           |2012-04-21 17:39:01
984222   |15824    |5222      |38.8951118 |-77.0363658|2012-04-21 17:43:47
984315   |1764391  |5222      |           |           |2012-04-21 17:37:18
984234   |44652    |5222      |33.800745  |-84.41052  | 2012-04-21 17:43:43
I need to produce a CSV file with the rows that have an empty latitude and longitude removed, like:
id,user_id,venue_id,latitude,longitude,created_at
984222,15824,5222,38.8951118,-77.0363658,2012-04-21T17:43:47
984234,44652,5222,33.800745,-84.41052,2012-04-21T17:43:43
984291,105054,5222,45.5234515,-122.6762071,2012-04-21T17:39:22
I tried to do that using the following code:
with open('file.dat', 'r') as input_file:
    lines = input_file.readlines()
    newLines = []
    for line in lines:
        newLine = line.strip('|').split()
        newLines.append(newLine)
with open('file.csv', 'w') as output_file:
    file_writer = csv.writer(output_file)
    file_writer.writerows(newLines)
But I still get a CSV file with "|" symbols and rows with empty latitude/longitude. Where is the mistake? Ultimately I need to load the resulting CSV file into a DataFrame, so maybe there is a way to reduce the number of steps.
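str.strip('|') only removes "|" characters from the start and end of the string, so the separators inside each line are left untouched. Calling split() with no arguments then splits on whitespace, not on "|".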
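To see the difference on a single line from the question, here is a small sketch (the variable names `attempt` and `fields` are only illustrative):

```python
line = "984222   |15824    |5222      |38.8951118 |-77.0363658|2012-04-21 17:43:47\n"

# What the question's code does: strip('|') only looks at the two ends of the
# string, and split() with no argument cuts on runs of whitespace.
attempt = line.strip('|').split()

# Splitting on '|' first, then trimming the padding around each field.
fields = [field.strip() for field in line.split('|')]

print(attempt)
print(fields)
```

The first list still carries the "|" characters and even breaks the timestamp in two at its space, while the second has one clean value per column.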
You want to split the lines on "|", then strip each element of the resulting list:
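import csv

# newline='' lets the csv module control line endings (avoids blank lines on Windows)
with open('file.dat') as dat_file, open('file.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    for line in dat_file:
        # split on the column separator, then trim the padding around each field
        row = [field.strip() for field in line.split('|')]
        # skip the '---+---' separator line and rows with empty latitude/longitude
        if len(row) == 6 and row[3] and row[4]:
            csv_writer.writerow(row)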
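The desired output in the question also shows timestamps with a "T" between date and time, which the loop above does not produce. A possible extension is sketched below, assuming every `created_at` value has the layout `YYYY-MM-DD HH:MM:SS`. The helper name `clean_rows` and the in-memory `sample` are made up for this illustration and stand in for the real file.

```python
import csv
import io
from datetime import datetime


def clean_rows(dat_lines):
    """Yield cleaned rows from pipe-delimited lines, skipping rows without coordinates."""
    for line in dat_lines:
        row = [field.strip() for field in line.split('|')]
        # The '-----+-----' separator line contains no '|' and yields a single field.
        if len(row) != 6 or not (row[3] and row[4]):
            continue
        try:
            # Assumed timestamp layout: 'YYYY-MM-DD HH:MM:SS'.
            row[5] = datetime.strptime(row[5], '%Y-%m-%d %H:%M:%S').isoformat()
        except ValueError:
            pass  # the header row ('created_at') is not a timestamp and is kept as-is
        yield row


# Small in-memory stand-in for file.dat, using lines from the question.
sample = io.StringIO(
    "id       | user_id | venue_id | latitude  | longitude | created_at\n"
    "---------+---------+----------+-----------+-----------+-----------------\n"
    "984301   |2041916  |5222      |           |           |2012-04-21 17:39:01\n"
    "984222   |15824    |5222      |38.8951118 |-77.0363658|2012-04-21 17:43:47\n"
)
rows = list(clean_rows(sample))

# Write the rows as CSV text; an open file object would work the same way.
out = io.StringIO()
csv.writer(out, lineterminator='\n').writerows(rows)
csv_text = out.getvalue()
print(csv_text)
```

If the end goal is a DataFrame, the intermediate CSV file may not be needed at all: the same rows could probably be handed over directly with `pandas.DataFrame(rows[1:], columns=rows[0])`, with the numeric columns converted afterwards.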