Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Removing Carriage Returns from Csv String

I am generating a csv string using pandas as such: df.to_csv(sep=',', encoding='utf-8', index=False, header=False).

I will load the csv in postgres as follows.

    COPY table_name FROM STDIN (FORMAT CSV,
    DELIMITER ',',
    HEADER FALSE,
    NULL '',
    ENCODING 'utf-8');

Occasionally this will fail when there is carriage return present in the csv, which I remove using csv_str.replace('\r', ''). However I am not sure if this approach might alter some existing data in the csv by any chance.

Is there a more reliable way to remove any carriage returns? I am fine with losing information about the carriage returns, but I want to preserve as much as possible any other data stored in the csv files.

like image 890
hangc Avatar asked Jan 25 '26 02:01

hangc


1 Answers

Rather than stripping the carriage returns from your CSV file, ensure that those fields that contain them are quoted. One way is to just quote all fields:

import csv
import pandas as pd

df.to_csv(sep=',', encoding='utf-8', index=False, header=False, quoting=csv.QUOTE_ALL)

Alternatively you can use quoting=csv.QUOTE_NONNUMERIC to quote only those fields likely to contain \r.

One other way is to set the line terminator to \r\n (or just \r) which will indirectly cause any field that contains \r to be quoted. This might be preferred because only those individual "cells" that require it are quoted:

df.to_csv(sep=',', encoding='utf-8', index=False, header=False, line_terminator='\r\n')
like image 146
mhawke Avatar answered Jan 26 '26 16:01

mhawke



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!