I'm having some issues with parsing CSV data with quotes. My main problem is with quotes within a field. In the following example lines 1 - 4 work correctly but 5,6 and 7 don't.
COLLOQ_TYPE,COLLOQ_NAME,COLLOQ_CODE,XDATA
S,"BELT,FAN",003541547,
S,"BELT V,FAN",000324244,
S,SHROUD SPRING SCREW,000868265,
S,"D" REL VALVE ASSY,000771881,
S,"YBELT,"V"",000323030,
S,"YBELT,'V'",000322933,
I'd like to avoid Text::CSV as it isn't installed on the target server. Realising that CSV's are are more complicated than they look I'm using a recipe from the Perl Cookbook.
sub parse_csv {
my $text = shift; #record containg CSVs
my @columns = ();
push(@columns ,$+) while $text =~ m{
# The first part groups the phrase inside quotes
"([^\"\\]*(?:\\.[^\"\\]*)*)",?
| ([^,]+),?
| ,
}gx;
push(@columns ,undef) if substr($text, -1,1) eq ',';
return @columns ; # list of vars that was comma separated.
}
Does anyone have a suggestion for improving the regex to handle the above cases?
Yes. You can import double quotation marks using CSV files and import maps by escaping the double quotation marks. To escape the double quotation marks, enclose them within another double quotation mark.
So quote characters are used in CSV files when the text within a field also includes a comma and could be confused as being the reserved comma delimiter for the next field. Quote characters indicate the start and end of a block of text where any comma characters can be ignored.
Programs store CSV files as simple text characters; a comma separates each data element, such as a name, phone number or dollar amount, from its neighbors. Because of CSV's simple format, you can parse these files with practically any programming language.
There's no reason you couldn't download a copy of Text::CSV, or any other non-XS based implementation of a CSV parser and install it in your local directory, or in a lib/ sub directory of your project so its installed along with your projects rollout.
If you can't store text files in your project, then I'm wondering how it is you are coding your project.
http://novosial.org/perl/life-with-cpan/non-root/
Should be a good guide on how to get these into a working state locally.
Please consider this before trying to write your own CSV implementation.
Text::CSV is over a hundred lines of code, including fixed bugs and edge cases, and re-writing this from scratch will just make you learn how awful CSV can be the hard way.
note: I learnt this the hard way. Took me a full day to get a working CSV parser in PHP before I discovered an inbuilt one had been added in a later version. It really is something awful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With