
Counting the number of delimiters in each row of a file in Unix

I have a file 'records.txt' which contains over 200,000 records.

Each record is on a separate line and has multiple fields separated by a delimiter '|'.

Each row should have 35 fields, but one of the rows has a different number of fields, i.e. a different number of '|' characters from the others.

Can someone please suggest a way in Unix to identify that row, for example by getting the count of '|' characters in each row of the file?

M.N asked Jan 14 '09 09:01


3 Answers

Try this:

awk -F '|' 'NF != 35 {print NR, $0}' records.txt
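
To see what this reports, here is a minimal sketch on a made-up three-row sample. The temporary file, the `expected` variable and the 4-field rows are assumptions for illustration only; for the real file, `expected` would be 35. It also prints NF so the actual field count of the odd row is visible.

```shell
# Hypothetical three-row sample with 4 fields per row instead of 35.
sample=$(mktemp)
printf '%s\n' 'a|b|c|d' 'e|f|g' 'h|i|j|k' > "$sample"

# Print the line number, the field count and the row itself for every
# row whose field count differs from the expected value.
bad_rows=$(awk -F '|' -v expected=4 'NF != expected {print NR, NF, $0}' "$sample")
echo "$bad_rows"

rm -f "$sample"
```

On this sample it prints `2 3 e|f|g`: the line number, the field count, then the row.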
Martin Wickman answered Dec 14 '22 10:12


This small perl script should do it:

perl -ne '$t = $_; $t =~ s/[^\|]//g; print unless length($t) == 35;' records.txt

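This works by deleting every character except '|' from a copy of the line (including the trailing newline) and then checking the length of what is left, which is the number of delimiters in that row. Note that a row with 35 fields and no trailing delimiter contains 34 '|' characters, so compare against 34 in that case.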
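
If you want to see the delimiter count for every row, as the question suggests, a variation on the same idea is to let awk count the delimiters directly. This is only a sketch on a made-up sample (the temporary file and its 4-field rows are assumptions, not the real data): awk's gsub() returns the number of replacements it made, so replacing each '|' with nothing gives the count for that row.

```shell
# Hypothetical three-row sample; the real file would be records.txt.
sample=$(mktemp)
printf '%s\n' 'a|b|c|d' 'e|f|g' 'h|i|j|k' > "$sample"

# gsub() returns how many replacements it made, so deleting every '|'
# yields the number of delimiters in the row.
counts=$(awk '{ print NR, gsub(/[|]/, "") }' "$sample")
echo "$counts"

rm -f "$sample"
```

Each output line is the line number followed by the number of '|' characters in that row, so the odd row stands out when you scan or sort the second column.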

Greg Hewgill answered Dec 14 '22 10:12


Greg's way with bash stuff, for the bash friends out there :)

while IFS= read -r n; do [ $(printf '%s' "$n" | tr -cd '|' | wc -c) -ne 35 ] && printf '%s\n' "$n"; done < records.txt
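
The loop above starts tr and wc once per row, which adds up over 200,000 records. As a rough sketch of an alternative, the count can be done with shell parameter expansion alone. The `count_pipes` helper, the `pipe_count` and `expected` variables and the small sample file are all made up for illustration; for the real file you would read from records.txt and set `expected` to the delimiter count you need.

```shell
# Hypothetical three-row sample with 3 delimiters per row.
sample=$(mktemp)
printf '%s\n' 'a|b|c|d' 'e|f|g' 'h|i|j|k' > "$sample"

expected=3   # assumed delimiter count for this sample

# Count the '|' characters in $1 and store the result in pipe_count,
# using only parameter expansion (no external commands).
count_pipes() {
  rest=$1
  pipe_count=0
  while [ "$rest" != "${rest#*|}" ]; do   # true while a '|' remains
    rest=${rest#*|}                       # drop text up to the first '|'
    pipe_count=$((pipe_count + 1))
  done
}

# Collect every row whose delimiter count differs from the expected one.
bad_rows=$(
  while IFS= read -r row; do
    count_pipes "$row"
    if [ "$pipe_count" -ne "$expected" ]; then
      printf '%s\n' "$row"
    fi
  done < "$sample"
)
echo "$bad_rows"

rm -f "$sample"
```

On this sample only `e|f|g` is printed, since it is the one row with 2 delimiters instead of 3.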
Johannes Schaub - litb answered Dec 14 '22 11:12