Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unix - Need to cut a file which has multiple blanks as delimiter - awk or cut?

I need to get the records from a text file in Unix. The delimiter is multiple blanks. For example:

2U2133   1239  
1290fsdsf   3234

From this, I need to extract

1239  
3234

The delimiter for all records will be always 3 blanks.

I need to do this in an unix script(.scr) and write the output to another file or use it as an input to a do-while loop. I tried the below:

while read readline  
do  
        read_int=`echo "$readline"`  
        cnt_exc=`grep "$read_int" ${Directory path}/file1.txt| wc -l`  
if [ $cnt_exc -gt 0 ]  
then  
  int_1=0  
else  
  int_2=0  
fi  
done < awk -F'  ' '{ print $2 }' ${Directoty path}/test_file.txt  

test_file.txt is the input file and file1.txt is a lookup file. But the above way is not working and giving me syntax errors near awk -F

I tried writing the output to a file. The following worked in command line:

more test_file.txt | awk -F'   ' '{ print $2 }' > output.txt

This is working and writing the records to output.txt in command line. But the same command does not work in the unix script (It is a .scr file)

Please let me know where I am going wrong and how I can resolve this.

Thanks,
Visakh

like image 825
visakh Avatar asked Dec 06 '10 14:12

visakh


2 Answers

The job of replacing multiple delimiters with just one is left to tr:

cat <file_name> | tr -s ' ' | cut -d ' ' -f 2

tr translates or deletes characters, and is perfectly suited to prepare your data for cut to work properly.

The manual states:

-s, --squeeze-repeats
          replace each sequence  of  a  repeated  character  that  is
          listed  in the last specified SET, with a single occurrence
          of that character
like image 171
wlf Avatar answered Sep 20 '22 05:09

wlf


This should have been a comment, but since I cannot comment yet, I am adding this here. This is from an excellent answer here: https://stackoverflow.com/a/4483833/3138875

tr -s ' ' <text.txt | cut -d ' ' -f4

tr -s '<character>' squeezes multiple repeated instances of <character> into one.

like image 25
Gavin Avatar answered Sep 18 '22 05:09

Gavin