Remove lines from a file corresponding to blank lines of another file

Tags:

awk

paste

blank-line

I have two files with the same number of rows and columns, delimited with ;. Example:

file_a:

1;1;1;1;1
2;2;2;2;2
3;3;3;3;3
4;4;4;4;4

file_b:

A;A;A;A;A
B;B;;;B
;;;;
D;D;D;D;D

Ignoring delimiters, line 3 of file_b is empty, so I want to remove line 3 from file_a as well before running the command

paste -d ';' file_a file_b

in order to get output like this:

1;1;1;1;1;A;A;A;A;A
2;2;2;2;2;B;B;;;B
4;4;4;4;4;D;D;D;D;D

Edit: The number of columns is 93, the same for every row in both files, so both files have exactly the same matrix of rows and columns.

938

asked Sep 24 '20 07:09

Ahmet Said Akbulut

4 Answers

Could you please try the following? It was written and tested in GNU awk with the samples shown.

awk '
BEGIN{
  FS=OFS=";"
}
FNR==NR{
  arr[FNR]=$0
  next
}
!/^;+$/{
  print arr[FNR],$0
}
' file_a file_b

Explanation: a detailed, line-by-line explanation of the above.

awk '                 ##Starting awk program from here.
BEGIN{                ##Starting BEGIN section from here.
  FS=OFS=";"          ##Setting field separator and output field separator as ; here.
}
FNR==NR{              ##Checking condition if FNR==NR which will be TRUE when file_a is being read.
  arr[FNR]=$0         ##Creating arr with index FNR and value is current line.
  next                ##next will skip all further statements from here.
}
!/^;+$/{              ##If the line does NOT consist solely of ; characters, do the following.
  print arr[FNR],$0   ##Printing arr with index of FNR and current line.
}
' file_a file_b       ##Mentioning Input_file names here.
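To try the two-pass approach above end to end, here is a minimal sketch that recreates the sample files from the question in a scratch directory. The `tmp` directory and the `result` variable are scaffolding of my own and not part of the answer; the awk program itself is unchanged.

```shell
# Scratch directory holding the sample data copied from the question.
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' > "$tmp/file_a"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' 'D;D;D;D;D' > "$tmp/file_b"

# First pass (FNR==NR) buffers every line of file_a by line number.
# Second pass prints the joined row unless the file_b row is only ';' characters.
result=$(awk '
BEGIN { FS = OFS = ";" }
FNR == NR { arr[FNR] = $0; next }
!/^;+$/   { print arr[FNR], $0 }
' "$tmp/file_a" "$tmp/file_b")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that this buffers all of file_a in memory, which is usually fine but worth keeping in mind for very large inputs.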

112

answered Oct 20 '22 01:10

RavinderSingh13

Since you mention that both files have the same number of lines, getline would fit here (f1 and f2 stand for file_a and file_b):

$ awk '(getline line < "f2")==1 && line ~ /[^;]/' f1
1;1;1;1;1
2;2;2;2;2
4;4;4;4;4

And you can do the paste functionality within awk as well:

$ awk '(getline line < "f2")==1 && line ~ /[^;]/{print $0 ";" line}' f1
1;1;1;1;1;A;A;A;A;A
2;2;2;2;2;B;B;;;B
4;4;4;4;4;D;D;D;D;D

The return value of getline is 1 if a line was read successfully. line ~ /[^;]/ checks whether the line contains any character other than ;. If both conditions are satisfied, the required result is printed.
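Because getline returns 1 on success, 0 at end of file and -1 on a read error, the same idea can be made to fail loudly if the second file turns out to be shorter or unreadable. The following is only a sketch under that assumption: `join_nonblank`, `other`, `status` and the scratch files are hypothetical names introduced here, not part of the answer.

```shell
# join_nonblank is a hypothetical helper: $1 is the file to filter, $2 the file to test.
join_nonblank() {
    awk -v other="$2" '
    {
        status = (getline line < other)   # 1: line read, 0: end of file, -1: read error
        if (status != 1) {
            print "no matching line in " other " for line " NR > "/dev/stderr"
            exit 1
        }
        if (line ~ /[^;]/)                # keep rows where the other file has real data
            print $0 ";" line
    }' "$1"
}

# Sample data from the question, plus a deliberately truncated second file.
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' > "$tmp/f1"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' 'D;D;D;D;D' > "$tmp/f2"
printf '%s\n' 'A;A;A;A;A' > "$tmp/short"

joined=$(join_nonblank "$tmp/f1" "$tmp/f2")

short_status=0
join_nonblank "$tmp/f1" "$tmp/short" > /dev/null 2>&1 || short_status=$?

printf '%s\n' "$joined"
echo "exit status with truncated second file: $short_status"
rm -rf "$tmp"
```

With matching files the helper behaves like the one-liner above; with the truncated file it stops at the first missing line instead of silently dropping the rest.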

answered Oct 20 '22 00:10

Sundeep

Basically a modification of @RavinderSingh13's solution, but I only store the record numbers (NR) of the empty records in file_b:

$ awk '
NR==FNR {            # process file_b (the first file)
    if($0~/^;+$/)    # when empty record met
        a[NR]        # hash the record number NR
    next
}
!(FNR in a)          # print file_a lines whose file_b record is not empty
' file_b file_a

Output:

1;1;1;1;1
2;2;2;2;2
4;4;4;4;4
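The output above is only the filtered file_a. To get the pasted result the question asks for, the blank rows also have to be dropped from file_b before running paste. One way to sketch that, assuming the sample data from the question (the scratch directory and the `a.filtered` / `b.filtered` names are my own):

```shell
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' > "$tmp/file_a"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' 'D;D;D;D;D' > "$tmp/file_b"

# Record which row numbers of file_b are blank, then drop those rows from file_a.
awk 'NR == FNR { if ($0 ~ /^;+$/) skip[NR]; next } !(FNR in skip)' \
    "$tmp/file_b" "$tmp/file_a" > "$tmp/a.filtered"

# Drop the same blank rows from file_b itself so both files stay aligned.
grep -Ev '^;+$' "$tmp/file_b" > "$tmp/b.filtered"

result=$(paste -d ';' "$tmp/a.filtered" "$tmp/b.filtered")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Only the row numbers of blank records are held in memory, which is the point of this variation.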

answered Oct 20 '22 00:10

James Brown

Filtering after paste is easier. Assuming the format of the input lines to exclude is exactly as shown in the question, you can filter the output of paste with a grep pattern anchored to the end of the line. The pattern is five semicolons: the delimiter added by paste plus the four semicolons of the blank file_b line, i.e. 5 empty fields at the end of the line.

paste -d ';' file_a file_b | grep -v ';;;;;$'

With the input files shown in the question, this prints exactly the requested output.

Edit:
To fulfill an additional requirement from a comment, the grep command can be modified to specify the number of semicolons corresponding to the number of empty columns. For different input files, simply change the number 5 accordingly.

paste -d ';' file_a file_b | grep -v ';\{5\}$'

If the number of columns is 93 as now specified in the question, the command would be

paste -d ';' file_a file_b | grep -v ';\{93\}$'

Edit2:
You can also get the required number of semicolons from the first line of file_b:

SEMICOLONS=$(head -n 1 file_b | sed 's/[^;]*//g')
paste -d ';' file_a file_b | grep -v ";$SEMICOLONS"'$'

or, combined into a single command:

paste -d ';' file_a file_b | grep -v ";$(head -n 1 file_b | sed 's/[^;]*//g')"'$'
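If counting semicolons feels brittle, a possible alternative (a variation of my own, not from the answer) is to let awk inspect the second half of each pasted row. This sketch assumes, as the question states, that both files have the same number of columns in every row; `half`, `tmp` and `result` are illustrative names.

```shell
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' > "$tmp/file_a"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' 'D;D;D;D;D' > "$tmp/file_b"

result=$(paste -d ';' "$tmp/file_a" "$tmp/file_b" | awk -F ';' '
{
    half = NF / 2                      # assumes both files contribute the same number of columns
    for (i = half + 1; i <= NF; i++)
        if ($i != "") { print; next }  # a non-empty file_b field: keep the row
}')

printf '%s\n' "$result"
rm -rf "$tmp"
```

The column count is derived from each row, so the same command works unchanged for 5 or 93 columns.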

answered Oct 20 '22 00:10

Bodo
