How can I delete the lines (rows) and columns of a text file that contain only zeros? For example, I have a file:
1 0 1 0 1
0 0 0 0 0
1 1 1 0 1
0 1 1 0 1
1 1 0 0 0
0 0 0 0 0
0 0 1 0 1
I want to delete the 2nd and 6th lines and also the 4th column, since those are the ones that contain only zeros. The output should look like:
1 0 1 1
1 1 1 1
0 1 1 1
1 1 0 0
0 0 1 1
I can delete the all-zero lines using sed or egrep:
sed '/0 0 0 0/d' or egrep -v '^(0 0 0 0 )$'
but that would be too inconvenient for files with thousands of columns. I also have no idea how to remove a column that is all zeros (the 4th column here).
(1) Select the Entire row option in the Selection type section. (2) Select Equals in the first Specific type drop-down list, then enter number 0 into the text box. (3) Click the OK button.
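For example, if we have an R data frame called df, we can remove the rows that contain at least one 0 with the command df[apply(df, 1, function(x) all(x != 0)), ]. Note that this drops any row containing a zero; to drop only the rows that are entirely zero, the condition would be any(x != 0) instead.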
To remove the all-zero rows, you can sum the absolute values of each row (to avoid getting a zero sum from a mix of negative and positive numbers), which gives you a column vector of row sums. Then keep the index of each row whose sum is non-zero.
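The row-sum recipe above can be sketched in plain Python. This is only an illustration, not code from any answer in this thread; the function name `drop_zero_rows` and the sample matrix are assumptions made for the example.

```python
def drop_zero_rows(matrix):
    """Keep only rows whose sum of absolute values is non-zero."""
    # Summing absolute values avoids a false zero from rows such as [1, -1].
    row_sums = [sum(abs(x) for x in row) for row in matrix]
    # Indices of the rows that have at least one non-zero entry.
    keep = [i for i, s in enumerate(row_sums) if s != 0]
    return [matrix[i] for i in keep]


rows = [[1, 0, -1], [0, 0, 0], [0, 2, 0]]
print(drop_zero_rows(rows))  # prints [[1, 0, -1], [0, 2, 0]]
```

The first row survives even though its plain sum is zero, which is the case the absolute values are there to handle.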
Perl solution. It keeps all the non-zero lines in memory and prints them at the end, because it cannot tell which columns will be non-zero until it has processed the whole file. If you get "Out of memory" errors, you can instead store only the numbers of the lines you want to output, and process the file a second time to print them.
#!/usr/bin/perl
use warnings;
use strict;
my @nonzero; # Which columns were not zero.
my @output; # The whole table for output.
while (<>) {
next unless /[1-9]/; # Skip all-zero lines (works for values other than 1, too).
my @col = split;
$col[$_] and $nonzero[$_] ||= 1 for 0 .. $#col;
push @output, \@col;
}
my @columns = grep $nonzero[$_], 0 .. $#nonzero; # What columns to output.
for my $line (@output) {
print "@{$line}[@columns]\n";
}
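The low-memory fallback mentioned with the Perl solution (remember only which lines to print, then read the file a second time) could look roughly like this in Python. It is a sketch under assumptions: the function name `strip_zero_rows_and_columns` is made up for illustration, fields are assumed to be whitespace-separated numbers, and the demo writes the example from the question to a temporary file.

```python
import os
import sys
import tempfile


def strip_zero_rows_and_columns(path, out=sys.stdout):
    keep_rows = set()  # 0-based line numbers with at least one non-zero field
    keep_cols = set()  # 0-based column indices with at least one non-zero field

    # Pass 1: record which rows and columns contain a non-zero value.
    with open(path) as fh:
        for lineno, line in enumerate(fh):
            for col, field in enumerate(line.split()):
                if float(field) != 0:
                    keep_rows.add(lineno)
                    keep_cols.add(col)

    # Pass 2: reread the file and print only the surviving rows and columns.
    with open(path) as fh:
        for lineno, line in enumerate(fh):
            if lineno in keep_rows:
                fields = line.split()
                kept = [f for col, f in enumerate(fields) if col in keep_cols]
                print(" ".join(kept), file=out)


# Demo on the example from the question, written to a temporary file.
example = (
    "1 0 1 0 1\n0 0 0 0 0\n1 1 1 0 1\n0 1 1 0 1\n"
    "1 1 0 0 0\n0 0 0 0 0\n0 0 1 0 1\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(example)
strip_zero_rows_and_columns(tmp.name)
os.remove(tmp.name)
```

Only the two index sets are held in memory, so the cost grows with the number of rows and columns rather than with the size of the file.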
Rather than storing lines in memory, this awk version scans the file twice: once to find the "zero columns", and again to skip the "zero rows" and print the output:
awk '
NR==1 {for (i=1; i<=NF; i++) if ($i == 0) zerocol[i]=1; next}
NR==FNR {for (idx in zerocol) if ($idx) delete zerocol[idx]; next}
{p=0; for (i=1; i<=NF; i++) if ($i) {p++; break}}
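p {
  sep = ""  # print the separator before each field except the first, so no trailing OFS is left
  for (i=1; i<=NF; i++) if (!(i in zerocol)) {printf "%s%s", sep, $i; sep = OFS}
  print ""
}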
' file file
For the example file, this prints:
1 0 1 1
1 1 1 1
0 1 1 1
1 1 0 0
0 0 1 1
A Ruby program: Ruby arrays have a handy transpose method, so removing the zero columns reduces to removing the zero rows of the transposed matrix.
#!/usr/bin/ruby
def remove_zeros(m)
m.select {|row| row.any? {|elem| elem != 0}}
end
matrix = File.readlines(ARGV[0]).map {|line| line.split.map {|elem| elem.to_i}}
# remove zero rows
matrix = remove_zeros(matrix)
# remove zero rows from the transposed matrix, then re-transpose the result
matrix = remove_zeros(matrix.transpose).transpose
matrix.each {|row| puts row.join(" ")}
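For comparison, the same transpose trick can be sketched in Python, where `zip(*matrix)` plays the role of Ruby's transpose. The helper names `remove_zero_rows` and `transpose` and the small inline matrix are assumptions for illustration, not part of the answer above.

```python
def remove_zero_rows(matrix):
    """Drop rows that contain only zeros."""
    return [row for row in matrix if any(x != 0 for x in row)]


def transpose(matrix):
    """Swap rows and columns; zip(*matrix) yields the columns as tuples."""
    return [list(col) for col in zip(*matrix)]


matrix = [
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1],
]
# Remove zero rows.
matrix = remove_zero_rows(matrix)
# Remove zero rows from the transposed matrix, then re-transpose the result.
matrix = transpose(remove_zero_rows(transpose(matrix)))
for row in matrix:
    print(" ".join(map(str, row)))
```

On this three-line input the sketch prints `1 0 1 1` and `1 1 1 1`.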