
Organising Data

I have a data file that looks like this:

chr1 762440 762981 SAMD11 
chr1 858932 859148 KLHL17 SAMD11 NOC2L 
chr1 859786 860145 KLHL17 SAMD11 NOC2L
chr1 890663 891747 KLHL17 NOC2L  SAMD11  HES4 

I want to put each name on its own line, together with the values from the first three columns.

Something like this:

chr1 762440 762981 SAMD11 
chr1 858932 859148 KLHL17
chr1 858932 859148 SAMD11 
chr1 858932 859148 NOC2L 
chr1 859786 860145 KLHL17 
chr1 859786 860145 SAMD11 
chr1 859786 860145 NOC2L

This output covers only the first three lines, but I need it for the entire file.

The number of names per line is not fixed (it can be 1, 5, 10, or 20 names), so please keep that in mind.

What I thought

My idea was to use sed -i .bak to place each name on its own line along with the values from the first three columns, but in the end it became overly complicated.

Could you suggest a simpler way to do this?

Thank you

Angelo asked Feb 20 '26 09:02


2 Answers

Using awk, loop over every field from the fourth onward and print it after the first three columns:

awk '{for (i=4;i<=NF;i++) print $1,$2,$3,$i}' file
chr1 762440 762981 SAMD11
chr1 858932 859148 KLHL17
chr1 858932 859148 SAMD11
chr1 858932 859148 NOC2L
chr1 859786 860145 KLHL17
chr1 859786 860145 SAMD11
chr1 859786 860145 NOC2L
chr1 890663 891747 KLHL17
chr1 890663 891747 NOC2L
chr1 890663 891747 SAMD11
chr1 890663 891747 HES4
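
The one-liner copes with a varying number of names because awk's default field splitting treats any run of blanks as one separator and ignores trailing blanks, so NF is simply the number of fields actually present on each line. As a rough sketch, the same command can be wrapped in a shell function that reads standard input; the function name expand_names is an assumption made up for this illustration and is not part of the answer above.

```shell
# expand_names: hypothetical helper name, used only for illustration.
# Reads whitespace-separated lines on stdin and prints one line per name
# (fields 4..NF), each prefixed with the first three fields.
expand_names() {
  awk '{ for (i = 4; i <= NF; i++) print $1, $2, $3, $i }'
}

# Try it on two of the input lines from the question,
# including the one with double spaces and a trailing space.
printf '%s\n' \
  'chr1 762440 762981 SAMD11 ' \
  'chr1 890663 891747 KLHL17 NOC2L  SAMD11  HES4 ' | expand_names
```

Since this writes to standard output, redirecting it to a new file leaves the original data untouched, which avoids the need for in-place editing.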
Jotne answered Feb 22 '26 22:02


Here's how I'd do it in Perl:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

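# Read each record from the DATA section below.
while (<DATA>) {
  chomp;
  # Split on whitespace; leading and trailing blanks are ignored.
  my @line = split;
  # Print the first three columns once for every remaining name.
  for my $field (@line[3 .. $#line]) {
    say "@line[0 .. 2] $field";
  }
}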

__END__
chr1 762440 762981 SAMD11 
chr1 858932 859148 KLHL17 SAMD11 NOC2L 
chr1 859786 860145 KLHL17 SAMD11 NOC2L
chr1 890663 891747 KLHL17 NOC2L  SAMD11  HES4 
Dave Cross answered Feb 22 '26 21:02


