Good Afternoon, i am trying to count the number of times the letters A C T G occur in DNA sequence using perl6.i have tried other ways i am just trying to get it done in another way. Here are some of the code i came up with
use v6;
my $default-input = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";
sub MAIN(Str $input = $default-input)
{
say "{bag($input.comb)<A C G T>}";
}
use v6;
my $default-input = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";
sub MAIN($input = $default-input)
{
"{<A C G T>.map({ +$input.comb(/$_/) })}".say;
Sample Dataset
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
multi sub MAIN ( \DNA ) {
my Int %bag = A => 0, C => 0, G => 0, T => 0;
# doesn't keep the whole thing in memory
# like .comb.Bag would have
for DNA.comb {
%bag{$_}++
}
.say for %bag<A C G T> :p;
}
multi sub MAIN ( 'example' ){
samewith "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"
}
multi sub MAIN ( Bool :STDIN($)! ){
samewith $*IN
}
multi sub MAIN ( Str :filename(:$file)! where .IO.f ){
samewith $file.IO
}
~$ ./test.p6
Usage:
./test.p6 <DNA>
./test.p6 example
./test.p6 --STDIN
./test.p6 --filename|--file=<Str>
~$ ./test.p6 example
A => 20
C => 12
G => 17
T => 21
~$ ./test.p6 --STDIN < test.in
A => 20
C => 12
G => 17
T => 21
~$ ./test.p6 --file=test.in
A => 20
C => 12
G => 17
T => 21
Another way is to use the BioInfo modules I'm working on which have a coercion to Bag already for you :)
use v6;
use BioInfo;
my @sequences = `
>seqid
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
`;
for @sequences -> $seq {
say $seq.Bag;
}
In the above code you are importing a special bioinformatics Slang which understands that string literals between `` are FASTA literals. DNA/RNA/Amino-acids are automatically detected and you get a specific class for that. The object has its own .Bag that does what you want. Other than my own modules there is also the BioPerl6 project.
If you want to read from file then the following should work for you:
use v6;
use BioInfo::Parser::FASTA;
use BioInfo::IO::FileParser;
#Spawn an IO thread that parses the file and creates BioInfo::Seq objects on .get
my $seq_file = BioInfo::IO::FileParser.new(file => 'myseqs.fa', parser => BioInfo::Parser::FASTA);
#Print the residue counts per file
while my $seq = $seq_file.get() {
say $seq.Bag;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With