When I type ls I get:
aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_albimanus_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_arabiensis_upstream_dremeready_all_simpleMasked_random.fasta
anopheles_stephensi_upstream_dremeready_all_simpleMasked_random.fasta
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta
I want to pipe this into cut (or via some alternative way) so that I only get:
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
If cut accepted a string (multiple characters) as its delimiter, I could use:
cut -d "_upstream_" -f1
But that is not permitted, as cut only accepts a single character as its delimiter.
awk does allow a string as the delimiter:
$ awk -F"_upstream_" '{print $1}' file
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
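Since the question starts from ls rather than from a file, here is a minimal sketch of the same awk call reading the listing through a pipe. The scratch directory and the two empty files are assumptions made up for the demonstration; only the awk command itself comes from the answer above.

```shell
# Scratch directory with two empty files named like those in the question
# (the directory and files are created only for this demonstration).
tmpdir=$(mktemp -d)
touch "$tmpdir/aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta" \
      "$tmpdir/culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta"

# Pipe the listing straight into awk; -F sets the multi-character field separator.
species=$(ls "$tmpdir" | awk -F'_upstream_' '{print $1}')
printf '%s\n' "$species"

# Remove the scratch directory again.
rm -r "$tmpdir"
```

Parsing ls output like this is fine for simple names such as these, but it can misbehave if file names contain newlines.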
Note that for the given input you can also use cut with _ as the delimiter and print the first two fields:
$ cut -d'_' -f-2 file
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
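The cut approach relies on every name having exactly two underscore-separated parts before _upstream_. A small sketch of where the two commands diverge, using a hypothetical three-part name that does not appear in the question:

```shell
# The three-part name below is hypothetical, made up to show the difference.
name='anopheles_gambiae_complex_upstream_dremeready_all_simpleMasked_random.fasta'

# cut keeps only the first two "_"-separated fields.
by_cut=$(printf '%s\n' "$name" | cut -d'_' -f-2)

# awk splits on the whole "_upstream_" string and keeps everything before it.
by_awk=$(printf '%s\n' "$name" | awk -F'_upstream_' '{print $1}')

printf 'cut: %s\nawk: %s\n' "$by_cut" "$by_awk"
```

Here cut yields anopheles_gambiae while awk yields anopheles_gambiae_complex, so the awk form is the safer choice if the prefixes vary in length.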
sed and grep can also do it. For example, this grep (GNU grep, since it needs -P) uses a look-ahead to print everything from the beginning of the line up to, but not including, _upstream:
$ grep -Po '^\w*(?=_upstream)' file
aedes_aegypti
anopheles_albimanus
anopheles_arabiensis
anopheles_stephensi
culex_quinquefasciatus
drosophila_melanogaster
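The sed route mentioned above is not spelled out, so here is one possible sketch. It assumes every line contains _upstream_ and feeds two of the names from the question through printf instead of reading the real file.

```shell
# Two sample names taken from the question; in practice they would come from ls or a file.
names='aedes_aegypti_upstream_dremeready_all_simpleMasked_random.fasta
culex_quinquefasciatus_upstream_dremeready_all_simpleMasked_random.fasta'

# Delete everything from the first "_upstream_" to the end of each line.
result=$(printf '%s\n' "$names" | sed 's/_upstream_.*//')
printf '%s\n' "$result"
```

Lines without _upstream_ would pass through unchanged with this substitution, which may or may not be what you want.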