I have a file and its name looks like:
12U12345._L001_R1_001.fastq.gz
I want to assign just the 12U12345 part to a variable.
So far I have:
variable=`basename $fastq | sed {s'/_S[0-9]*_L001_R1_001.fastq.gz//'}`
Note: $fastq is a variable that holds the full path to the file.
This currently returns the full file name. Any ideas how to get this right?
Just use the shell's built-in parameter expansion instead of spawning a separate process:
fastq="12U12345._L001_R1_001.fastq.gz"
printf '%s\n' "${fastq%%.*}"
12U12345
Or use printf -v (a bash builtin option) to store the result in a new variable in one step:
printf -v numericPart '%s' "${fastq%%.*}"
printf '%s\n' "${numericPart}"
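The examples above set fastq to a bare file name. In the question, $fastq holds a full path, so a directory name containing a dot would make "${fastq%%.*}" cut too early. A minimal sketch that strips the directory part first, still without spawning a process, could look like this. The path and the variable names name and sample are made up for illustration:

```shell
# Hypothetical full path; the directory name deliberately contains a dot.
fastq="/data/run.2020/12U12345._L001_R1_001.fastq.gz"

# Remove the longest prefix ending in '/', i.e. what basename would do.
name="${fastq##*/}"

# Remove the longest suffix starting at a '.', leaving the leading ID.
sample="${name%%.*}"

printf '%s\n' "$sample"
```

With this path, the last line should print 12U12345.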
bash also has a built-in regular expression matching operator, =~, which you could use like this:
fastq="12U12345._L001_R1_001.fastq.gz"
regex='^([[:alnum:]]+)\.(.*)'
if [[ $fastq =~ $regex ]]; then
    numericPart="${BASH_REMATCH[1]}"
    printf '%s\n' "${numericPart}"
fi
You could use cut:
$> fastq="/path/to/12U12345._L001_R1_001.fastq.gz"
$> variable=$(basename "$fastq" | cut -d '.' -f 1)
$> echo "$variable"
12U12345
Also, please note that:
It's better to wrap your variable in double quotes. Otherwise your command won't work with filenames that contain spaces.
You should use $() instead of backticks.
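Applying both notes to the original command gives something like the sketch below. The path is hypothetical and contains a space on purpose. The sed pattern is also adjusted: with the example name from the question, the original pattern expects _S followed by digits, which that name does not contain, so sed finds no match and passes the name through unchanged.

```shell
# Hypothetical path with a space, to show why the quotes matter.
fastq="/path/to/my run/12U12345._L001_R1_001.fastq.gz"

# $() instead of backticks, "$fastq" quoted, and a sed pattern that
# matches the example name (dots escaped, anchored at the end).
variable=$(basename "$fastq" | sed 's/\._L001_R1_001\.fastq\.gz$//')

printf '%s\n' "$variable"
```

This should print 12U12345. If the real file names carry an _S<number> part before _L001, the pattern would need to account for that as well.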