I have some output like this from ls -alth:
drwxr-xr-x 5 root admin 170B Aug 3 2016 ..
drwxr-xr-x 5 root admin 70B Aug 3 2016 ..
drwxr-xr-x 5 root admin 3B Aug 3 2016 ..
drwxr-xr-x 5 root admin 9M Aug 3 2016 ..
Now, I want to parse out the 170B part, which is obviously the size in human-readable format. I wanted to do this using cut or sed, because I don't want to use tools that are any more complicated or difficult to use than necessary.
Ideally I want it to be robust enough to handle the B, K, or M suffix that comes with the size, and multiply by 1, 1000, or 1000000 accordingly. I haven't found a good way to do that, though.
I've tried a few things without really knowing the best approach:
ls -alth | cut -f 5 -d \s+
I was hoping that would work because I'd be able to just delimit it on one or more spaces. But that doesn't work. How do I supply cut with a regex delimiter? Or is there an easier way to extract only the size of the file from ls -alth?
I'm using CentOS 6.4.
The cut command is a command-line utility that allows you to cut out sections of a specified file or piped data and print the result to standard output. The command cuts parts of a line by field, delimiter, byte position, and character.
Regular expressions are used by several different Unix commands, including ed, sed, awk, grep, tr, and, to a more limited extent, vi.
This answer tackles the question as asked, but consider George Vasiliou's helpful find solution as a potentially superior alternative.
cut only supports a single, literal character as the delimiter (-d), so it isn't the right tool to use.
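If cut is still preferred, one common workaround is to collapse each run of spaces into a single space with tr -s before cutting. The following is a minimal sketch rather than a recommended solution: the sample lines are adapted from the question (with uneven spacing added to mimic real ls output), and extract_size_field is a hypothetical helper name introduced only for illustration.

```shell
# Sample lines adapted from the question; the uneven spacing is an assumption
# added here to mimic the column alignment that real ls output has.
sample='drwxr-xr-x  5 root  admin   170B Aug  3  2016 ..
drwxr-xr-x  5 root  admin     9M Aug  3  2016 ..'

# extract_size_field (hypothetical helper): squeeze repeated spaces into one,
# then let cut split on that single literal space and keep field 5.
extract_size_field() {
  tr -s ' ' | cut -d ' ' -f 5
}

printf '%s\n' "$sample" | extract_size_field
```

With real input the same pipeline would be ls -alth | tr -s ' ' | cut -d ' ' -f 5. It relies on lines not starting with a space, which holds for ls -l style output.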
For extracting tokens (fields) that are separated by a variable amount of whitespace per line, awk is the best tool, so the solution proposed by George Vasiliou is the simplest one:
ls -alth | awk '{print $5}'
This extracts the 5th whitespace-separated field ($5), which is the size.
Rather than use -h first and then convert the human-readable suffixes (such as B, M, and G) back to raw byte counts (incidentally, the multipliers must be powers of 1024, not 1000), simply omit -h from the ls command, which then outputs the raw byte counts by default:
ls -alt | awk '{print $5}'
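Should only the human-readable values be available (for example, from saved output), the suffixes can be converted back in awk. This is a rough sketch under stated assumptions: to_bytes is a hypothetical helper name, only the B, K, M, and G suffixes are handled, a bare number with no suffix is treated as bytes, and the multipliers are powers of 1024. Because -h rounds the displayed value, the result is only an approximation of the true size.

```shell
# to_bytes (hypothetical helper): reads one size per line on stdin,
# e.g. 170B, 9M, 1.5K, and prints an approximate byte count.
to_bytes() {
  awk '
    {
      size = $1
      unit = substr(size, length(size), 1)        # last character: B, K, M, G or a digit
      if (unit ~ /[0-9]/) { print size + 0; next } # no suffix: already bytes
      value = substr(size, 1, length(size) - 1) + 0
      if      (unit == "K") value *= 1024
      else if (unit == "M") value *= 1024 * 1024
      else if (unit == "G") value *= 1024 * 1024 * 1024
      printf "%.0f\n", value                       # B and unknown suffixes keep the value as-is
    }'
}

# The sizes from the sample listing in the question
printf '%s\n' 170B 70B 3B 9M | to_bytes
```

Against live output it could be chained as ls -alth | awk '{print $5}' | to_bytes, though omitting -h remains the simpler and exact route.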