A function to return human readable size from bytes size:
>>> human_readable(2048)
'2 kilobytes'
>>>
How to do this?
Using ls to Show File Size in Human-Readable Format Using the -h flag shows the total size of files and directories and the individual size of each file and directory in a human-readable format. You can also specify the block size for displaying the file size. By default, the file size is in bytes.
It's an object which will store the file size in bytes as an integer. But when you try to print the object, you automatically get a human readable version.
One such library is hurry.filesize.
>>> from hurry.filesize import alternative
>>> size(1, system=alternative)
'1 byte'
>>> size(10, system=alternative)
'10 bytes'
>>> size(1024, system=alternative)
'1 KB'
A library that has all the functionality that it seems you're looking for is humanize
. humanize.naturalsize()
seems to do everything you're looking for.
Example code (python 3.10
)
import humanize
disk_sizes_list = [1, 100, 999, 1000,1024, 2000,2048, 3000, 9999, 10000, 2048000000, 9990000000, 9000000000000000000000]
for size in disk_sizes_list:
natural_size = humanize.naturalsize(size)
binary_size = humanize.naturalsize(size, binary=True)
print(f" {natural_size} \t| {binary_size}\t|{size}")
Output
1 Byte | 1 Byte |1
100 Bytes | 100 Bytes |100
999 Bytes | 999 Bytes |999
1.0 kB | 1000 Bytes |1000
1.0 kB | 1.0 KiB |1024
2.0 kB | 2.0 KiB |2000
2.0 kB | 2.0 KiB |2048
3.0 kB | 2.9 KiB |3000
10.0 kB | 9.8 KiB |9999
10.0 kB | 9.8 KiB |10000
2.0 GB | 1.9 GiB |2048000000
10.0 GB | 9.3 GiB |9990000000
9.0 ZB | 7.6 ZiB |9000000000000000000000
There's always got to be one of those guys. Well today it's me. Here's a one-liner -- or two lines if you count the function signature.
def human_size(bytes, units=[' bytes','KB','MB','GB','TB', 'PB', 'EB']):
""" Returns a human readable string representation of bytes """
return str(bytes) + units[0] if bytes < 1024 else human_size(bytes>>10, units[1:])
>>> human_size(123)
123 bytes
>>> human_size(123456789)
117GB
If you need sizes bigger than an Exabyte, it's a little bit more gnarly:
def human_size(bytes, units=[' bytes','KB','MB','GB','TB', 'PB', 'EB']):
return str(bytes) + units[0] if bytes < 1024 else human_size(bytes>>10, units[1:]) if units[1:] else f'{bytes>>10}ZB'
Here's my version. It does not use a for-loop. It has constant complexity, O(1), and is in theory more efficient than the answers here that use a for-loop.
from math import log
unit_list = zip(['bytes', 'kB', 'MB', 'GB', 'TB', 'PB'], [0, 0, 1, 2, 2, 2])
def sizeof_fmt(num):
"""Human friendly file size"""
if num > 1:
exponent = min(int(log(num, 1024)), len(unit_list) - 1)
quotient = float(num) / 1024**exponent
unit, num_decimals = unit_list[exponent]
format_string = '{:.%sf} {}' % (num_decimals)
return format_string.format(quotient, unit)
if num == 0:
return '0 bytes'
if num == 1:
return '1 byte'
To make it more clear what is going on, we can omit the code for the string formatting. Here are the lines that actually do the work:
exponent = int(log(num, 1024))
quotient = num / 1024**exponent
unit_list[exponent]
I recently came up with a version that avoids loops, using log2
to determine the size order which doubles as a shift and an index into the suffix list:
from math import log2
_suffixes = ['bytes', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']
def file_size(size):
# determine binary order in steps of size 10
# (coerce to int, // still returns a float)
order = int(log2(size) / 10) if size else 0
# format file size
# (.4g results in rounded numbers for exact matches and max 3 decimals,
# should never resort to exponent values)
return '{:.4g} {}'.format(size / (1 << (order * 10)), _suffixes[order])
Could well be considered unpythonic for its readability, though.
If you're using Django installed you can also try filesizeformat:
from django.template.defaultfilters import filesizeformat
filesizeformat(1073741824)
=>
"1.0 GB"
You should use "humanize".
>>> humanize.naturalsize(1000000)
'1.0 MB'
>>> humanize.naturalsize(1000000, binary=True)
'976.6 KiB'
>>> humanize.naturalsize(1000000, gnu=True)
'976.6K'
Reference:
https://pypi.org/project/humanize/
Using either powers of 1000 or kibibytes would be more standard-friendly:
def sizeof_fmt(num, use_kibibyte=True):
base, suffix = [(1000.,'B'),(1024.,'iB')][use_kibibyte]
for x in ['B'] + map(lambda x: x+suffix, list('kMGTP')):
if -base < num < base:
return "%3.1f %s" % (num, x)
num /= base
return "%3.1f %s" % (num, x)
P.S. Never trust a library that prints thousands with the K (uppercase) suffix :)
The HumanFriendly project helps with this.
import humanfriendly
humanfriendly.format_size(1024)
The above code will give 1KB as answer.
Examples can be found here.
Riffing on the snippet provided as an alternative to hurry.filesize(), here is a snippet that gives varying precision numbers based on the prefix used. It isn't as terse as some snippets, but I like the results.
def human_size(size_bytes):
"""
format a size in bytes into a 'human' file size, e.g. bytes, KB, MB, GB, TB, PB
Note that bytes/KB will be reported in whole numbers but MB and above will have greater precision
e.g. 1 byte, 43 bytes, 443 KB, 4.3 MB, 4.43 GB, etc
"""
if size_bytes == 1:
# because I really hate unnecessary plurals
return "1 byte"
suffixes_table = [('bytes',0),('KB',0),('MB',1),('GB',2),('TB',2), ('PB',2)]
num = float(size_bytes)
for suffix, precision in suffixes_table:
if num < 1024.0:
break
num /= 1024.0
if precision == 0:
formatted_size = "%d" % num
else:
formatted_size = str(round(num, ndigits=precision))
return "%s %s" % (formatted_size, suffix)
This will do what you need in almost any situation, is customizable with optional arguments, and as you can see, is pretty much self-documenting:
from math import log
def pretty_size(n,pow=0,b=1024,u='B',pre=['']+[p+'i'for p in'KMGTPEZY']):
pow,n=min(int(log(max(n*b**pow,1),b)),len(pre)-1),n*b**pow
return "%%.%if %%s%%s"%abs(pow%(-pow-1))%(n/b**float(pow),pre[pow],u)
Example output:
>>> pretty_size(42)
'42 B'
>>> pretty_size(2015)
'2.0 KiB'
>>> pretty_size(987654321)
'941.9 MiB'
>>> pretty_size(9876543210)
'9.2 GiB'
>>> pretty_size(0.5,pow=1)
'512 B'
>>> pretty_size(0)
'0 B'
Advanced customizations:
>>> pretty_size(987654321,b=1000,u='bytes',pre=['','kilo','mega','giga'])
'987.7 megabytes'
>>> pretty_size(9876543210,b=1000,u='bytes',pre=['','kilo','mega','giga'])
'9.9 gigabytes'
This code is both Python 2 and Python 3 compatible. PEP8 compliance is an exercise for the reader. Remember, it's the output that's pretty.
Update:
If you need thousands commas, just apply the obvious extension:
def prettier_size(n,pow=0,b=1024,u='B',pre=['']+[p+'i'for p in'KMGTPEZY']):
r,f=min(int(log(max(n*b**pow,1),b)),len(pre)-1),'{:,.%if} %s%s'
return (f%(abs(r%(-r-1)),pre[r],u)).format(n*b**pow/b**float(r))
For example:
>>> pretty_units(987654321098765432109876543210)
'816,968.5 YiB'
I like the fixed precision of senderle's decimal version, so here's a sort of hybrid of that with joctee's answer above (did you know you could take logs with non-integer bases?):
from math import log
def human_readable_bytes(x):
# hybrid of https://stackoverflow.com/a/10171475/2595465
# with https://stackoverflow.com/a/5414105/2595465
if x == 0: return '0'
magnitude = int(log(abs(x),10.24))
if magnitude > 16:
format_str = '%iP'
denominator_mag = 15
else:
float_fmt = '%2.1f' if magnitude % 3 == 1 else '%1.2f'
illion = (magnitude + 1) // 3
format_str = float_fmt + ['', 'K', 'M', 'G', 'T', 'P'][illion]
return (format_str % (x * 1.0 / (1024 ** illion))).lstrip('0')
How about a simple 2 liner:
def humanizeFileSize(filesize):
p = int(math.floor(math.log(filesize, 2)/10))
return "%.3f%s" % (filesize/math.pow(1024,p), ['B','KiB','MiB','GiB','TiB','PiB','EiB','ZiB','YiB'][p])
Here is how it works under the hood:
Kb
, so the answer should be X KiB)file_size/value_of_closest_unit
along with unit.It however doesn't work if filesize is 0 or negative (because log is undefined for 0 and -ve numbers). You can add extra checks for them:
def humanizeFileSize(filesize):
filesize = abs(filesize)
if (filesize==0):
return "0 Bytes"
p = int(math.floor(math.log(filesize, 2)/10))
return "%0.2f %s" % (filesize/math.pow(1024,p), ['Bytes','KiB','MiB','GiB','TiB','PiB','EiB','ZiB','YiB'][p])
Examples:
>>> humanizeFileSize(538244835492574234)
'478.06 PiB'
>>> humanizeFileSize(-924372537)
'881.55 MiB'
>>> humanizeFileSize(0)
'0 Bytes'
NOTE - There is a difference between Kb and KiB. KB means 1000 bytes, whereas KiB means 1024 bytes. KB,MB,GB are all multiples of 1000, whereas KiB, MiB, GiB etc are all multiples of 1024. More about it here
Drawing from all the previous answers, here is my take on it. It's an object which will store the file size in bytes as an integer. But when you try to print the object, you automatically get a human readable version.
class Filesize(object):
"""
Container for a size in bytes with a human readable representation
Use it like this::
>>> size = Filesize(123123123)
>>> print size
'117.4 MB'
"""
chunk = 1024
units = ['bytes', 'KB', 'MB', 'GB', 'TB', 'PB']
precisions = [0, 0, 1, 2, 2, 2]
def __init__(self, size):
self.size = size
def __int__(self):
return self.size
def __str__(self):
if self.size == 0: return '0 bytes'
from math import log
unit = self.units[min(int(log(self.size, self.chunk)), len(self.units) - 1)]
return self.format(unit)
def format(self, unit):
if unit not in self.units: raise Exception("Not a valid file size unit: %s" % unit)
if self.size == 1 and unit == 'bytes': return '1 byte'
exponent = self.units.index(unit)
quotient = float(self.size) / self.chunk**exponent
precision = self.precisions[exponent]
format_string = '{:.%sf} {}' % (precision)
return format_string.format(quotient, unit)
Modern Django have self template tag filesizeformat
:
Formats the value like a human-readable
file size (i.e. '13 KB', '4.1 MB', '102 bytes', etc.).
For example:
{{ value|filesizeformat }}
If value is 123456789, the output would be 117.7 MB.
More info: https://docs.djangoproject.com/en/1.10/ref/templates/builtins/#filesizeformat
What you're about to find below is by no means the most performant or shortest solution among the ones already posted. Instead, it focuses on one particular issue that many of the other answers miss.
Namely the case when input like 999_995
is given:
Python 3.6.1 ...
...
>>> value = 999_995
>>> base = 1000
>>> math.log(value, base)
1.999999276174054
which, being truncated to the nearest integer and applied back to the input gives
>>> order = int(math.log(value, base))
>>> value/base**order
999.995
This seems to be exactly what we'd expect until we're required to control output precision. And this is when things start to get a bit difficult.
With the precision set to 2 digits we get:
>>> round(value/base**order, 2)
1000 # K
instead of 1M
.
How can we counter that?
Of course, we can check for it explicitly:
if round(value/base**order, 2) == base:
order += 1
But can we do better? Can we get to know which way the order
should be cut before we do the final step?
It turns out we can.
Assuming 0.5 decimal rounding rule, the above if
condition translates into:
resulting in
def abbreviate(value, base=1000, precision=2, suffixes=None):
if suffixes is None:
suffixes = ['', 'K', 'M', 'B', 'T']
if value == 0:
return f'{0}{suffixes[0]}'
order_max = len(suffixes) - 1
order = log(abs(value), base)
order_corr = order - int(order) >= log(base - 0.5/10**precision, base)
order = min(int(order) + order_corr, order_max)
factored = round(value/base**order, precision)
return f'{factored:,g}{suffixes[order]}'
giving
>>> abbreviate(999_994)
'999.99K'
>>> abbreviate(999_995)
'1M'
>>> abbreviate(999_995, precision=3)
'999.995K'
>>> abbreviate(2042, base=1024)
'1.99K'
>>> abbreviate(2043, base=1024)
'2K'
To get the file size in a human readable form, I created this function:
import os
def get_size(path):
size = os.path.getsize(path)
if size < 1024:
return f"{size} bytes"
elif size < 1024*1024:
return f"{round(size/1024, 2)} KB"
elif size < 1024*1024*1024:
return f"{round(size/(1024*1024), 2)} MB"
elif size < 1024*1024*1024*1024:
return f"{round(size/(1024*1024*1024), 2)} GB"
>>> get_size("a.txt")
1.4KB
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With