bitmath

bitmath simplifies many facets of interacting with file sizes in various units. Functionality includes:

  • Converting between SI and NIST prefix units (GiB to kB)
  • Converting between units of the same type (SI to SI, or NIST to NIST)
  • Basic arithmetic operations (subtracting 42KiB from 50GiB)
  • Rich comparison operations (1024 Bytes == 1KiB)
  • Bitwise operations (<<, >>, &, |, ^)
  • Sorting
  • Automatic human-readable prefix selection (like in hurry.filesize)

In addition to the conversion and math operations, bitmath provides human-readable representations of values which are suitable for use in interactive shells as well as larger scripts and applications. The format of these representations is customizable using the standard library's str.format() formatting syntax.

In discussion we will refer to the NIST units primarily. I.e., instead of “megabyte” we will refer to “mebibyte”. The former is 10^6 = 1,000,000 bytes, whereas the latter is 2^20 = 1,048,576 bytes. When you see file sizes or transfer rates in your web browser, most of the time what you’re really seeing are the base-2 sizes/rates.
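To make the difference concrete, the following sketch computes both quantities with plain Python integers. It does not use bitmath at all; the constant names are chosen for this illustration only.

```python
# Plain-integer illustration of the two definitions above.
# MEGABYTE and MEBIBYTE are names chosen for this sketch, not bitmath API.
MEGABYTE = 10 ** 6   # SI: 1,000,000 bytes
MEBIBYTE = 2 ** 20   # NIST: 1,048,576 bytes

difference = MEBIBYTE - MEGABYTE
percent_larger = 100.0 * difference / MEGABYTE

print(MEGABYTE, MEBIBYTE, difference)
print("A mebibyte is about %.2f%% larger than a megabyte" % percent_larger)
```

The gap grows with each prefix step, which is why mixing the two systems silently can produce noticeably wrong sizes for large files.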

Don’t Forget! The source for bitmath is available on GitHub.

OH! And did we mention it has 150+ unittests? Check them out for yourself.

Examples After The TOC

Contents

The bitmath Module

Functions

This section describes utility functions included in the bitmath module.

bitmath.getsize()

bitmath.getsize(path[, bestprefix=True[, system=NIST]])

Return a bitmath instance representing the size of a file at any given path.

Parameters:
  • path (string) – The path of a file to read the size of
  • bestprefix (bool) – Default: True, the returned instance will be in the best human-readable prefix unit. If set to False the result is a bitmath.Byte instance.
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. The preferred system of units for the returned instance.

Internally bitmath.getsize() calls os.path.realpath() before calling os.path.getsize() on any paths.
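As a rough illustration of that two-step lookup, here is a standard-library sketch. The helper name size_in_bytes is hypothetical, and the code only mirrors the order of calls described above; it is not the bitmath implementation and returns a plain integer rather than a bitmath instance.

```python
import os
import tempfile

def size_in_bytes(path):
    """Resolve symlinks first, then ask the OS for the file size."""
    real_path = os.path.realpath(path)
    return os.path.getsize(real_path)

# Demonstrate on a throwaway file of known length.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"x" * 1337)
    demo_path = handle.name

demo_size = size_in_bytes(demo_path)
os.remove(demo_path)
print(demo_size)
```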

Here’s an example of where we’ll run bitmath.getsize() on the bitmath source code using the defaults for bestprefix and system:

>>> import bitmath
>>> print bitmath.getsize('./bitmath/__init__.py')
33.3583984375 KiB

Let’s say we want to see the results in bytes. We can do this by setting bestprefix to False:

>>> import bitmath
>>> print bitmath.getsize('./bitmath/__init__.py', bestprefix=False)
34159.0 Byte

Recall, the default for representation is with the best human-readable prefix. We can control the prefix system used by setting system to either bitmath.NIST (the default) or bitmath.SI:

>>> print bitmath.getsize('./bitmath/__init__.py')
33.3583984375 KiB
>>> print bitmath.getsize('./bitmath/__init__.py', system=bitmath.NIST)
33.3583984375 KiB
>>> print bitmath.getsize('./bitmath/__init__.py', system=bitmath.SI)
34.159 kB

The first two calls show that the same result is returned when system is not set and when system is set to bitmath.NIST (the default).

New in version 1.0.7.

bitmath.listdir()

bitmath.listdir(search_base[, followlinks=False[, filter='*'[, relpath=False[, bestprefix=False[, system=NIST]]]]])

This is a generator which recurses a directory tree yielding 2-tuples of:

  • The absolute/relative path to a discovered file
  • A bitmath instance representing the apparent size of the file
Parameters:
  • search_base (string) – The directory to begin walking down
  • followlinks (bool) – Default: False, do not follow links. Whether or not to follow symbolic links to directories. Setting to True enables directory link following
  • filter (string) – Default: * (everything). A glob to filter results with. See fnmatch for more details about globs
  • relpath (bool) – Default: False, returns the fully qualified path to each discovered file. True to return the relative path from the present working directory to the discovered file. If relpath is False, then bitmath.listdir() internally calls os.path.realpath() to normalize path references
  • bestprefix (bool) – Default: False, returns bitmath.Byte instances. Set to True to return the best human-readable prefix unit for representation
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. Sets the preferred prefix unit system. Requires bestprefix to be True

Note

  • This function does not return tuples for directory entities. Including directories in results is scheduled for introduction in the upcoming 1.1.0 release.
  • Symlinks to files are followed automatically

When interpreting the results from this function it is crucial to understand exactly which items are being taken into account, what decisions were made to select those items, and how their sizes are measured.

Results from this function may seem invalid when directly compared to the results from common command line utilities, such as du, or tree.

Let’s pretend we have a directory structure like the following:

some_files/
├── deeper_files/
│   └── second_file
└── first_file

Where some_files/ is a directory, and so is some_files/deeper_files/. There are two regular files in this tree:

  • some_files/first_file - 1337 Bytes
  • some_files/deeper_files/second_file - 13370 Bytes

The total size of the files in this tree is 1337 + 13370 = 14707 bytes.
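For readers who want to reproduce this tree and check the arithmetic without bitmath, here is a standard-library sketch. The generator name walk_sizes is hypothetical; it only approximates the behavior described above (regular files only, glob filtering, apparent size in bytes) and makes no claim about how bitmath.listdir() is implemented.

```python
import fnmatch
import os
import shutil
import tempfile

def walk_sizes(search_base, pattern="*", followlinks=False):
    """Yield (absolute path, size in bytes) for each matching regular file."""
    for root, _dirs, files in os.walk(search_base, followlinks=followlinks):
        for name in fnmatch.filter(files, pattern):
            path = os.path.realpath(os.path.join(root, name))
            yield path, os.path.getsize(path)

# Build the example tree: two files, one of them in a nested directory.
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "deeper_files"))
with open(os.path.join(base, "first_file"), "wb") as fh:
    fh.write(b"a" * 1337)
with open(os.path.join(base, "deeper_files", "second_file"), "wb") as fh:
    fh.write(b"b" * 13370)

sizes = sorted(size for _path, size in walk_sizes(base))
only_second = [size for _path, size in walk_sizes(base, "second*")]
shutil.rmtree(base)  # clean up the temporary tree

print(sizes, sum(sizes), only_second)
```

Directories themselves contribute nothing to the total here, which is one reason the sum can differ from what du reports.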

Let’s call bitmath.listdir() on the some_files/ directory and see what the results look like. First we’ll use all the default parameters, then we’ll set relpath to True:

>>> import bitmath
>>> for f in bitmath.listdir('./some_files'):
...     print f
...
('/tmp/tmp.P5lqtyqwPh/some_files/first_file', Byte(1337.0))
('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))
>>> for f in bitmath.listdir('./some_files', relpath=True):
...     print f
...
('some_files/first_file', Byte(1337.0))
('some_files/deeper_files/second_file', Byte(13370.0))

The first loop prints the full path of each file, whereas the second loop (with relpath set to True) prints each path relative to the present working directory.

Let’s play with the filter parameter now. Let’s say we only want to include results for files whose name begins with “second”:

>>> for f in bitmath.listdir('./some_files', filter='second*'):
...     print f
...
('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))

If we wish to avoid having to write for-loops, we can collect the results into a list rather simply:

>>> files = list(bitmath.listdir('./some_files'))
>>> print files
[('/tmp/tmp.P5lqtyqwPh/some_files/first_file', Byte(1337.0)), ('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))]

Here’s a more advanced example where we will sum the size of all the returned results and then play around with the possible formatting. Recall that a bitmath instance representing the size of the discovered file is the second item in each returned tuple.

>>> discovered_files = [f[1] for f in bitmath.listdir('./some_files')]
>>> print discovered_files
[Byte(1337.0), Byte(13370.0)]
>>> print reduce(lambda x,y: x+y, discovered_files)
14707.0 Byte
>>> print reduce(lambda x,y: x+y, discovered_files).best_prefix()
14.3623046875 KiB
>>> print reduce(lambda x,y: x+y, discovered_files).best_prefix().format("{value:.3f} {unit}")
14.362 KiB

New in version 1.0.7.

bitmath.parse_string()

bitmath.parse_string(str_repr)

Parse a string representing a unit into a proper bitmath object. All non-string inputs are rejected and will raise a ValueError. Strings without units are also rejected. See the examples below for additional clarity.

Parameters: str_repr (string) – The string to parse. May contain whitespace between the value and the unit.
Returns: A bitmath object representing str_repr
Raises ValueError: if str_repr can not be parsed

A simple usage example:

>>> import bitmath
>>> a_dvd = bitmath.parse_string("4.7 GiB")
>>> print type(a_dvd)
<class 'bitmath.GiB'>
>>> print a_dvd
4.7 GiB

Caution

Caution is advised if you are reading values from an unverified external source, such as output from a shell command or a generated file. Many applications (even /usr/bin/ls) still do not produce file size strings with valid (or even correct) prefix units.

To protect your application from unexpected runtime errors it is recommended that calls to bitmath.parse_string() are wrapped in a try statement:

>>> import bitmath
>>> try:
...     a_dvd = bitmath.parse_string("4.7 G")
... except ValueError:
...     print "Error while parsing string into bitmath object"
...
Error while parsing string into bitmath object

Here we can see some more examples of invalid input, as well as two acceptable inputs:

>>> import bitmath
>>> sizes = [ 1337, 1337.7, "1337", "1337.7", "1337 B", "1337B" ]
>>> for size in sizes:
...     try:
...         print "Parsed size into %s" % bitmath.parse_string(size).best_prefix()
...     except ValueError:
...         print "Could not parse input: %s" % size
...
Could not parse input: 1337
Could not parse input: 1337.7
Could not parse input: 1337
Could not parse input: 1337.7
Parsed size into 1.3056640625 KiB
Parsed size into 1.3056640625 KiB
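The acceptance rules shown above (strings only, a unit is mandatory, whitespace between value and unit is optional) can be approximated with a small regular expression. This is a sketch under those assumptions: split_size_string and its pattern are hypothetical, it returns a plain (value, unit) tuple, and unlike bitmath.parse_string() it does not check that the unit names a real bitmath class.

```python
import re

# Hypothetical pattern: a number, optional whitespace, then a unit name.
_SIZE_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([A-Za-z]+)\s*$")

def split_size_string(text):
    """Return (value, unit) or raise ValueError, mirroring the rules above."""
    if not isinstance(text, str):
        raise ValueError("input must be a string: %r" % (text,))
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError("no unit found in %r" % text)
    return float(match.group(1)), match.group(2)

for candidate in [1337, "1337", "1337 B", "1337B", "4.7 GiB"]:
    try:
        print(split_size_string(candidate))
    except ValueError as error:
        print("rejected:", error)
```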

New in version 1.1.0.

Context Managers

This section describes all of the context managers provided by the bitmath module.

Note

For a bit of background, a context manager (specifically, the with statement) is a feature of the Python language which is commonly used to:

  • Decorate, or wrap, an arbitrary block of code. I.e., apply a certain condition to a specific body of code
  • Automatically open and close an object which is used in a specific context. I.e., handle set-up and tear-down of objects in the place they are used.
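As a generic illustration of the first use, the sketch below temporarily overrides a setting for the duration of a with block and restores it afterwards. The settings dictionary and the temporary_setting helper are hypothetical stand-ins written with contextlib; they are not bitmath names.

```python
import contextlib

# A stand-in for a module-level setting; not a bitmath variable.
settings = {"plural": False}

@contextlib.contextmanager
def temporary_setting(name, value):
    """Set settings[name] for the duration of a with block, then restore it."""
    previous = settings[name]
    settings[name] = value
    try:
        yield
    finally:
        # Runs even if the body raises, so the old value always comes back.
        settings[name] = previous

before = settings["plural"]
with temporary_setting("plural", True):
    inside = settings["plural"]
after = settings["plural"]
print(before, inside, after)
```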

See also

PEP 343
The “with” Statement
PEP 318
Decorators for Functions and Methods

bitmath.format()

bitmath.format([fmt_str=None[, plural=False[, bestprefix=False]]])

The bitmath.format() context manager allows you to specify the string representation of all bitmath instances within a specific block of code.

This is effectively equivalent to applying the format() method to an entire region of code.

Parameters:
  • fmt_str (str) – a formatting string compatible with the format specification mini-language. See the instance attributes for a list of available items.
  • plural (bool) – True enables printing instances with a trailing ‘s’ if they’re plural. False (default) prints them as singular (no trailing ‘s’)
  • bestprefix (bool) – True enables printing instances in their best human-readable representation. False, the default, prints instances using their current prefix unit.

Note

The bestprefix parameter is not yet implemented!

Let’s look at an example of toggling pluralization on and off. First we’ll look over a demonstration script (below), and then we’ll review the output.

import bitmath

a_single_bit = bitmath.Bit(1)
technically_plural_bytes = bitmath.Byte(0)
always_plural_kbs = bitmath.kb(42)

formatting_args = {
    'not_plural': a_single_bit,
    'technically_plural': technically_plural_bytes,
    'always_plural': always_plural_kbs
}

print """None of the following will be pluralized, because that feature is turned off
"""

test_string = """   One unit of 'Bit': {not_plural}

   0 of a unit is typically said pluralized in US English: {technically_plural}

   several items of a unit will always be pluralized in normal US English
   speech: {always_plural}"""

print test_string.format(**formatting_args)

print """
----------------------------------------------------------------------
"""

print """Now, we'll use the bitmath.format() context manager
to print the same test string, but with pluralization enabled.
"""

with bitmath.format(plural=True):
    print test_string.format(**formatting_args)

The context manager is demonstrated in the final two lines of the script. In these lines we use the bitmath.format() context manager, setting plural to True, to print the original string again. By doing this we have enabled pluralized string representations (where appropriate). Running this script would have the following output:

None of the following will be pluralized, because that feature is turned off

   One unit of 'Bit': 1.0 Bit

   0 of a unit is typically said pluralized in US English: 0.0 Byte

   several items of a unit will always be pluralized in normal US English
   speech: 42.0 kb

----------------------------------------------------------------------

Now, we'll use the bitmath.format() context manager
to print the same test string, but with pluralization enabled.

   One unit of 'Bit': 1.0 Bit

   0 of a unit is typically said pluralized in US English: 0.0 Bytes

   several items of a unit will always be pluralized in normal US English
   speech: 42.0 kbs

Here’s a shorter example, where we’ll:

  • Print a string containing bitmath instances using the default formatting (the first print statement)
  • Use the context manager to print the instances in scientific notation (the with block)
  • Print the string one last time to demonstrate how the formatting automatically returns to the default format (the final print statement)

>>> import bitmath
>>> print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
Some instances: 0.333333333333 KiB, 512.0 Bit
>>> with bitmath.format("{value:e}-{unit}"):
...     print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
...
Some instances: 3.333333e-01-KiB, 5.120000e+02-Bit
>>> print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
Some instances: 0.333333333333 KiB, 512.0 Bit

New in version 1.0.8.

Module Variables

This section describes the module-level variables. Some are constants used for reference; others affect output or behavior.

Changed in version 1.0.7: The formatting strings were not available for manipulation/inspection in earlier versions

Note

Modifying these variables will change the default representation indefinitely. Use the bitmath.format() context manager to limit changes to a specific block of code.

bitmath.format_string

This is the default string representation of all bitmath instances. The default value is {value} {unit} which, when evaluated, formats an instance as a floating point number with at least one digit of precision, followed by a character of whitespace, followed by the prefix unit of the instance.

For example, given bitmath instances representing the following values: 1337 MiB, 0.1234567 kb, and 0 B, their printed output would look like the following:

>>> from bitmath import *
>>> print MiB(1337), kb(0.1234567), Byte(0)
1337.0 MiB 0.1234567 kb 0.0 Byte

We can make these instances print however we want to. Let’s wrap each one in square brackets ([, ]), replace the separating space character with a hyphen (-), and limit the precision to just 2 digits:

>>> import bitmath
>>> bitmath.format_string = "[{value:.2f}-{unit}]"
>>> print bitmath.MiB(1337), bitmath.kb(0.1234567), bitmath.Byte(0)
[1337.00-MiB] [0.12-kb] [0.00-Byte]

bitmath.format_plural

A boolean which controls the pluralization of instances in string representation. The default is False.

If we wanted to enable pluralization we could set the format_plural variable to True. First, let’s look at some output using the default singular formatting.

>>> import bitmath
>>> print bitmath.MiB(1337)
1337.0 MiB

And now we’ll enable pluralization by setting format_plural to True:

>>> import bitmath
>>> bitmath.format_plural = True
>>> print bitmath.MiB(1337)
1337.0 MiBs
>>> bitmath.format_plural = False
>>> print bitmath.MiB(1337)
1337.0 MiB

After setting format_plural back to False, we see that the output has no trailing “s” character.

bitmath.NIST

Constant used as an argument to some functions to specify the NIST system.

bitmath.SI

Constant used as an argument to some functions to specify the SI system.

bitmath.SI_PREFIXES

An array of all of the SI unit prefixes (e.g., k, M, or E)

bitmath.SI_STEPS

SI_STEPS = {
    'Bit': 1 / 8.0,
    'Byte': 1,
    'k': 1000,
    'M': 1000000,
    'G': 1000000000,
    'T': 1000000000000,
    'P': 1000000000000000,
    'E': 1000000000000000000
}

bitmath.NIST_PREFIXES

An array of all of the NIST unit prefixes (e.g., Ki, Mi, or Ei)

bitmath.NIST_STEPS

NIST_STEPS = {
    'Bit': 1 / 8.0,
    'Byte': 1,
    'Ki': 1024,
    'Mi': 1048576,
    'Gi': 1073741824,
    'Ti': 1099511627776,
    'Pi': 1125899906842624,
    'Ei': 1152921504606846976
}

Classes

Initializing

class bitmath.BitMathType([value=0[, bytes=None[, bits=None]]])

The value, bytes, and bits parameters are mutually exclusive. That is to say, you cannot instantiate a bitmath class using more than one of the parameters. Omitting any keyword argument defaults to behaving as if value was provided.

Parameters:
  • value (int) – Default: 0. The value of the instance in prefix units. For example, if we were instantiating a bitmath.KiB object to represent 13.37 KiB, the value parameter would be 13.37. For instance, k = bitmath.KiB(13.37).
  • bytes (int) – The value of the instance as measured in bytes.
  • bits (int) – The value of the instance as measured in bits.
Raises ValueError: if more than one parameter is provided.
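The mutual-exclusivity rule can be sketched with a toy class. Everything here is an assumption made for illustration: the class name KiBSketch, the use of None as a "not provided" sentinel, and the storage of the size as a byte count. It is not the bitmath implementation.

```python
class KiBSketch(object):
    """Toy stand-in for a prefix-unit class; not the bitmath implementation."""
    BYTES_PER_UNIT = 1024

    def __init__(self, value=None, bytes=None, bits=None):
        # Collect the arguments that were actually supplied.
        given = [arg for arg in (value, bytes, bits) if arg is not None]
        if len(given) > 1:
            raise ValueError("value, bytes, and bits are mutually exclusive")
        if bytes is not None:
            self.bytes = float(bytes)
        elif bits is not None:
            self.bytes = bits / 8.0
        else:
            # No keyword given: behave as if value was provided (default 0).
            self.bytes = (value or 0) * float(self.BYTES_PER_UNIT)

    @property
    def value(self):
        """The size expressed in this class's prefix unit."""
        return self.bytes / self.BYTES_PER_UNIT

a = KiBSketch(1)
c = KiBSketch(bytes=1024)
d = KiBSketch(bits=8192)
print(a.value, c.value, d.value)
```

Storing one canonical quantity (bytes here) and deriving the prefix value from it keeps the three construction paths equivalent.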

The following code block demonstrates the 4 acceptable ways to instantiate a bitmath class.

>>> import bitmath

# Omitting all keyword arguments defaults to 'value' behavior.
>>> a = bitmath.KiB(1)

# This is equivalent to the previous statement
>>> b = bitmath.KiB(value=1)

# We can also specify the initial value in bytes.
# Recall, 1KiB = 1024 bytes
>>> c = bitmath.KiB(bytes=1024)

# Finally, we can specify exact number of bits in the
# instance. Recall, 1024B = 8192b
>>> d = bitmath.KiB(bits=8192)

>>> a == b == c == d
True

Available Classes

There are two fundamental classes available, the Bit and the Byte.

There are 24 other classes available, representing all the prefix units from k through E (kilo/kibi through exa/exbi).

Classes with ‘i’ in their names are NIST type classes. They were defined by the National Institute of Standards and Technology (NIST) as the ‘Binary Prefix Units’. They are defined by increasing powers of 2.

Classes without the ‘i’ character are SI type classes. Though not formally defined by any standards organization, they follow the International System of Units (SI) pattern (commonly used to abbreviate base 10 values). You may hear these referred to as the “Decimal” or “SI” prefixes.

Classes ending with lower-case ‘b’ characters are bit based. Classes ending with upper-case ‘B’ characters are byte based. Class inheritance is shown below in parentheses to make this more apparent:

NIST       SI
Eib(Bit)   Eb(Bit)
EiB(Byte)  EB(Byte)
Gib(Bit)   Gb(Bit)
GiB(Byte)  GB(Byte)
Kib(Bit)   kb(Bit)
KiB(Byte)  kB(Byte)
Mib(Bit)   Mb(Bit)
MiB(Byte)  MB(Byte)
Pib(Bit)   Pb(Bit)
PiB(Byte)  PB(Byte)
Tib(Bit)   Tb(Bit)
TiB(Byte)  TB(Byte)

Note

As per SI definition, the kB and kb classes begin with a lower-case k character.

The majority of the functionality of bitmath objects comes from their rich implementation of standard Python operations. You can use bitmath objects in almost all of the places you would normally use an integer or a float. See the Table of Supported Operations and Appendix: Rules for Math for more details.

Class Methods

Class Method: from_other()

bitmath class objects have one public class method, BitMathClass.from_other() which provides an alternative way to initialize a bitmath class.

This method may be called on bitmath class objects directly. That is to say: you do not need to call this method on an instance of a bitmath class, however that is a valid use case.

classmethod BitMathClass.from_other(item)

Instantiate any BitMathClass using another instance as reference for its initial value.

The from_other() class method has one required parameter: an instance of a bitmath class.

Parameters: item (BitMathInstance) – An instance of a bitmath class.
Returns: a bitmath instance of type BitMathClass equivalent in value to item
Return type: BitMathClass
Raises TypeError: if item is not a valid bitmath class

In pure Python, this could also be written as:

In [1]: a_mebibyte = MiB(1)

In [2]: a_mebibyte_sized_kibibyte = KiB(bytes=a_mebibyte.bytes)

In [3]: a_mebibyte == a_mebibyte_sized_kibibyte
Out[3]: True

In [4]: print a_mebibyte, a_mebibyte_sized_kibibyte
1.0 MiB 1024.0 KiB

Instances

Instance Attributes

bitmath objects have several instance attributes:

BitMathInstance.base

The mathematical base of the unit of the instance (this will be 2 or 10)

>>> b = bitmath.Byte(1337)
>>> print b.base
2

BitMathInstance.binary

The Python binary representation of the instance’s value (in bits)

>>> b = bitmath.Byte(1337)
>>> print b.binary
0b10100111001000

BitMathInstance.bin

This is an alias for binary

BitMathInstance.bits

The number of bits in the object

>>> b = bitmath.Byte(1337)
>>> print b.bits
10696.0

BitMathInstance.bytes

The number of bytes in the object

>>> b = bitmath.Byte(1337)
>>> print b.bytes
1337

BitMathInstance.power

The mathematical power the base of the unit of the instance is raised to

>>> b = bitmath.Byte(1337)
>>> print b.power
0

BitMathInstance.system

The system of units used to measure this instance (NIST or SI)

>>> b = bitmath.Byte(1337)
>>> print b.system
NIST

BitMathInstance.value

The value of the instance in prefix units [1]

>>> b = bitmath.Byte(1337)
>>> print b.value
1337.0

BitMathInstance.unit

The string representation of this prefix unit (such as MiB or kb)

>>> b = bitmath.Byte(1337)
>>> print b.unit
Byte

BitMathInstance.unit_plural

The pluralized string representation of this prefix unit.

>>> b = bitmath.Byte(1337)
>>> print b.unit_plural
Bytes

BitMathInstance.unit_singular

The singular string representation of this prefix unit (such as MiB or kb)

>>> b = bitmath.Byte(1337)
>>> print b.unit_singular
Byte

Notes:

  1. Given an instance k, where k = KiB(1.3), then k.value is 1.3

The following is an example of how to access some of these attributes and what you can expect their printed representation to look like:

In [13]: dvd_capacity = GB(4.7)

In [14]: print "Capacity in bits: %s\nbytes: %s\n" % \
             (dvd_capacity.bits, dvd_capacity.bytes)

   Capacity in bits: 37600000000.0
   bytes: 4700000000.0

In [15]: dvd_capacity.value

Out[15]: 4.7

In [17]: dvd_capacity.bin

Out[17]: '0b100011000001001000100111100000000000'

In [18]: dvd_capacity.binary

Out[18]: '0b100011000001001000100111100000000000'

Instance Methods

bitmath objects come with a few basic methods: to_THING(), format(), and best_prefix().

to_THING()

Like the available classes, there are 24 to_THING() methods available. THING is any of the bitmath classes. You can even to_THING() an instance into itself again:

In [1]: from bitmath import *

In [2]: one_mib = MiB(1)

In [3]: one_mib_in_kb = one_mib.to_kb()

In [4]: one_mib == one_mib_in_kb

Out[4]: True

In [5]: another_mib = one_mib.to_MiB()

In [6]: print one_mib, one_mib_in_kb, another_mib

1.0 MiB 8388.608 kb 1.0 MiB

In [7]: six_TB = TB(6)

In [8]: six_TB_in_bits = six_TB.to_Bit()

In [9]: print six_TB, six_TB_in_bits

6.0 TB 4.8e+13 Bit

In [10]: six_TB == six_TB_in_bits

Out[10]: True

best_prefix()

best_prefix([system=None])

Return an equivalent instance which uses the best human-readable prefix-unit to represent it.

Parameters: system (int) – one of bitmath.NIST or bitmath.SI
Returns: An equivalent bitmath instance
Return type: bitmath
Raises ValueError: if an invalid unit system is given for system

The best_prefix() method returns the result of converting a bitmath instance into an equivalent instance using a prefix unit that better represents the original value. Another way to think of this is automatic discovery of the most sane, or human readable, unit to represent a given size. This functionality is especially important in the domain of interactive applications which need to report file sizes or transfer rates to users.

As an analogy, consider you have 923,874,434¢ in your bank account. You probably wouldn’t want to read your bank statement and find your balance in pennies. Most likely, your bank statement would read a balance of $9,238,744.34. In this example, the input prefix is the cent: ¢. The best prefix for this is the dollar: $.
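One simple way to pick such a unit is to keep dividing by the system's step (1024 for NIST, 1000 for SI) until the value drops below one step. The sketch below does this for byte counts. The function pick_prefix and the two unit lists are hypothetical names for this illustration; bitmath's own best_prefix() may handle edge cases (bits, very small or negative values) differently.

```python
NIST_UNITS = ["Byte", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
SI_UNITS = ["Byte", "kB", "MB", "GB", "TB", "PB", "EB"]

def pick_prefix(num_bytes, step=1024, units=NIST_UNITS):
    """Divide by `step` until the value drops below it or units run out."""
    value = float(num_bytes)
    index = 0
    while abs(value) >= step and index < len(units) - 1:
        value /= step
        index += 1
    return value, units[index]

print(pick_prefix(14707))                             # NIST prefixes
print(pick_prefix(34159, step=1000, units=SI_UNITS))  # SI prefixes
print(pick_prefix(12.5))                              # already small enough
```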

Let’s, for example, say we are reporting a transfer rate in an interactive application. It’s important to present this information in an easily consumable format. The library we’re using to calculate the rate of transfer reports the rate in bits per second from a tx_rate() function.

We’ll use this example twice. In the first occurrence, we will print out the raw transfer rate in bits per second. In the second occurrence we’ll take it a step further, and use best_prefix() together with the format method to make the output easier to read.

In [9]: for _rate in tx_rate():
    print "Rate: %s/sec" % Bit(_rate)
    time.sleep(1)

Rate: 100.0 Bit/sec
Rate: 24000.0 Bit/sec
Rate: 1024.0 Bit/sec
Rate: 60151.0 Bit/sec
Rate: 33.0 Bit/sec
Rate: 9999.0 Bit/sec
Rate: 9238742.0 Bit/sec
Rate: 2.09895849555e+13 Bit/sec
Rate: 934098021.0 Bit/sec
Rate: 934894.0 Bit/sec

And now using a custom formatting definition:

In [50]: for _rate in tx_rate():
    print Bit(_rate).best_prefix().format("Rate: {value:.3f} {unit}/sec")
    time.sleep(1)

Rate: 12.500 Byte/sec
Rate: 2.930 KiB/sec
Rate: 128.000 Byte/sec
Rate: 7.343 KiB/sec
Rate: 4.125 Byte/sec
Rate: 1.221 KiB/sec
Rate: 1.101 MiB/sec
Rate: 2.386 TiB/sec
Rate: 111.353 MiB/sec
Rate: 114.123 KiB/sec

format()

BitMathInstance.format(fmt_spec)

Return a custom-formatted string to represent this instance.

Parameters: fmt_spec (str) – A valid formatting mini-language string
Returns: The custom formatted representation
Return type: string

bitmath instances come with a verbose built-in string representation:

In [1]: leet_bits = Bit(1337)

In [2]: print leet_bits
1337.0 Bit

However, for instances which aren’t whole numbers (as in MiB(1/3.0) == 0.333333333333 MiB, etc), their representation can be undesirable.

The format() method gives you complete control over the instance’s representation. All of the instance attributes are available to use when choosing a representation.

The following sections describe some common use cases of the format() method as well as provide a brief tutorial of the greater Python formatting meta-language.

Setting Decimal Precision

By default, bitmath instances will print to a fairly long precision for values which are not whole multiples of their prefix unit. In most use cases, simply printing out the first 2 or 3 digits of precision is acceptable.

The following examples will show us how to print out a bitmath instance in a more human readable way, by limiting the decimal precision to 2 digits.

First, for reference, the default formatting:

In [1]: ugly_number = MB(50).to_MiB() / 8.0
In [2]: print ugly_number
5.96046447754 MiB

Now, let’s use the format() method to limit that to two digits of precision:

In [3]: print ugly_number.format("{value:.2f} {unit}")
5.96 MiB

By changing the 2 character, you increase or decrease the precision. Set it to 0 ({value:.0f}) and you have what effectively looks like an integer.

Format All the Instance Attributes

The following example prints out every instance attribute. Take note of how an attribute may be referenced multiple times.

In [8]: longer_format = """Formatting attributes for %s
   ...: This instances prefix unit is {unit}, which is a {system} type unit
   ...: The unit value is {value}
   ...: This value can be truncated to just 1 digit of precision: {value:.1f}
   ...: In binary this looks like: {binary}
   ...: The prefix unit is derived from a base of {base}
   ...: Which is raised to the power {power}
   ...: There are {bytes} bytes in this instance
   ...: The instance is {bits} bits large
   ...: bytes/bits without trailing decimals: {bytes:.0f}/{bits:.0f}""" % str(ugly_number)

In [9]: print ugly_number.format(longer_format)
Formatting attributes for 5.96046447754 MiB
This instances prefix unit is MiB, which is a NIST type unit
The unit value is 5.96046447754
This value can be truncated to just 1 digit of precision: 6.0
In binary this looks like: 0b10111110101111000010000000
The prefix unit is derived from a base of 2
Which is raised to the power 20
There are 6250000.0 bytes in this instance
The instance is 50000000.0 bits large
bytes/bits without trailing decimals: 6250000/50000000

Note

The format string requests 1 digit of precision with {value:.1f}; in the output we see the value has been rounded to 6.0

Instance Properties

THING Properties

Like the available classes, there are 24 THING properties available. THING is any of the bitmath classes. Under the covers these properties call to_THING.

In [1]: from bitmath import *

In [2]: one_mib = MiB(1)

In [3]: one_mib == one_mib.kb

Out[3]: True

In [4]: print one_mib, one_mib.kb, one_mib.MiB

1.0 MiB 8388.608 kb 1.0 MiB

In [5]: six_TB = TB(6)

In [6]: print six_TB, six_TB.Bit

6.0 TB 4.8e+13 Bit

In [7]: six_TB == six_TB.Bit

Out[7]: True

The Formatting Mini-Language

That is all you need to begin printing numbers with custom precision. If you want to learn a little bit more about using the formatting mini-language, read on.

You may be asking yourself where these {value:.2f} and {unit} strings came from. These are part of the Format Specification Mini-Language which is part of the Python standard library. To be explicitly clear about what’s going on here, let’s break the first specifier ({value:.2f}) down into its component parts:

{value:.2f}
   |  ||||
   |  |||\---- The "f" says to format this as a floating point type
   |  ||\----- The 2 indicates we want 2 digits of precision (default is 6)
   |  |\------ The '.' character must precede the precision specifier for floats
   |  \------- The : separates the attribute name from the formatting specification
   \---------- The name of the attribute to print

The second specifier ({unit}) says to format the unit attribute as a string (string is the default type when no type is given).
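The same specifiers can be tried with plain str.format(), independent of bitmath. In this sketch the keyword arguments value and unit simply stand in for the instance attributes of the same names.

```python
# The same specifiers applied with plain str.format(); the keyword
# arguments stand in for the instance attributes of the same names.
value = 5.96046447754
unit = "MiB"

two_digits = "{value:.2f} {unit}".format(value=value, unit=unit)
no_digits = "{value:.0f} {unit}".format(value=value, unit=unit)
scientific = "{value:e}-{unit}".format(value=1 / 3.0, unit="KiB")

print(two_digits)   # 5.96 MiB
print(no_digits)    # 6 MiB
print(scientific)   # 3.333333e-01-KiB
```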

See also

Python String Format Cookbook
Marcus Kazmierczak’s excellent introduction to string formatting

Getting Started

In this section we will take a high-level look at the basic things you can do with bitmath. We’ll cover the supported operations, basic math, unit conversion, and rich comparison.

Tables of Supported Operations

The following legend describes the two operands used in the tables below.

Operand  Description
bm       A bitmath object is required
num      An integer or decimal value is required

Arithmetic

Math works mostly like you expect it to, except for a few edge-cases:

  • Mixing bitmath types with Number types (the result varies per-operation)
  • Operations where two bitmath types would cancel out (such as dividing two bitmath types)
  • Multiplying two bitmath instances together is supported, but the results may not make much sense.

See also

Appendix: Rules for Math
For a discussion of the behavior of bitmath and number types.

Operation       Parameters  Result Type  Example
Addition        bm1 + bm2   type(bm1)    KiB(1) + MiB(2) = 2049.0KiB
Addition        bm + num    type(num)    KiB(1) + 1 = 2.0
Addition        num + bm    type(num)    1 + KiB(1) = 2.0
Subtraction     bm1 - bm2   type(bm1)    KiB(1) - Byte(2048) = -1.0KiB
Subtraction     bm - num    type(num)    KiB(4) - 1 = 3.0
Subtraction     num - bm    type(num)    10 - KiB(1) = 9.0
Multiplication  bm1 * bm2   type(bm1)    KiB(1) * KiB(2) = 2048.0KiB
Multiplication  bm * num    type(bm)     KiB(2) * 3 = 6.0KiB
Multiplication  num * bm    type(bm)     2 * KiB(3) = 6.0KiB
Division        bm1 / bm2   type(num)    KiB(1) / KiB(2) = 0.5
Division        bm / num    type(bm)     KiB(1) / 3 = 0.3330078125KiB
Division        num / bm    type(num)    3 / KiB(2) = 1.5

Bitwise Operations

See also

Bitwise Calculator
A free online calculator for checking your math

Bitwise operations are also supported. To maintain accuracy, they work directly on the bits attribute of a bitmath instance, not on the number you see in an instance’s printed representation (value).

Operation    Parameters  Result Type  Example [1]
Left Shift   bm << num   type(bm)     MiB(1) << 2 = MiB(4.0)
Right Shift  bm >> num   type(bm)     MiB(1) >> 2 = MiB(0.25)
AND          bm & num    type(bm)     MiB(13.37) & 1337 = MiB(0.000126...)
OR           bm | num    type(bm)     MiB(13.37) | 1337 = MiB(13.3700...)
XOR          bm ^ num    type(bm)     MiB(13.37) ^ 1337 = MiB(13.369...)

[1] Give me a break here, it’s not easy coming up with compelling examples for bitwise operations...
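
To illustrate what operating on the bits rather than the prefix value means, here is a plain-Python sketch of the two shift examples. The helper names (BITS_PER_MIB, mib_to_bits, bits_to_mib) are invented for this illustration and are not part of bitmath.

```python
# Number of bits in one MiB: 2**20 bytes * 8 bits per byte.
BITS_PER_MIB = 8 * 2**20

def mib_to_bits(mib):
    """Convert a MiB prefix value to a whole number of bits."""
    return int(mib * BITS_PER_MIB)

def bits_to_mib(bits):
    """Convert a bit count back to a MiB prefix value."""
    return bits / float(BITS_PER_MIB)

one_mib_bits = mib_to_bits(1)            # 8388608
left = bits_to_mib(one_mib_bits << 2)    # shift the bits, then convert back
right = bits_to_mib(one_mib_bits >> 2)
print(left, right)  # 4.0 0.25
```

Shifting the prefix value itself (1 << 2) would happen to give the same answer here, but a fractional value such as 13.37 cannot be shifted at all, which is one reason to work on the whole-number bit count.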

Basic Math

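bitmath supports all arithmetic operations. The session below continues from the one in the Unit Conversion section, where fourty_two_mib and fourty_two_mib_in_kib are defined: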

In [7]: eighty_four_mib = fourty_two_mib + fourty_two_mib_in_kib

In [8]: eighty_four_mib

Out[8]: MiB(84.0)

In [9]: eighty_four_mib == fourty_two_mib * 2

Out[9]: True

Unit Conversion

In [1]: from bitmath import *

In [2]: fourty_two_mib = MiB(42)

In [3]: fourty_two_mib_in_kib = fourty_two_mib.to_KiB()

In [4]: fourty_two_mib_in_kib

Out[4]: KiB(43008.0)

In [5]: fourty_two_mib

Out[5]: MiB(42.0)

In [6]: fourty_two_mib.KiB

Out[6]: KiB(43008.0)

Rich Comparison

Rich comparison operators (as per the Python Basic Customization magic methods), <, <=, ==, !=, >, and >=, are fully supported:

In [2]: GB(1) < GiB(1)
Out[2]: True

In [3]: GB(1.073741824) == GiB(1)
Out[3]: True

In [4]: GB(1.073741824) <= GiB(1)
Out[4]: True

In [5]: Bit(1) == TiB(bits=1)
Out[5]: True

In [6]: kB(100) > EiB(bytes=1024)
Out[6]: True

In [7]: kB(100) >= EiB.from_other(kB(100))
Out[7]: True

In [8]: kB(100) >= EiB.from_other(kB(99))
Out[8]: True

In [9]: kB(100) >= EiB.from_other(kB(9999))
Out[9]: False

In [10]: KiB(100) != Byte(1)
Out[10]: True

Sorting

bitmath natively supports sorting.

Let’s make a list of the sizes of all the files in the ./tests/ directory (In [4]) and then print them out sorted by increasing magnitude (In [6] and In [8]):

In [1]: from bitmath import *

In [2]: import os

In [3]: sizes = []

In [4]: for f in os.listdir('./tests/'):
            sizes.append(KiB(os.path.getsize('./tests/' + f)))

In [5]: print sizes
[KiB(7337.0), KiB(1441.0), KiB(2126.0), KiB(2178.0), KiB(2326.0), KiB(4003.0), KiB(48.0), KiB(1770.0), KiB(7892.0), KiB(4190.0)]

In [6]: print sorted(sizes)
[KiB(48.0), KiB(1441.0), KiB(1770.0), KiB(2126.0), KiB(2178.0), KiB(2326.0), KiB(4003.0), KiB(4190.0), KiB(7337.0), KiB(7892.0)]

In [7]: human_sizes = [s.best_prefix() for s in sizes]

In [8]: print sorted(human_sizes)
[KiB(48.0), MiB(1.4072265625), MiB(1.728515625), MiB(2.076171875), MiB(2.126953125), MiB(2.271484375), MiB(3.9091796875), MiB(4.091796875), MiB(7.1650390625), MiB(7.70703125)]

Now print them out in descending magnitude:

In [9]: print sorted(sizes, reverse=True)
[KiB(7892.0), KiB(7337.0), KiB(4190.0), KiB(4003.0), KiB(2326.0), KiB(2178.0), KiB(2126.0), KiB(1770.0), KiB(1441.0), KiB(48.0)]
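
Sorting works across units because every size can be reduced to a common quantity before comparing. The following is a rough stdlib-only sketch of that idea, not bitmath's implementation: the (value, unit) tuples, the UNIT_BYTES table, and the to_bytes helper are all invented for illustration.

```python
# Bytes per unit for the few prefixes used in this sketch.
UNIT_BYTES = {"Byte": 1, "kB": 10**3, "KiB": 2**10, "MB": 10**6, "MiB": 2**20}

def to_bytes(size):
    """Normalize a (value, unit) pair to a byte count."""
    value, unit = size
    return value * UNIT_BYTES[unit]

sizes = [(1.5, "MiB"), (48, "KiB"), (2, "MB"), (900, "kB")]

# Compare by byte count, so mixed SI and NIST units order correctly.
ordered = sorted(sizes, key=to_bytes)
print(ordered)
```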

Real Life Examples

Download Speeds

Let’s pretend that your Internet service provider (ISP) advertises your maximum downstream as 50Mbps (50 Megabits per second)[1] and you want to know how fast that is in Megabytes per second. bitmath can do that for you easily. We can calculate it like this:

>>> from bitmath import *

>>> downstream = Mb(50)

>>> print downstream.to_MB()

MB(6.25)

This tells us that if our ISP advertises 50Mbps we can expect to see download rates of over 6MB/sec.

[1] Assuming your ISP follows the common industry practice of using SI (base-10) units to describe file sizes/rates
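
As a cross-check that does not depend on bitmath, the same conversion can be written out by hand. This sketch assumes SI prefixes on both sides (mega = 10**6); the function name is made up for illustration.

```python
BITS_PER_BYTE = 8

def megabits_to_megabytes(megabits):
    """Convert SI megabits to SI megabytes (both prefixes are 10**6)."""
    bits = megabits * 10**6
    num_bytes = bits / float(BITS_PER_BYTE)
    return num_bytes / 10**6

print(megabits_to_megabytes(50))  # 6.25
```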

Calculating how many files fit on a device

In 2001 Apple® announced the iPod™. Their headline statement boasted:

“... iPod stores up to 1,000 CD-quality songs on its super-thin 5 GB hard drive, ...”

OK. That’s pretty simple to work backwards: capacity of disk drive divided by number of songs equals the average size of a song. Which in this case is:

>>> song_size = GB(5) / 1000
>>> print song_size
0.005GB

Or, using best_prefix to generate a more human-readable form:

>>> song_size = GB(5) / 1000
>>> print song_size.best_prefix()
5.0MB

That’s great, if you have normal radio-length songs. But how many of our favorite jam-band’s 15-30+ minute-long songs could we fit on this iPod? Let’s pretend we did the math and the average audio file worked out to be 18.6 MiB (19.5 MB) large.

>>> ipod_capacity = GB(5)
>>> bootleg_size = MB(19.5)
>>> print ipod_capacity / bootleg_size
256.41025641

The result (256.41025641) tells us that we could fit 256 average-quality songs on our iPod.
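
Since a fraction of a song is not much use, the whole-song count can also be taken with integer arithmetic. A small sketch in plain Python, assuming SI units for both sizes (the variable names are illustrative only):

```python
ipod_capacity_bytes = 5 * 10**9        # 5 GB (SI)
bootleg_size_bytes = 195 * 10**5       # 19.5 MB (SI), kept as an integer

# Floor division: a partial song does not count as a stored song.
whole_songs = ipod_capacity_bytes // bootleg_size_bytes
leftover_bytes = ipod_capacity_bytes % bootleg_size_bytes
print(whole_songs, leftover_bytes)  # 256 8000000
```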

Printing Human-Readable File Sizes in Python

In a Python script or interpreter we may wish to print out file sizes in something other than bytes (which is what os.path.getsize returns). We can use bitmath to do that too:

>>> import os

>>> from bitmath import *

>>> these_files = os.listdir('.')

>>> for f in these_files:
...    f_size = Byte(os.path.getsize(f))
...    print "%s - %s" % (f, f_size.to_KiB())

test_basic_math.py - 3.048828125KiB
__init__.py - 0.1181640625KiB
test_representation.py - 0.744140625KiB
test_to_Type_conversion.py - 2.2119140625KiB

See also

Instance Formatting
How to print results in a prettier format
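
For comparison, here is roughly what choosing a readable unit by hand looks like without bitmath. This is a sketch under stated assumptions, not the library's algorithm: the human_size function and the NIST_UNITS list are invented for this example, and only base-2 prefixes are handled.

```python
NIST_UNITS = ["Byte", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]

def human_size(num_bytes):
    """Return (value, unit) for the smallest NIST unit where value < 1024."""
    value = float(num_bytes)
    for unit in NIST_UNITS[:-1]:
        if value < 1024:
            return value, unit
        value /= 1024
    # Anything still >= 1024 PiB is reported in the largest unit.
    return value, NIST_UNITS[-1]

print(human_size(3122))  # (3.048828125, 'KiB')
```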

Calculating Linux BDP and TCP Window Scaling

Say we’re doing some Linux kernel TCP performance tuning. For optimum speeds we need to calculate our BDP, or Bandwidth Delay Product, and use it to set several kernel tuning parameters. The point of this tuning is to send the most data we can during a measured round-trip time without sending more than can be processed. To accomplish this we resize the kernel’s read/write socket buffers.

We will see two ways of doing this. The tedious manual way, and the way with bitmath.

The Hard Way

Core Networking Values

  • net.core.rmem_max - Bytes - Single Value - Maximum receive buffer size
  • net.core.wmem_max - Bytes - Single Value - Maximum write buffer size

System-Wide Memory Limits

  • net.ipv4.tcp_mem - Pages - Three Value Vector - The max field of the parameter is the number of memory pages allowed for queueing by all TCP sockets.

Per-Socket Buffers

Per-socket buffer sizes must not exceed the core networking buffer sizes.

  • net.ipv4.tcp_rmem - Bytes - Three Field Vector - The max field sets the size of the TCP receive buffer
  • net.ipv4.tcp_wmem - Bytes - Three Field Vector - As above, but for the write buffer

We would normally calculate the optimal BDP and related values following this approach:

  1. Measure the latency, or round trip time (RTT, measured in milliseconds), between the host we’re tuning and our target remote host
  2. Measure/identify our network transfer rate
  3. Calculate the BDP (multiply transfer rate by rtt)
  4. Obtain our current kernel settings
  5. Adjust settings as necessary

But for the sake of brevity we’ll be working from an example scenario with a pre-defined RTT and transfer rate.

Scenario

  • We have an average network transfer rate of 1Gb/sec (where Gb is the SI unit for Gigabits, not Gibibytes: GiB)
  • Our latency (RTT) is 0.199ms (milliseconds)

Calculate Manually

Let’s calculate the BDP now. Because the kernel parameters expect values in units of bytes and pages, we’ll have to convert our transfer rate of 1Gb/sec into B/s (Gigabits/second to Bytes/second):

  • Convert 1Gb into an equivalent byte based unit

Remember 1 Byte = 8 Bits:

tx_rate_GB = 1/8 = 0.125

Our equivalent transfer rate is 0.125GB/sec.

  • Convert our RTT from milliseconds into seconds

Remember 1ms = 10^-3 s:

window_seconds = 0.199 * 10^-3 = 0.000199

Our equivalent RTT window is 0.000199s

  • Next we multiply the transfer rate by the length of our RTT window (in seconds)

(The unit analysis for this is GB/s * s leaving us with GB)

BDP = tx_rate_GB * window_seconds = 0.125 * 0.000199 = 0.000024875

Our BDP is 0.000024875GB.

  • Convert 0.000024875GB to bytes:

Remember 1GB = 10^9 B:

BDP_bytes = 0.000024875 * 10^9 = 24875.0

Our BDP is 24875 bytes (or about 24.3KiB)
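
The manual steps above can be collapsed into a single helper. The sketch below uses integer arithmetic to sidestep floating point rounding; the function name and the choice of microseconds for the RTT argument are assumptions made for this example.

```python
def bdp_bytes(rate_bits_per_sec, rtt_microseconds):
    """Bandwidth Delay Product in bytes, using integer arithmetic.

    bits in flight = rate (bits/s) * RTT (s); dividing by 8 gives bytes.
    The RTT is taken in microseconds so that 0.199 ms can be written as 199.
    """
    bits_in_flight = rate_bits_per_sec * rtt_microseconds // 10**6
    return bits_in_flight // 8

bdp = bdp_bytes(10**9, 199)   # 1 Gb/s link, 0.199 ms RTT
print(bdp, bdp / 1024.0)      # 24875 24.2919921875
```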

The bitmath way

All of this math can be done much quicker (and with greater accuracy) using the bitmath library. Let’s see how:

>>> from bitmath import GB, Byte, KiB

>>> tx = 1/8.0

>>> rtt = 0.199 * 10**-3

>>> bdp = (GB(tx * rtt)).to_Byte()

>>> print bdp.to_KiB()

KiB(24.2919921875)

Note

To avoid integer rounding during division, don’t forget to divide by 8.0 rather than 8

We could shorten that even further:

>>> print (GB((1/8.0) * (0.199 * 10**-3))).to_Byte()
24875.0Byte

Get the current kernel parameters

It is important to note that the per-socket buffer sizes must not exceed the core network buffer sizes. Let’s fetch our current core buffer sizes:

$ sysctl net.core.rmem_max net.core.wmem_max
net.core.rmem_max = 212992
net.core.wmem_max = 212992

Recall, these values are in bytes. What are they in KiB?

>>> print Byte(212992).to_KiB()
KiB(208.0)

This means our core networking buffer sizes are set to 208KiB each. Now let’s check our current per-socket buffer sizes:

$ sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
net.ipv4.tcp_rmem = 4096        87380   6291456
net.ipv4.tcp_wmem = 4096        16384   4194304

Let’s double-check that our buffer sizes aren’t already out of whack (per-socket should be <= networking core):

>>> net_core_max = KiB(bytes=212992)

>>> ipv4_tcp_rmem_max = KiB(bytes=6291456)

>>> ipv4_tcp_rmem_max > net_core_max

True

It appears that my buffers aren’t sized appropriately. We’ll fix that when we set the tunable parameters.

Finally, how large is the entire system TCP buffer?

$ sysctl net.ipv4.tcp_mem
net.ipv4.tcp_mem = 280632       374176  561264

Our max system TCP buffer size is set to 561264. Recall that this parameter is measured in memory pages. Most of the time your page size is 4096 bytes, but you can check by running the command: getconf PAGESIZE. To convert the system TCP buffer size (561264) into a byte-based unit, we’ll multiply it by our pagesize (4096):

>>> sys_pages = 561264

>>> page_size = 4096

>>> sys_buffer = Byte(sys_pages * page_size)

>>> print sys_buffer.to_MiB()

2192.4375MiB

>>> print sys_buffer.to_GiB()

2.14105224609GiB

The system max TCP buffer size is about 2.14GiB.
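
The page arithmetic can be double-checked without bitmath. In this sketch the 4096-byte page size is the same assumption made above, and pages_to_bytes is a name invented for the example.

```python
def pages_to_bytes(pages, page_size=4096):
    """Convert a count of memory pages to bytes (4096 is a common page size)."""
    return pages * page_size

tcp_mem_max_pages = 561264  # max field of net.ipv4.tcp_mem shown above
total_bytes = pages_to_bytes(tcp_mem_max_pages)
print(total_bytes)                 # 2298937344
print(total_bytes / float(2**20))  # 2192.4375 (MiB)
print(total_bytes / float(2**30))  # the same size in GiB
```

On Unix-like systems the real page size can usually be read from Python with os.sysconf('SC_PAGE_SIZE') and passed in as page_size instead of relying on the default.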

In review, we discovered the following:

  • Our core network buffer size (212992 bytes) is smaller than our per-socket maximums, so the settings are inconsistent; we’ll resize it
  • Our current per-socket buffer sizes are 6291456 and 4194304

And we calculated the following:

  • Our ideal max per-socket buffer size is 24875 bytes
  • Our ideal default per-socket buffer size (half the max): 12437

Finally: Set the new kernel parameters

Set the core-network buffer sizes:

$ sudo sysctl net.core.rmem_max=24875 net.core.wmem_max=24875
net.core.rmem_max = 24875
net.core.wmem_max = 24875

Set the per-socket buffer sizes:

$ sudo sysctl net.ipv4.tcp_rmem="4096 12437 24875" net.ipv4.tcp_wmem="4096 12437 24875"
net.ipv4.tcp_rmem = 4096 12437 24875
net.ipv4.tcp_wmem = 4096 12437 24875

And it’s done! Testing this is left as an exercise for the reader. Note that in my experience this is less useful on wireless connections.
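
If this tuning has to be repeated for different links, the sysctl arguments can be generated from the BDP. The following is a sketch only: socket_buffer_settings is a hypothetical helper, the 4096-byte minimum is copied from the values shown above, and the code prints the settings rather than applying them.

```python
def socket_buffer_settings(bdp_bytes, minimum=4096):
    """Return sysctl key=value strings sized from a BDP given in bytes.

    The default field is half the max (rounded down), as in the review above.
    """
    default = bdp_bytes // 2
    vector = "%d %d %d" % (minimum, default, bdp_bytes)
    return [
        "net.core.rmem_max=%d" % bdp_bytes,
        "net.core.wmem_max=%d" % bdp_bytes,
        'net.ipv4.tcp_rmem="%s"' % vector,
        'net.ipv4.tcp_wmem="%s"' % vector,
    ]

for setting in socket_buffer_settings(24875):
    print(setting)
```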

Contributing to bitmath

I should fill this in some time.

Appendices

Rules for Math

This section describes what we need to know to effectively use bitmath for arithmetic. Because bitmath allows the use of instances as operands on either side of the operator it is especially important to understand their behavior. Just as in normal every-day math, not all operations yield the same result if the operands are switched. E.g., 1 - 2 = -1 whereas 2 - 1 = 1.

This section includes discussions of the results for each supported mixed math operation. For mixed math operations (i.e., an operation with a bitmath instance and a number type), implicit coercion may happen. That is to say, a bitmath instance will be converted to a number type.

When coercion happens is determined by the following conditions and rules:

  1. Precedence and Associativity of Operators in Python[1]
  2. Situational semantics – some operations, though mathematically valid, do not make logical sense when applied to context.

Terminology

The following definitions describe some of the terminology used throughout this section.

Coercion

The act of converting operands into a common type to support arithmetic operations. Somewhat similar to how adding two fractions requires coercing each operand into having a common denominator.

Specific to the bitmath domain, this concept means using an instance’s prefix value for mixed-math.

Operand
The object(s) of a mathematical operation. That is to say, given 1 + 2, the operands would be 1 and 2.
Operator
The mathematical operation to evaluate. Given 1 + 2, the operation would be addition, +.
LHS
Left-hand side. In discussion this specifically refers to the operand on the left-hand side of the operator.
RHS
Right-hand side. In discussion this specifically refers to the operand on the right-hand side of the operator.

Two bitmath operands

This section describes what happens when two bitmath instances are used as operands. There are three possible results from this type of operation.

Addition and subtraction
The result will be of the type of the LHS.
Multiplication
Supported, but yields strange results.
In [10]: first = MiB(5)

In [11]: second = kB(2)

In [12]: first * second
Out[12]: MiB(10000.0)

In [13]: (first * second).best_prefix()
Out[13]: GiB(9.765625)

As we can see in Out[12] and Out[13], multiplying even two relatively small quantities together (MiB(5) and kB(2)) yields quite large results.

Internally, this is implemented as:

\[(5 \cdot 2^{20}) \cdot (2 \cdot 10^{3}) = 10,485,760,000 B\]
\[10,485,760,000 B \cdot \dfrac{1 MiB}{1,048,576 B} = 10,000 MiB\]
Division
The result will be a number type due to unit cancellation.
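
The multiplication arithmetic shown above can be reproduced with plain integers, which makes it easier to see where the large result comes from. This sketch mirrors the formula only; it does not use or describe bitmath internals beyond what the formula states.

```python
first_bytes = 5 * 2**20      # MiB(5) expressed in bytes
second_bytes = 2 * 10**3     # kB(2) expressed in bytes

# Both operands are expanded to bytes before multiplying.
product_bytes = first_bytes * second_bytes
print(product_bytes)                 # 10485760000
print(product_bytes / float(2**20))  # 10000.0   (as MiB)
print(product_bytes / float(2**30))  # 9.765625  (as GiB)
```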

Mixed Types: Addition and Subtraction

This describes the behavior of addition and subtraction operations where one operand is a bitmath type and the other is a number type.

Mixed-math addition and subtraction always return a type from the numbers family (integer, float, long, etc...). This rule is true regardless of the placement of the operands, with respect to the operator.

Discussion: Why do 100 - KiB(90) and KiB(100) - 90 both yield a result of 10.0 and not another bitmath instance, such as KiB(10.0)?

When implementing the math part of the object datamodel customizations[2] there were two choices available:

  1. Offer no support at all. Instead raise a NotImplemented exception.
  2. Consistently apply coercion to the bitmath operand to produce a useful result (useful if you know the rules of the library).

In the end it became a philosophical decision guided by scientific precedent.

Put simply, bitmath uses the significance of the least significant operand, specifically the number-type operand because it lacks semantic significance. In application this means that we drop the semantic significance of the bitmath operand. That is to say, given an input like GiB(13.37) (equivalent to 13.37 * 2^30 bytes), the only property used in calculations is the prefix value, 13.37.

Numbers carry mathematical significance, in the form of precision, but what they lack is semantic (contextual) significance. A number by itself is just a measurement of an arbitrary quantity of stuff. In mixed-type math, bitmath effectively treats numbers as mathematical constants.

A bitmath instance also has mathematical significance in that an instance is a measurement of a quantity (bits in this case) and that quantity has a measurable precision. But a bitmath instance is more than just a measurement, it is a specialized representation of a count of bits. This gives bitmath instances semantic significance as well.

And so, in deciding how to handle mixed-type (really what we should say is mixed-significance) operations, we chose to model the behavior off of an already established set of rules. Those rules are the Rules of Significance Arithmetic[3].

Let’s look at an example of this in action:

In [8]: num = 42

In [9]: bm = PiB(24)

In [10]: print num + bm
66.0

Equivalently, divorcing the bitmath instance from its value (this is coercion):

In [12]: bm_value = bm.value

In [13]: print num + bm_value
66.0

What it all boils down to is this: if we don’t provide a unit then bitmath won’t give us one back. There is no way for bitmath to guess what unit the operand was intended to carry. Therefore, the behavior of bitmath is conservative. It will meet us half way and do the math, but it will not return a unit in the result.
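
To make the coercion rule concrete, here is a toy class that behaves the same way for addition. It is a hypothetical stand-in written for this discussion, not bitmath's source: the Quantity name, its attributes, and the same-unit restriction are all assumptions.

```python
import numbers

class Quantity(object):
    """Hypothetical stand-in for a bitmath instance; not the library's code."""

    def __init__(self, value, unit="PiB"):
        self.value = float(value)
        self.unit = unit

    def __add__(self, other):
        if isinstance(other, numbers.Number):
            # Mixed math: drop the unit and return a plain number.
            return self.value + other
        if isinstance(other, Quantity) and other.unit == self.unit:
            # Same-unit addition keeps the unit of the left-hand side.
            return Quantity(self.value + other.value, self.unit)
        return NotImplemented

    # Addition is commutative, so num + Quantity reuses the same logic.
    __radd__ = __add__

print(42 + Quantity(24))   # 66.0
print(Quantity(24) + 42)   # 66.0
```

Returning a bare float from the mixed case is what makes the result unit-less: the caller never supplied a unit for 42, so none is invented.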

Mixed Types: Multiplication and Division

Multiplication has commutative properties. This means that the ordering of the operands is not significant. Because of this fact bitmath allows arbitrary placement of the operands, treating the numeric operand as a constant. Here’s an example demonstrating this.

In [2]: 10 * KiB(43)
Out[2]: KiB(430.0)

In [3]: KiB(43) * 10
Out[3]: KiB(430.0)

Division, however, does not have this commutative property. I.e., the placement of the operands is significant. Additionally, there is a semantic difference in division. Dividing a quantity (e.g. MiB(100)) by a constant (10) makes complete sense. Conceptually (in the domain of bitmath), the intention of MiB(100) / 10 is to separate MiB(100) into 10 equal-sized parts.

In [4]: KiB(43) / 10
Out[4]: KiB(4.2998046875)

The reverse operation does not maintain semantic validity. Stated differently, it does not make logical sense to divide a constant by a measured quantity of stuff. If you’re still not clear on this, ask yourself what you would expect to get if you did this:

\[\dfrac{100}{kB(33)} = x\]

On Units

As previously stated, in this module you will find two very similar sets of classes available. These are the NIST and SI prefixes. The NIST prefixes are all base 2 and have an ‘i’ character in the middle. The SI prefixes are base 10 and have no ‘i’ character.

For smaller values, these two systems of unit prefixes are roughly equivalent. The round() operations below demonstrate how close in a percent one “unit” of SI is to one “unit” of NIST.

In [15]: one_kilo = 1 * 10**3

In [16]: one_kibi = 1 * 2**10

In [17]: round(one_kilo / float(one_kibi), 2)

Out[17]: 0.98

In [18]: one_tera = 1 * 10**12

In [19]: one_tebi = 1 * 2**40

In [20]: round(one_tera / float(one_tebi), 2)

Out[20]: 0.91

In [21]: one_exa = 1 * 10**18

In [22]: one_exbi = 1 * 2**60

In [23]: round(one_exa / float(one_exbi), 2)

Out[23]: 0.87

They begin as roughly equivalent; however, as you can see (Out[17], Out[20], and Out[23]), they diverge significantly for higher values.

Why two unit systems? Why take the time to point this difference out? Why should you care? The Linux Documentation Project comments on that:

Before these binary prefixes were introduced, it was fairly common to use k=1000 and K=1024, just like b=bit, B=byte. Unfortunately, the M is capital already, and cannot be capitalized to indicate binary-ness.

At first that didn’t matter too much, since memory modules and disks came in sizes that were powers of two, so everyone knew that in such contexts “kilobyte” and “megabyte” meant 1024 and 1048576 bytes, respectively. What originally was a sloppy use of the prefixes “kilo” and “mega” started to become regarded as the “real true meaning” when computers were involved. But then disk technology changed, and disk sizes became arbitrary numbers. After a period of uncertainty all disk manufacturers settled on the standard, namely k=1000, M=1000k, G=1000M.

The situation was messy: in the 14k4 modems, k=1000; in the 1.44MB diskettes, M=1024000; etc. In 1998 the IEC approved the standard that defines the binary prefixes given above, enabling people to be precise and unambiguous.

Thus, today, MB = 1000000B and MiB = 1048576B.

In the free software world programs are slowly being changed to conform. When the Linux kernel boots and says:

hda: 120064896 sectors (61473 MB) w/2048KiB Cache

the MB are megabytes and the KiB are kibibytes.

Furthermore, to quote the National Institute of Standards and Technology (NIST):

“Once upon a time, computer professionals noticed that 2^10 was very nearly equal to 1000 and started using the SI prefix “kilo” to mean 1024. That worked well enough for a decade or two because everybody who talked kilobytes knew that the term implied 1024 bytes. But, almost overnight a much more numerous “everybody” bought computers, and the trade computer professionals needed to talk to physicists and engineers and even to ordinary people, most of whom know that a kilometer is 1000 meters and a kilogram is 1000 grams.

“Then data storage for gigabytes, and even terabytes, became practical, and the storage devices were not constructed on binary trees, which meant that, for many practical purposes, binary arithmetic was less convenient than decimal arithmetic. The result is that today “everybody” does not “know” what a megabyte is. When discussing computer memory, most manufacturers use megabyte to mean 2^20 = 1 048 576 bytes, but the manufacturers of computer storage devices usually use the term to mean 1 000 000 bytes. Some designers of local area networks have used megabit per second to mean 1 048 576 bit/s, but all telecommunications engineers use it to mean 10^6 bit/s. And if two definitions of the megabyte are not enough, a third megabyte of 1 024 000 bytes is the megabyte used to format the familiar 90 mm (3 1/2 inch), “1.44 MB” diskette. The confusion is real, as is the potential for incompatibility in standards and in implemented systems.

“Faced with this reality, the IEEE Standards Board decided that IEEE standards will use the conventional, internationally adopted, definitions of the SI prefixes. Mega will mean 1 000 000, except that the base-two definition may be used (if such usage is explicitly pointed out on a case-by-case basis) until such time that prefixes for binary multiples are adopted by an appropriate standards body.”

NEWS

bitmath-1.1.0-1

bitmath-1.1.0-1 was published on 2014-12-20.

Changes

Added Functionality

bitmath-1.0.5-1 through 1.0.8-1

bitmath-1.0.8-1 was published on 2014-08-14.

Major Updates

Bug Fixes

Changes

Added Functionality

Project

Tests

  • Test suite is now implemented using Python virtualenvs for consistency across platforms
  • Test suite now contains 150 unit tests. This is 110 more tests than the previous major release (1.0.4-1)
  • Test suite now runs on EPEL6 and EPEL7
  • Code coverage is stable around 95-100%

bitmath-1.0.4-1

This is the first release of bitmath. bitmath-1.0.4-1 was published on 2014-03-20.

Project

Available via:

bitmath had been under development for 12 days when the 1.0.4-1 release was made available.

Debut Functionality

  • Converting between SI and NIST prefix units (GiB to kB)
  • Converting between units of the same type (SI to SI, or NIST to NIST)
  • Basic arithmetic operations (subtracting 42KiB from 50GiB)
  • Rich comparison operations (1024 Bytes == 1KiB)
  • Sorting
  • Useful console and print representations

Examples

Arithmetic

In [1]: from bitmath import *

In [2]: log_size = kB(137.4)

In [3]: log_zipped_size = Byte(987)

In [4]: print "Compression saved %s space" % (log_size - log_zipped_size)
Compression saved 136.413kB space

In [5]: thumb_drive = GiB(12)

In [6]: song_size = MiB(5)

In [7]: songs_per_drive = thumb_drive / song_size

In [8]: print songs_per_drive
2457.6

Convert Units

With to_ method

In [1]: from bitmath import *

In [2]: dvd_size = GiB(4.7)

In [3]: print "DVD Size in MiB: %s" % dvd_size.to_MiB()
DVD Size in MiB: 4812.8MiB

With Properties

In [1]: from bitmath import *

In [2]: dvd_size = GiB(4.7)

In [3]: print "DVD Size in MiB: %s" % dvd_size.MiB
DVD Size in MiB: 4812.8MiB

Select a human-readable unit

In [3]: import bitmath

In [4]: small_number = bitmath.kB(100)

In [5]: ugly_number = small_number.TiB

In [6]: print ugly_number
9.09494701773e-08TiB

In [7]: print ugly_number.best_prefix()
97.65625KiB

In [8]: print ugly_number.best_prefix(system=bitmath.SI)
kB(100.0)

Rich Comparison

In [8]: cd_size = MiB(700)

In [9]: cd_size > dvd_size
Out[9]: False

In [10]: cd_size < dvd_size
Out[10]: True

In [11]: MiB(1) == KiB(1024)
Out[11]: True

In [12]: MiB(1) <= KiB(1024)
Out[12]: True

Sorting

In [13]: sizes = [KiB(7337.0), KiB(1441.0), KiB(2126.0), KiB(2178.0),
                  KiB(2326.0), KiB(4003.0), KiB(48.0), KiB(1770.0),
                  KiB(7892.0), KiB(4190.0)]

In [14]: print sorted(sizes)
[KiB(48.0), KiB(1441.0), KiB(1770.0), KiB(2126.0), KiB(2178.0),
KiB(2326.0), KiB(4003.0), KiB(4190.0), KiB(7337.0), KiB(7892.0)]

Custom Formatting

  • Use of the custom formatting system
  • All of the available instance properties

Example:

In [8]: longer_format = """Formatting attributes for %s
   ...: This instances prefix unit is {unit}, which is a {system} type unit
   ...: The unit value is {value}
   ...: This value can be truncated to just 1 digit of precision: {value:.1f}
   ...: This value can be truncated to just 2 significant digits: {value:.2g}
   ...: In binary this looks like: {binary}
   ...: The prefix unit is derived from a base of {base}
   ...: Which is raised to the power {power}
   ...: There are {bytes} bytes in this instance
   ...: The instance is {bits} bits large
   ...: bytes/bits without trailing decimals: {bytes:.0f}/{bits:.0f}""" % str(ugly_number)

In [9]: print ugly_number.format(longer_format)
Formatting attributes for 5.96046447754MiB
This instances prefix unit is MiB, which is a NIST type unit
The unit value is 5.96046447754
This value can be truncated to just 1 digit of precision: 6.0
In binary this looks like: 0b10111110101111000010000000
The prefix unit is derived from a base of 2
Which is raised to the power 20
There are 6250000.0 bytes in this instance
The instance is 50000000.0 bits large
bytes/bits without trailing decimals: 6250000/50000000

Utility Functions

bitmath.getsize()

>>> print bitmath.getsize('python-bitmath.spec')
3.7060546875 KiB

bitmath.parse_string()

>>> import bitmath
>>> a_dvd = bitmath.parse_string("4.7 GiB")
>>> print type(a_dvd)
<class 'bitmath.GiB'>
>>> print a_dvd
4.7 GiB

bitmath.listdir()

Simple listdir example

>>> for i in bitmath.listdir('./tests/', followlinks=True, relpath=True, bestprefix=True):
...     print i
...
('tests/test_file_size.py', KiB(9.2900390625))
('tests/test_basic_math.py', KiB(7.1767578125))
('tests/__init__.py', KiB(1.974609375))
('tests/test_bitwise_operations.py', KiB(2.6376953125))
('tests/test_context_manager.py', KiB(3.7744140625))
('tests/test_representation.py', KiB(5.2568359375))
('tests/test_properties.py', KiB(2.03125))
('tests/test_instantiating.py', KiB(3.4580078125))
('tests/test_future_math.py', KiB(2.2001953125))
('tests/test_best_prefix_BASE.py', KiB(2.1044921875))
('tests/test_rich_comparison.py', KiB(3.9423828125))
('tests/test_best_prefix_NIST.py', KiB(5.431640625))
('tests/test_unique_testcase_names.sh', Byte(311.0))
('tests/.coverage', KiB(3.1708984375))
('tests/test_best_prefix_SI.py', KiB(5.34375))
('tests/test_to_built_in_conversion.py', KiB(1.798828125))
('tests/test_to_Type_conversion.py', KiB(8.0185546875))
('tests/test_sorting.py', KiB(4.2197265625))
('tests/listdir_symlinks/10_byte_file_link', Byte(10.0))
('tests/listdir_symlinks/depth1/depth2/10_byte_file', Byte(10.0))
('tests/listdir_nosymlinks/depth1/depth2/10_byte_file', Byte(10.0))
('tests/listdir_nosymlinks/depth1/depth2/1024_byte_file', KiB(1.0))
('tests/file_sizes/kbytes.test', KiB(1.0))
('tests/file_sizes/bytes.test', Byte(38.0))
('tests/listdir/10_byte_file', Byte(10.0))

listdir example with formatting

>>> with bitmath.format(fmt_str="[{value:.3f}@{unit}]"):
...     for i in bitmath.listdir('./tests/', followlinks=True, relpath=True, bestprefix=True):
...         print i[1]
...
[9.290@KiB]
[7.177@KiB]
[1.975@KiB]
[2.638@KiB]
[3.774@KiB]
[5.257@KiB]
[2.031@KiB]
[3.458@KiB]
[2.200@KiB]
[2.104@KiB]
[3.942@KiB]
[5.432@KiB]
[311.000@Byte]
[3.171@KiB]
[5.344@KiB]
[1.799@KiB]
[8.019@KiB]
[4.220@KiB]
[10.000@Byte]
[10.000@Byte]
[10.000@Byte]
[1.000@KiB]
[1.000@KiB]
[38.000@Byte]
[10.000@Byte]