1. The bitmath Module

1.1. Functions

This section describes utility functions included in the bitmath module.

1.1.1. bitmath.getsize()

bitmath.getsize(path[, bestprefix=True[, system=NIST]])

Return a bitmath instance representing the size of a file at any given path.

Parameters:
  • path (string) – The path of a file to read the size of
  • bestprefix (bool) – Default: True, the returned instance will be in the best human-readable prefix unit. If set to False the result is a bitmath.Byte instance.
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. The preferred system of units for the returned instance.

Internally bitmath.getsize() calls os.path.realpath() before calling os.path.getsize() on any paths.

Here’s an example of where we’ll run bitmath.getsize() on the bitmath source code using the defaults for the bestprefix and system parameters:

>>> import bitmath
>>> print bitmath.getsize('./bitmath/__init__.py')
33.3583984375 KiB

Let’s say we want to see the results in bytes. We can do this by setting bestprefix to False:

>>> import bitmath
>>> print bitmath.getsize('./bitmath/__init__.py', bestprefix=False)
34159.0 Byte

Recall, the default for representation is with the best human-readable prefix. We can control the prefix system used by setting system to either bitmath.NIST (the default) or bitmath.SI:

1
2
3
4
5
6
>>> print bitmath.getsize('./bitmath/__init__.py')
33.3583984375 KiB
>>> print bitmath.getsize('./bitmath/__init__.py', system=bitmath.NIST)
33.3583984375 KiB
>>> print bitmath.getsize('./bitmath/__init__.py', system=bitmath.SI)
34.159 kB

We can see in lines 14 that the same result is returned when system is not set and when system is set to bitmath.NIST (the default).

New in version 1.0.7.

1.1.2. bitmath.listdir()

bitmath.listdir(search_base[, followlinks=False[, filter='*'[, relpath=False[, bestprefix=False[, system=NIST]]]]])

This is a generator which recurses a directory tree yielding 2-tuples of:

  • The absolute/relative path to a discovered file
  • A bitmath instance representing the apparent size of the file
Parameters:
  • search_base (string) – The directory to begin walking down
  • followlinks (bool) – Default: False, do not follow links. Whether or not to follow symbolic links to directories. Setting to True enables directory link following
  • filter (string) – Default: * (everything). A glob to filter results with. See fnmatch for more details about globs
  • relpath (bool) – Default: False, returns the fully qualified to each discovered file. True to return the relative path from the present working directory to the discovered file. If relpath is False, then bitmath.listdir() internally calls os.path.realpath() to normalize path references
  • bestprefix (bool) – Default: False, returns bitmath.Byte instances. Set to True to return the best human-readable prefix unit for representation
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. Set a prefix preferred unit system. Requires bestprefix is True

Note

  • This function does not return tuples for directory entities. Including directories in results is scheduled for introduction in the upcoming 1.1.0 release.
  • Symlinks to files are followed automatically

When interpreting the results from this function it is crucial to understand exactly which items are being taken into account, what decisions were made to select those items, and how their sizes are measured.

Results from this function may seem invalid when directly compared to the results from common command line utilities, such as du, or tree.

Let’s pretend we have a directory structure like the following:

some_files/
├── deeper_files/
│   └── second_file
└── first_file

Where some_files/ is a directory, and so is some_files/deeper_files/. There are two regular files in this tree:

  • somefiles/first_file - 1337 Bytes
  • some_files/deeper_files/second_file - 13370 Bytes

The total size of the files in this tree is 1337 + 13370 = 14707 bytes.

Let’s call bitmath.listdir() on the some_files/ directory and see what the results look like. First we’ll use all the default parameters, then we’ll set relpath to True:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
>>> import bitmath
>>> for f in bitmath.listdir('./some_files'):
...     print f
...
('/tmp/tmp.P5lqtyqwPh/some_files/first_file', Byte(1337.0))
('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))
>>> for f in bitmath.listdir('./some_files', relpath=True):
...     print f
...
('some_files/first_file', Byte(1337.0))
('some_files/deeper_files/second_file', Byte(13370.0))

On lines 5 and 6 the results print the full path, whereas on lines 10 and 11 the path is relative to the present working directory.

Let’s play with the filter parameter now. Let’s say we only want to include results for files whose name begins with “second”:

>>> for f in bitmath.listdir('./some_files', filter='second*'):
...     print f
...
('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))

If we wish to avoid having to write for-loops, we can collect the results into a list rather simply:

>>> files = list(bitmath.listdir('./some_files'))
>>> print files
[('/tmp/tmp.P5lqtyqwPh/some_files/first_file', Byte(1337.0)), ('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))]

Here’s a more advanced example where we will sum the size of all the returned results and then play around with the possible formatting. Recall that a bitmath instance representing the size of the discovered file is the second item in each returned tuple.

>>> discovered_files = [f[1] for f in bitmath.listdir('./some_files')]
>>> print discovered_files
[Byte(1337.0), Byte(13370.0)]
>>> print reduce(lambda x,y: x+y, discovered_files)
14707.0 Byte
>>> print reduce(lambda x,y: x+y, discovered_files).best_prefix()
14.3623046875 KiB
>>> print reduce(lambda x,y: x+y, discovered_files).best_prefix().format("{value:.3f} {unit}")
14.362 KiB

New in version 1.0.7.

1.1.3. bitmath.parse_string()

bitmath.parse_string(str_repr)

New in version 1.1.0.

Parse a string representing a unit into a proper bitmath object. All non-string inputs are rejected and will raise a ValueError. Strings without units are also rejected. See the examples below for additional clarity.

Parameters:str_repr (string) – The string to parse. May contain whitespace between the value and the unit.
Returns:A bitmath object representing str_repr
Raises ValueError:
 if str_repr can not be parsed

A simple usage example:

>>> import bitmath
>>> a_dvd = bitmath.parse_string("4.7 GiB")
>>> print type(a_dvd)
<class 'bitmath.GiB'>
>>> print a_dvd
4.7 GiB

Caution

Caution is advised if you are reading values from an unverified external source, such as output from a shell command or a generated file. Many applications (even /usr/bin/ls) still do not produce file size strings with valid (or even correct) prefix units unless specially configured to do so.

To protect your application from unexpected runtime errors it is recommended that calls to bitmath.parse_string() are wrapped in a try statement:

>>> import bitmath
>>> try:
...     a_dvd = bitmath.parse_string("4.7 G")
... except ValueError:
...    print "Error while parsing string into bitmath object"
...
Error while parsing string into bitmath object

Here we can see some more examples of invalid input, as well as two acceptable inputs:

>>> import bitmath
>>> sizes = [ 1337, 1337.7, "1337", "1337.7", "1337 B", "1337B" ]
>>> for size in sizes:
...     try:
...         print "Parsed size into %s" % bitmath.parse_string(size).best_prefix()
...     except ValueError:
...         print "Could not parse input: %s" % size
...
Could not parse input: 1337
Could not parse input: 1337.7
Could not parse input: 1337
Could not parse input: 1337.7
Parsed size into 1.3056640625 KiB
Parsed size into 1.3056640625 KiB

Changed in version 1.2.4: Added support for parsing octet units via issue #53 - parse french units. The usage of “octet” is still common in some RFCs, as well as France, French Canada and Romania. See also, a table of the octet units and their values on Wikipedia.

Here are some simple examples of parsing octet based units:

1
2
3
4
5
6
7
8
9
>>> import bitmath
>>> a_mebibyte = bitmath.parse_string("1 MiB")
>>> a_mebioctet = bitmath.parse_string("1 Mio")
>>> print a_mebibyte, a_mebioctet
1.0 MiB 1.0 MiB
>>> print bitmath.parse_string("1Po")
1.0 PB
>>> print bitmath.parse_string("1337 Eio")
1337.0 EiB

Notice how on lines 4 and 5 that the variable a_mebibyte from the input 1 MiB is exactly equivalent to a_mebioctet from the different input 1 Mio. This is because after bitmath parses the octet units the results are normalized into their standard NIST/SI equivalents automatically.

1.1.4. bitmath.query_device_capacity()

bitmath.query_device_capacity(device_fd)

Create bitmath.Byte instances representing the capacity of a block device.

Parameters:

device_fd (file) – An open file handle (handle = open('/dev/sda')) of the target device.

Returns:

A bitmath.Byte equal to the size of device_fd

Raises:
  • ValueError – if file descriptor device_fd is not of a device type
  • IOError
    • IOError[13] - If the effective uid of this process does not have access to issue raw commands to block devices. I.e., this process does not have super-user rights
    • IOError[2] - If the device device_fd points to does not exist

Important

Superuser (root/admin) privileges are required to allow bitmath.query_device_capacity() to make the low-level system calls to read a devices capacity. Use of this function on a device the user does not have access to will result in run-time errors.

Examples of supported devices include:

  • Standard Hard Drives/External Drives
  • Filesystem Partitions
  • Loop Devices
  • LVM Logical Volumes
  • Encrypted LUKS Volumes
  • iSCSI Devices

Here’s an example using the with context manager to open a device and print its capacity with the best-human readable prefix (line 3):

1
2
3
4
5
>>> import bitmath
>>> with open("/dev/sda") as device:
...     size = bitmath.query_device_capacity(device).best_prefix()
...     print "Device %s capacity: %s (%s Bytes)" % (device.name, size, size_bytes)
Device /dev/sda capacity: 238.474937439 GiB (2.56060514304e+11 Bytes)

Important

Platform Notice: bitmath.query_device_capacity() is only verified to work on Linux and Mac OS X platforms. To file a bug report, please follow the instructions in the contributing section.

New in version 1.2.4.

1.2. Context Managers

This section describes all of the context managers provided by the bitmath class.

Note

For a bit of background, a context manager (specifically, the with statement) is a feature of the Python language which is commonly used to:

  • Decorate, or wrap, an arbitrary block of code. I.e., effect a certain condition onto a specific body of code
  • Automatically open and close an object which is used in a specific context. I.e., handle set-up and tear-down of objects in the place they are used.

See also

PEP 343
The “with” Statement
PEP 318
Decorators for Functions and Methods

1.2.1. bitmath.format()

bitmath.format([fmt_str=None[, plural=False[, bestprefix=False]]])

The bitmath.format() context manager allows you to specify the string representation of all bitmath instances within a specific block of code.

This is effectively equivalent to applying the format() method to an entire region of code.

Parameters:
  • fmt_str (str) – a formatting mini-language compat formatting string. See the instance attributes for a list of available items.
  • plural (bool) – True enables printing instances with trailing s‘s if they’re plural. False (default) prints them as singular (no trailing ‘s’)
  • bestprefix (bool) – True enables printing instances in their best human-readable representation. False, the default, prints instances using their current prefix unit.

Note

The bestprefix parameter is not yet implemented!

Let’s look at an example of toggling pluralization on and off. First we’ll look over a demonstration script (below), and then we’ll review the output.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
import bitmath

a_single_bit = bitmath.Bit(1)
technically_plural_bytes = bitmath.Byte(0)
always_plural_kbs = bitmath.kb(42)

formatting_args = {
    'not_plural': a_single_bit,
    'technically_plural': technically_plural_bytes,
    'always_plural': always_plural_kbs
}

print """None of the following will be pluralized, because that feature is turned off
"""

test_string = """   One unit of 'Bit': {not_plural}

   0 of a unit is typically said pluralized in US English: {technically_plural}

   several items of a unit will always be pluralized in normal US English
   speech: {always_plural}"""

print test_string.format(**formatting_args)

print """
----------------------------------------------------------------------
"""

print """Now, we'll use the bitmath.format() context manager
to print the same test string, but with pluralization enabled.
"""

with bitmath.format(plural=True):
    print test_string.format(**formatting_args)

The context manager is demonstrated in lines 3334. In these lines we use the bitmath.format() context manager, setting plural to True, to print the original string again. By doing this we have enabled pluralized string representations (where appropriate). Running this script would have the following output:

None of the following will be pluralized, because that feature is turned off

   One unit of 'Bit': 1.0 Bit

   0 of a unit is typically said pluralized in US English: 0.0 Byte

   several items of a unit will always be pluralized in normal US English
   speech: 42.0 kb

----------------------------------------------------------------------

Now, we'll use the bitmath.format() context manager
to print the same test string, but with pluralization enabled.

   One unit of 'Bit': 1.0 Bit

   0 of a unit is typically said pluralized in US English: 0.0 Bytes

   several items of a unit will always be pluralized in normal US English
   speech: 42.0 kbs

Here’s a shorter example, where we’ll:

  • Print a string containing bitmath instances using the default formatting (lines 23)
  • Use the context manager to print the instances in scientific notation (lines 47)
  • Print the string one last time to demonstrate how the formatting automatically returns to the default format (lines 89)
1
2
3
4
5
6
7
8
9
>>> import bitmath
>>> print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
Some instances: 0.333333333333 KiB, 512.0 Bit
>>> with bitmath.format("{value:e}-{unit}"):
...     print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
...
Some instances: 3.333333e-01-KiB, 5.120000e+02-Bit
>>> print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
Some instances: 0.333333333333 KiB, 512.0 Bit

New in version 1.0.8.

1.3. Module Variables

This section describes the module-level variables. Some of which are constants and are used for reference. Some of which effect output or behavior.

Changed in version 1.0.7: The formatting strings were not available for manupulate/inspection in earlier versions

New in version 1.1.1: Prior to this version ALL_UNIT_TYPES was not defined

Note

Modifying these variables will change the default representation indefinitely. Use the bitmath.format() context manager to limit changes to a specific block of code.

bitmath.format_string

This is the default string representation of all bitmath instances. The default value is {value} {unit} which, when evaluated, formats an instance as a floating point number with at least one digit of precision, followed by a character of whitespace, followed by the prefix unit of the instance.

For example, given bitmath instances representing the following values: 1337 MiB, 0.1234567 kb, and 0 B, their printed output would look like the following:

>>> from bitmath import *
>>> print MiB(1337), kb(0.1234567), Byte(0)
1337.0 MiB 0.1234567 kb 0.0 Byte

We can make these instances print however we want to. Let’s wrap each one in square brackets ([, ]), replace the separating space character with a hyphen (-), and limit the precision to just 2 digits:

>>> import bitmath
>>> bitmath.format_string = "[{value:.2f}-{unit}]"
>>> print bitmath.MiB(1337), bitmath.kb(0.1234567), bitmath.Byte(0)
[1337.00-MiB] [0.12-kb] [0.00-Byte]
bitmath.format_plural

A boolean which controls the pluralization of instances in string representation. The default is False.

If we wanted to enable pluralization we could set the format_plural variable to True. First, let’s look at some output using the default singular formatting.

>>> import bitmath
>>> print bitmath.MiB(1337)
1337.0 MiB

And now we’ll enable pluralization (line 2):

1
2
3
4
5
6
7
>>> import bitmath
>>> bitmath.format_plural = True
>>> print bitmath.MiB(1337)
1337.0 MiBs
>>> bitmath.format_plural = False
>>> print bitmath.MiB(1337)
1337.0 MiB

On line 5 we disable pluralization again and then see that the output has no trailing “s” character.

bitmath.NIST

Constant used as an argument to some functions to specify the NIST system.

bitmath.SI

Constant used as an argument to some functions to specify the SI system.

bitmath.SI_PREFIXES

An array of all of the SI unit prefixes (e.g., k, M, or E)

bitmath.SI_STEPS
SI_STEPS = {
    'Bit': 1 / 8.0,
    'Byte': 1,
    'k': 1000,
    'M': 1000000,
    'G': 1000000000,
    'T': 1000000000000,
    'P': 1000000000000000,
    'E': 1000000000000000000
}
bitmath.NIST_PREFIXES

An array of all of the NIST unit prefixes (e.g., Ki, Mi, or Ei)

bitmath.NIST_STEPS
NIST_STEPS = {
    'Bit': 1 / 8.0,
    'Byte': 1,
    'Ki': 1024,
    'Mi': 1048576,
    'Gi': 1073741824,
    'Ti': 1099511627776,
    'Pi': 1125899906842624,
    'Ei': 1152921504606846976
}
bitmath.ALL_UNIT_TYPES

An array of all combinations of known valid prefix units mixed with both bit and byte suffixes.

ALL_UNIT_TYPES = ['b', 'B', 'kb', 'kB', 'Mb', 'MB', 'Gb', 'GB',
   'Tb', 'TB', 'Pb', 'PB', 'Eb', 'EB', 'Kib', 'KiB', 'Mib',
   'MiB', 'Gib', 'GiB', 'Tib', 'TiB', 'Pib', 'PiB', 'Eib',
   'EiB']

1.4. 3rd Party Module Integrations

This section describes the various ways in which bitmath can be integrated with other 3rd pary modules.

To see a full demo of the argparse and progressbar integrations, as well as a comprehensive demonstrations of the full capabilities of the bitmath library, see Creating Download Progress Bars in the Real Life Examples section.

1.4.1. argparse

New in version 1.2.0.

The argparse module (part of stdlib) is used to parse command line arguments. By default, parsed options and arguments are turned into strings. However, one useful feature argparse provides is the ability to specify what datatype any given argument or option should be interpreted as.

bitmath.integrations.BitmathType(bmstring)

The BitmathType() factory creates objects that can be passed to the type argument of ArgumentParser.add_argument(). Arguments that have BitmathType() objects as their type will automatically parse the command line argument into a matching bitmath object.

Parameters:

bmstring (str) – The command-line option to parse into a bitmath object

Returns:

A bitmath object representing bmstring

Raises:
  • ValueError – on any input that bitmath.parse_string() already rejects
  • ValueError – on unquoted inputs with whitespace separating the value from the unit (e.g., --some-option 10 MiB is bad, but --some-option '10 MiB' is good)

Let’s take a look at a more in-depth example.

A feature found in many command-line utilities is the ability to specify some kind of file size using a string which roughly describes some kind of parameter. For example, let’s look at the du (disk usage) command. Invoking it as du -B allows one to specify a desired block-size scaling factor in printed results.

Let’s say we wanted to implement a similar mechanism in an application of our own. Except, instead of abbreviating down to ambiguous capital letters, we accept scaling factors as properly written values with associated units. Such as 10 MiB, or 1 MB.

To accomplish this, we’ll use argparse to create an argument parser and add one option to it, --block-size. This option will have a type of BitmathType() set.

1
2
3
4
5
6
7
>>> import argparse, bitmath
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--block-size', type=bitmath.BitmathType)
>>> args = "--block-size 1MiB"
>>> results = parser.parse_args(args.split())
>>> print type(results.block_size)
<class 'bitmath.MiB'>

On line 3 we add the --block-size option to the parser, explicitly defining it’s type as BitmathType(). In lines 6 and 7 when we parse the provided arguments we find that argparse has automatically created a bitmath object for us.

If an invalid scaling factor is provided by the user, such as one which does not represent a recognizable unit, the bitmath library will automatically detect this for us and signal to the argument parser that an error has occurred.

1.4.2. progressbar

New in version 1.2.1.

The progressbar module is typically used to display the progress of a long running task, such as a file transfer operation. The module provides widgets for custom formatting how exactly the ‘progress’ is displayed. Some examples include: overall percentage complete, estimated time until completion, and an ASCII progress bar which fills as the operation continues.

While progressbar already includes a widget suitable for displaying file transfer rates, this widget does not support customizing its presentation, and is limited to only prefix units from the SI system.

class bitmath.integrations.BitmathFileTransferSpeed([system=bitmath.NIST[, format="{value:.2f} {unit}/s"]])

The BitmathFileTransferSpeed class is a more functional replacement for the upstream FileTransferSpeed widget.

While both widgets are able to calculate average transfer rates over a period of time, the BitmathFileTransferSpeed widget adds new support for NIST prefix units (the upstream widget only supports SI prefix units).

In addition to NIST unit support, BitmathFileTransferSpeed enables the user to have full control over the look and feel of the displayed rates.

Parameters:
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. The preferred system of units for the printed rate.
  • format (string) – a formatting mini-language compat formatting string. Default {value:.2f} {unit}/s (e.g., 13.37 GiB/s)

Note

See instance attributes for a list of available formatting items. See the section on formatting bitmath instances for more information on this topic.

Use BitmathFileTransferSpeed exactly like the upstream FileTransferSpeed widget (example copied and modified from the progressbar project page):

1
2
3
4
5
6
7
8
9
>>> from progressbar import ProgressBar, Percentage, Bar, ETA, RotatingMarker
>>> from bitmath.integrations import BitmathFileTransferSpeed
>>> widgets = ['Something: ', Percentage(), ' ', Bar(marker=RotatingMarker()),
...           ' ', ETA(), ' ', BitmathFileTransferSpeed()]
>>> pbar = ProgressBar(widgets=widgets, maxval=10000000).start()
>>> for i in range(1000000):
...     # do something
...     pbar.update(10*i+1)
>>> pbar.finish()

If this was ran from a script we would see output similar to the following:

Something: 100% ||||||||||||||||||||||||||||||||||| Time: 0:00:01 9.27 MiB/s

If we wanted behavior identical to FileTransferSpeed we would set the system parameter to bitmath.SI (line 5 below):

1
2
3
4
5
6
7
>>> import bitmath
>>> # ...
>>> widgets = ['Something: ', Percentage(), ' ', Bar(marker=RotatingMarker()),
...           ' ', ETA(), ' ',
...           BitmathFileTransferSpeed(system=bitmath.SI)]
>>> pbar = ProgressBar(widgets=widgets, maxval=10000000).start()
>>> # ...

If this was ran from a script we would see output similar to the following:

Something: 100% ||||||||||||||||||||||||||||||||||| Time: 0:00:01 9.80 MB/s

Note how the only difference is in the displayed unit. The former example produced a rate with a unit of MiB (a NIST unit) whereas the latter examples unit is MB (an SI unit).

As noted previously, BitmathFileTransferSpeed allows for full control over the formatting of the calculated rate of transfer.

For example, if we wished to see the rate printed using more verbose language and plauralized units, we could do exactly that by constructing our widget in the following way:

BitmathFileTransferSpeed(format="{value:.2f} {unit_plural} per second")

And if this were run from a script like the previous examples:

Something: 100% ||||||||||||||||||||||||||||||||||| Time: 0:00:01 9.41 MiBs per second