1. The bitmath Module

1.1. Functions

This section describes utility functions included in the bitmath module.

1.1.1. bitmath.getsize()

bitmath.getsize(path[, bestprefix=True[, system=NIST]])

Return a bitmath instance representing the size of a file at any given path.

Parameters:
  • path (string) – The path of a file to read the size of
  • bestprefix (bool) – Default: True, the returned instance will be in the best human-readable prefix unit. If set to False the result is a bitmath.Byte instance.
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. The preferred system of units for the returned instance.

Internally bitmath.getsize() calls os.path.realpath() before calling os.path.getsize() on any paths.
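
The lookup order described above can be approximated with the standard library alone. The following is a minimal sketch, not bitmath's actual source: apparent_size is a hypothetical helper name, and it returns a plain number of bytes rather than a bitmath instance.

```python
import os
import tempfile


def apparent_size(path):
    """Resolve symlinks in `path`, then return the file size in bytes.

    Hypothetical helper: mirrors the realpath-then-getsize order
    described above, but returns an int instead of a bitmath object.
    """
    real = os.path.realpath(path)
    return os.path.getsize(real)


# Demonstrate on a throwaway file holding exactly 1337 bytes.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"x" * 1337)
    demo_path = handle.name

demo_size = apparent_size(demo_path)
print(demo_size)
os.remove(demo_path)
```

Because the path is resolved first, a symbolic link to a file reports the size of the link's target rather than the size of the link itself.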

Here’s an example where we run bitmath.getsize() on the bitmath source code, using the defaults for the bestprefix and system parameters:

>>> import bitmath
>>> print bitmath.getsize('./bitmath/__init__.py')
33.3583984375 KiB

Let’s say we want to see the results in bytes. We can do this by setting bestprefix to False:

>>> import bitmath
>>> print bitmath.getsize('./bitmath/__init__.py', bestprefix=False)
34159.0 Byte

Recall that, by default, the result is represented with the best human-readable prefix. We can control which prefix system is used by setting system to either bitmath.NIST (the default) or bitmath.SI:

>>> print bitmath.getsize('./bitmath/__init__.py')
33.3583984375 KiB
>>> print bitmath.getsize('./bitmath/__init__.py', system=bitmath.NIST)
33.3583984375 KiB
>>> print bitmath.getsize('./bitmath/__init__.py', system=bitmath.SI)
34.159 kB

We can see in lines 1-4 that the same result is returned whether system is left unset or set to bitmath.NIST (the default).
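
The NIST and SI results differ only in the divisor applied to the same byte count. As a quick arithmetic check (plain Python, no bitmath objects involved), using the 34159-byte size reported in the earlier bestprefix=False example:

```python
size_in_bytes = 34159  # size reported with bestprefix=False above

# NIST (binary) prefixes step by 1024; SI (decimal) prefixes step by 1000.
size_kib = size_in_bytes / 1024.0
size_kb = size_in_bytes / 1000.0

print("%s KiB" % size_kib)
print("%s kB" % size_kb)
```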

New in version 1.0.7.

1.1.2. bitmath.listdir()

bitmath.listdir(search_base[, followlinks=False[, filter='*'[, relpath=False[, bestprefix=False[, system=NIST]]]]])

This is a generator which recurses a directory tree yielding 2-tuples of:

  • The absolute/relative path to a discovered file
  • A bitmath instance representing the apparent size of the file

Parameters:
  • search_base (string) – The directory to begin walking down
  • followlinks (bool) – Default: False, do not follow symbolic links to directories. Set to True to follow them
  • filter (string) – Default: * (everything). A glob to filter results with. See fnmatch for more details about globs
  • relpath (bool) – Default: False, returns the fully qualified path to each discovered file. Set to True to return the relative path from the present working directory to the discovered file. If relpath is False, then bitmath.listdir() internally calls os.path.realpath() to normalize path references
  • bestprefix (bool) – Default: False, returns bitmath.Byte instances. Set to True to return the best human-readable prefix unit for representation
  • system (One of bitmath.NIST or bitmath.SI) – Default: bitmath.NIST. The preferred system of units for the returned instances. Only takes effect when bestprefix is True

Note

  • This function does not return tuples for directory entities. Including directories in results is scheduled for introduction in the upcoming 1.1.0 release.
  • Symlinks to files are followed automatically

When interpreting the results from this function it is crucial to understand exactly which items are being taken into account, what decisions were made to select those items, and how their sizes are measured.

Results from this function may seem invalid when directly compared to the results from common command line utilities, such as du, or tree.

Let’s pretend we have a directory structure like the following:

some_files/
├── deeper_files/
│   └── second_file
└── first_file

Where some_files/ is a directory, and so is some_files/deeper_files/. There are two regular files in this tree:

  • some_files/first_file - 1337 Bytes
  • some_files/deeper_files/second_file - 13370 Bytes

The total size of the files in this tree is 1337 + 13370 = 14707 bytes.
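
That tree and its total can be reproduced with only the standard library. The sketch below approximates the selection rules described above and is not bitmath's implementation: walk_sizes is a hypothetical helper that yields plain byte counts instead of bitmath instances, and a temporary directory stands in for some_files/.

```python
import fnmatch
import os
import shutil
import tempfile


def walk_sizes(search_base, followlinks=False, pattern='*', relpath=False):
    """Yield (path, size_in_bytes) for each file under `search_base`.

    Hypothetical stand-in for the behaviour described above: directories
    are walked but never yielded, and sizes are apparent sizes in bytes.
    """
    for root, _dirs, files in os.walk(search_base, followlinks=followlinks):
        for name in fnmatch.filter(files, pattern):
            found = os.path.join(root, name)
            if relpath:
                path = os.path.relpath(found)
            else:
                path = os.path.realpath(found)
            # os.path.getsize() follows symlinks that point at files.
            yield path, os.path.getsize(found)


# Rebuild the example tree: one file at the top, one in a subdirectory.
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, 'deeper_files'))
with open(os.path.join(base, 'first_file'), 'wb') as handle:
    handle.write(b'a' * 1337)
with open(os.path.join(base, 'deeper_files', 'second_file'), 'wb') as handle:
    handle.write(b'b' * 13370)

sizes = sorted(size for _path, size in walk_sizes(base))
second_only = [size for _path, size in walk_sizes(base, pattern='second*')]
print(sizes)
print(sum(sizes))

shutil.rmtree(base)  # clean up the temporary tree
```

Only regular files contribute to the total here, and each contributes its apparent size. Tools such as du report allocated disk usage by default and also count the directories themselves, which is one reason their totals can differ.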

Let’s call bitmath.listdir() on the some_files/ directory and see what the results look like. First we’ll use all the default parameters, then we’ll set relpath to True:

>>> import bitmath
>>> for f in bitmath.listdir('./some_files'):
...     print f
...
('/tmp/tmp.P5lqtyqwPh/some_files/first_file', Byte(1337.0))
('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))
>>> for f in bitmath.listdir('./some_files', relpath=True):
...     print f
...
('some_files/first_file', Byte(1337.0))
('some_files/deeper_files/second_file', Byte(13370.0))

On lines 5 and 6 the results print the full path, whereas on lines 10 and 11 the path is relative to the present working directory.

Let’s play with the filter parameter now. Let’s say we only want to include results for files whose name begins with “second”:

>>> for f in bitmath.listdir('./some_files', filter='second*'):
...     print f
...
('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))

If we wish to avoid having to write for-loops, we can collect the results into a list rather simply:

>>> files = list(bitmath.listdir('./some_files'))
>>> print files
[('/tmp/tmp.P5lqtyqwPh/some_files/first_file', Byte(1337.0)), ('/tmp/tmp.P5lqtyqwPh/some_files/deeper_files/second_file', Byte(13370.0))]

Here’s a more advanced example where we will sum the size of all the returned results and then play around with the possible formatting. Recall that a bitmath instance representing the size of the discovered file is the second item in each returned tuple.

>>> discovered_files = [f[1] for f in bitmath.listdir('./some_files')]
>>> print discovered_files
[Byte(1337.0), Byte(13370.0)]
>>> print reduce(lambda x,y: x+y, discovered_files)
14707.0 Byte
>>> print reduce(lambda x,y: x+y, discovered_files).best_prefix()
14.3623046875 KiB
>>> print reduce(lambda x,y: x+y, discovered_files).best_prefix().format("{value:.3f} {unit}")
14.362 KiB
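
The session above is written for Python 2, where reduce is a builtin. On Python 3, reduce must be imported from functools. The following sketch repeats the same arithmetic with plain integers standing in for the Byte instances, so it runs without bitmath installed:

```python
from functools import reduce  # builtin on Python 2, functools on Python 3

# Plain byte counts standing in for the Byte(1337.0) and Byte(13370.0)
# instances above; no bitmath objects are involved in this sketch.
discovered_sizes = [1337, 13370]

total_bytes = reduce(lambda x, y: x + y, discovered_sizes)
total_kib = total_bytes / 1024.0  # one NIST step: 1 KiB = 1024 bytes

print(total_bytes)
print("{0:.3f} KiB".format(total_kib))
```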

New in version 1.0.7.

1.1.3. bitmath.parse_string()

bitmath.parse_string(str_repr)

Parse a string representing a unit into a proper bitmath object. All non-string inputs are rejected and will raise a ValueError. Strings without units are also rejected. See the examples below for additional clarity.

Parameters: str_repr (string) – The string to parse. May contain whitespace between the value and the unit.
Returns: A bitmath object representing str_repr
Raises: ValueError – if str_repr cannot be parsed

A simple usage example:

>>> import bitmath
>>> a_dvd = bitmath.parse_string("4.7 GiB")
>>> print type(a_dvd)
<class 'bitmath.GiB'>
>>> print a_dvd
4.7 GiB

Caution

Caution is advised if you are reading values from an unverified external source, such as output from a shell command or a generated file. Many applications (even /usr/bin/ls) still do not produce file size strings with valid (or even correct) prefix units unless specially configured to do so.

To protect your application from unexpected runtime errors it is recommended that calls to bitmath.parse_string() are wrapped in a try statement:

>>> import bitmath
>>> try:
...     a_dvd = bitmath.parse_string("4.7 G")
... except ValueError:
...     print "Error while parsing string into bitmath object"
...
Error while parsing string into bitmath object

Here we can see some more examples of invalid input, as well as two acceptable inputs:

>>> import bitmath
>>> sizes = [ 1337, 1337.7, "1337", "1337.7", "1337 B", "1337B" ]
>>> for size in sizes:
...     try:
...         print "Parsed size into %s" % bitmath.parse_string(size).best_prefix()
...     except ValueError:
...         print "Could not parse input: %s" % size
...
Could not parse input: 1337
Could not parse input: 1337.7
Could not parse input: 1337
Could not parse input: 1337.7
Parsed size into 1.3056640625 KiB
Parsed size into 1.3056640625 KiB
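
The accept/reject behaviour shown above can be imitated with a small regular expression. This is only an illustrative sketch under stated assumptions, not how bitmath.parse_string() is implemented: split_size is a hypothetical helper that returns a (value, unit) tuple instead of a bitmath object, and its unit table is copied from the ALL_UNIT_TYPES list in the Module Variables section.

```python
import re

# Unit names copied from the ALL_UNIT_TYPES list in Module Variables.
KNOWN_UNITS = ['b', 'B', 'kb', 'kB', 'Mb', 'MB', 'Gb', 'GB', 'Tb', 'TB',
               'Pb', 'PB', 'Eb', 'EB', 'Kib', 'KiB', 'Mib', 'MiB', 'Gib',
               'GiB', 'Tib', 'TiB', 'Pib', 'PiB', 'Eib', 'EiB']

# A number, optional whitespace, then a run of letters for the unit.
_SIZE_RE = re.compile(r'^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$')


def split_size(str_repr):
    """Return (value, unit) for input like '4.7 GiB', else raise ValueError."""
    if not isinstance(str_repr, str):
        raise ValueError("not a string: %r" % (str_repr,))
    match = _SIZE_RE.match(str_repr)
    if match is None or match.group(2) not in KNOWN_UNITS:
        raise ValueError("no recognizable unit in %r" % (str_repr,))
    return float(match.group(1)), match.group(2)


accepted, rejected = [], []
for candidate in [1337, "1337", "4.7 G", "1337 B", "1337B", "4.7 GiB"]:
    try:
        accepted.append(split_size(candidate))
    except ValueError:
        rejected.append(candidate)

print(accepted)
print(rejected)
```

As in the examples above, numbers, unit-less strings, and strings with an unknown unit such as "4.7 G" all end up in the rejected list.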

New in version 1.1.0.

1.2. Context Managers

This section describes all of the context managers provided by the bitmath module.

Note

For a bit of background, a context manager (specifically, the with statement) is a feature of the Python language which is commonly used to:

  • Decorate, or wrap, an arbitrary block of code, i.e., apply a certain condition to a specific body of code
  • Automatically open and close an object which is used in a specific context, i.e., handle set-up and tear-down of objects in the place they are used

See also

PEP 343
The “with” Statement
PEP 318
Decorators for Functions and Methods

1.2.1. bitmath.format()

bitmath.format([fmt_str=None[, plural=False[, bestprefix=False]]])

The bitmath.format() context manager allows you to specify the string representation of all bitmath instances within a specific block of code.

This is effectively equivalent to applying the format() method to an entire region of code.

Parameters:
  • fmt_str (str) – A format string compatible with Python’s format specification mini-language. See the instance attributes for a list of available items.
  • plural (bool) – True enables printing instances with a trailing ‘s’ if they’re plural. False (default) prints them as singular (no trailing ‘s’)
  • bestprefix (bool) – True enables printing instances in their best human-readable representation. False, the default, prints instances using their current prefix unit.

Note

The bestprefix parameter is not yet implemented!

Let’s look at an example of toggling pluralization on and off. First we’ll look over a demonstration script (below), and then we’ll review the output.

import bitmath

a_single_bit = bitmath.Bit(1)
technically_plural_bytes = bitmath.Byte(0)
always_plural_kbs = bitmath.kb(42)

formatting_args = {
    'not_plural': a_single_bit,
    'technically_plural': technically_plural_bytes,
    'always_plural': always_plural_kbs
}

print """None of the following will be pluralized, because that feature is turned off
"""

test_string = """   One unit of 'Bit': {not_plural}

   0 of a unit is typically said pluralized in US English: {technically_plural}

   several items of a unit will always be pluralized in normal US English
   speech: {always_plural}"""

print test_string.format(**formatting_args)

print """
----------------------------------------------------------------------
"""

print """Now, we'll use the bitmath.format() context manager
to print the same test string, but with pluralization enabled.
"""

with bitmath.format(plural=True):
    print test_string.format(**formatting_args)

The context manager is demonstrated in lines 33-34. In these lines we use the bitmath.format() context manager, setting plural to True, to print the original string again. By doing this we have enabled pluralized string representations (where appropriate). Running this script would have the following output:

None of the following will be pluralized, because that feature is turned off

   One unit of 'Bit': 1.0 Bit

   0 of a unit is typically said pluralized in US English: 0.0 Byte

   several items of a unit will always be pluralized in normal US English
   speech: 42.0 kb

----------------------------------------------------------------------

Now, we'll use the bitmath.format() context manager
to print the same test string, but with pluralization enabled.

   One unit of 'Bit': 1.0 Bit

   0 of a unit is typically said pluralized in US English: 0.0 Bytes

   several items of a unit will always be pluralized in normal US English
   speech: 42.0 kbs
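
Judging only from the output above, the plural form appears to be used for every value other than exactly one. A minimal sketch of that rule follows; pluralize is a hypothetical helper written for illustration and is not taken from bitmath's source.

```python
def pluralize(value, unit, plural=False):
    """Render `value unit`, appending 's' when plural is on and value != 1."""
    suffix = 's' if plural and value != 1 else ''
    return "{0} {1}{2}".format(float(value), unit, suffix)


print(pluralize(1, 'Bit', plural=True))
print(pluralize(0, 'Byte', plural=True))
print(pluralize(42, 'kb', plural=True))
print(pluralize(42, 'kb'))
```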

Here’s a shorter example, where we’ll:

  • Print a string containing bitmath instances using the default formatting (lines 2-3)
  • Use the context manager to print the instances in scientific notation (lines 4-7)
  • Print the string one last time to demonstrate how the formatting automatically returns to the default format (lines 8-9)

>>> import bitmath
>>> print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
Some instances: 0.333333333333 KiB, 512.0 Bit
>>> with bitmath.format("{value:e}-{unit}"):
...     print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
...
Some instances: 3.333333e-01-KiB, 5.120000e+02-Bit
>>> print "Some instances: %s, %s" % (bitmath.KiB(1 / 3.0), bitmath.Bit(512))
Some instances: 0.333333333333 KiB, 512.0 Bit
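
The automatic return to the default format is the usual save-and-restore pattern for context managers. The sketch below shows that pattern with contextlib. It is an assumption-laden illustration rather than bitmath's code: settings, render, and temporary_format are hypothetical names standing in for the module-level default and the bitmath.format() context manager.

```python
from contextlib import contextmanager

# Stand-in for a module-level default such as bitmath.format_string.
settings = {'format_string': "{value} {unit}"}


def render(value, unit):
    """Format a value with whatever format string is currently active."""
    return settings['format_string'].format(value=value, unit=unit)


@contextmanager
def temporary_format(fmt_str):
    """Swap in `fmt_str` for the duration of a with block, then restore."""
    previous = settings['format_string']
    settings['format_string'] = fmt_str
    try:
        yield
    finally:
        # Runs even if the block raises, so the default always comes back.
        settings['format_string'] = previous


before = render(512.0, 'Bit')
with temporary_format("{value:e}-{unit}"):
    inside = render(512.0, 'Bit')
after = render(512.0, 'Bit')

print(before)
print(inside)
print(after)
```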

New in version 1.0.8.

1.3. 3rd Party Module Integrations

This section describes the various ways in which bitmath can be integrated with other 3rd party modules.

1.3.1. argparse

New in version 1.1.1.

The argparse module (part of stdlib) is used to parse command line arguments. By default, parsed options and arguments are turned into strings. However, one useful feature argparse provides is the ability to specify what datatype any given argument or option should be interpreted as.

bitmath.BitmathType(bmstring)

The BitmathType() factory creates objects that can be passed to the type argument of ArgumentParser.add_argument(). Arguments that have BitmathType() objects as their type will automatically parse the command line argument into a matching bitmath object.

Parameters:

bmstring (str) – The command-line option to parse into a bitmath object

Returns:

A bitmath object representing bmstring

Raises:
  • ValueError – on any input that bitmath.parse_string() already rejects
  • ValueError – on unquoted inputs with whitespace separating the value from the unit (e.g., --some-option 10 MiB is bad, but --some-option '10 MiB' is good)
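
The quoting requirement comes from how the shell splits a command line before argparse ever sees it. The standard library's shlex.split() follows POSIX shell rules closely enough to show the difference:

```python
import shlex

# How a POSIX shell would split each command line into arguments.
unquoted = shlex.split("--some-option 10 MiB")
quoted = shlex.split("--some-option '10 MiB'")

print(unquoted)
print(quoted)
```

In the unquoted case the option receives only "10", and "MiB" arrives as a separate argument. In the quoted case the value and its unit arrive together as one string.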

Let’s take a look at a more in-depth example.

A feature found in many command-line utilities is the ability to specify some kind of file size using a string which roughly describes some kind of parameter. For example, let’s look at the du (disk usage) command. Invoking it as du -B allows one to specify a desired block-size scaling factor in printed results.

Let’s say we wanted to implement a similar mechanism in an application of our own. Instead of abbreviating down to ambiguous capital letters, however, we accept scaling factors as properly written values with associated units, such as 10 MiB or 1 MB.

To accomplish this, we’ll use argparse to create an argument parser and add one option to it, --block-size. This option will have a type of BitmathType() set.

>>> import argparse, bitmath
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--block-size', type=bitmath.BitmathType)
>>> args = "--block-size 1MiB"
>>> results = parser.parse_args(args.split())
>>> print type(results.block_size)
<class 'bitmath.MiB'>

On line 3 we add the --block-size option to the parser, explicitly defining its type as BitmathType(). In lines 6 and 7 when we parse the provided arguments we find that argparse has automatically created a bitmath object for us.

If an invalid scaling factor is provided by the user, such as one which does not represent a recognizable unit, the bitmath library will automatically detect this for us and signal to the argument parser that an error has occurred.
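
The general argparse mechanism behind this kind of error signalling can be shown without bitmath. When a type callable raises argparse.ArgumentTypeError (or ValueError or TypeError), argparse prints a usage message and exits. In the sketch below, size_type is a hypothetical stand-in for BitmathType() with a deliberately tiny unit table, and it returns a tuple rather than a bitmath object.

```python
import argparse


def size_type(text):
    """Hypothetical argparse type: accept '<number><unit>' with a known unit."""
    for unit in ('KiB', 'MiB', 'GiB'):  # deliberately tiny unit table
        if text.endswith(unit):
            try:
                return float(text[:-len(unit)]), unit
            except ValueError:
                break
    raise argparse.ArgumentTypeError("%r is not a recognizable size" % text)


parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('--block-size', type=size_type)

good = parser.parse_args(['--block-size', '1MiB'])
print(good.block_size)

try:
    parser.parse_args(['--block-size', '1Mi'])
except SystemExit as stop:
    # argparse prints a usage message to stderr, then exits with status 2.
    exit_code = stop.code
    print("parser exited with status %s" % exit_code)
```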

1.4. Module Variables

This section describes the module-level variables. Some are constants used for reference; others affect output or behavior.

Changed in version 1.0.7: The formatting strings were not available for manipulation/inspection in earlier versions

New in version 1.1.1: Prior to this version ALL_UNIT_TYPES was not defined

Note

Modifying these variables will change the default representation indefinitely. Use the bitmath.format() context manager to limit changes to a specific block of code.

bitmath.format_string

This is the default string representation of all bitmath instances. The default value is {value} {unit} which, when evaluated, formats an instance as a floating point number with at least one digit of precision, followed by a character of whitespace, followed by the prefix unit of the instance.

For example, given bitmath instances representing the following values: 1337 MiB, 0.1234567 kb, and 0 B, their printed output would look like the following:

>>> from bitmath import *
>>> print MiB(1337), kb(0.1234567), Byte(0)
1337.0 MiB 0.1234567 kb 0.0 Byte

We can make these instances print however we want to. Let’s wrap each one in square brackets ([, ]), replace the separating space character with a hyphen (-), and limit the precision to just 2 digits:

>>> import bitmath
>>> bitmath.format_string = "[{value:.2f}-{unit}]"
>>> print bitmath.MiB(1337), bitmath.kb(0.1234567), bitmath.Byte(0)
[1337.00-MiB] [0.12-kb] [0.00-Byte]

bitmath.format_plural

A boolean which controls the pluralization of instances in string representation. The default is False.

If we wanted to enable pluralization we could set the format_plural variable to True. First, let’s look at some output using the default singular formatting.

>>> import bitmath
>>> print bitmath.MiB(1337)
1337.0 MiB

And now we’ll enable pluralization (line 2):

>>> import bitmath
>>> bitmath.format_plural = True
>>> print bitmath.MiB(1337)
1337.0 MiBs
>>> bitmath.format_plural = False
>>> print bitmath.MiB(1337)
1337.0 MiB

On line 5 we disable pluralization again and then see that the output has no trailing “s” character.

bitmath.NIST

Constant used as an argument to some functions to specify the NIST system.

bitmath.SI

Constant used as an argument to some functions to specify the SI system.

bitmath.SI_PREFIXES

An array of all of the SI unit prefixes (e.g., k, M, or E)

bitmath.SI_STEPS

SI_STEPS = {
    'Bit': 1 / 8.0,
    'Byte': 1,
    'k': 1000,
    'M': 1000000,
    'G': 1000000000,
    'T': 1000000000000,
    'P': 1000000000000000,
    'E': 1000000000000000000
}

bitmath.NIST_PREFIXES

An array of all of the NIST unit prefixes (e.g., Ki, Mi, or Ei)

bitmath.NIST_STEPS

NIST_STEPS = {
    'Bit': 1 / 8.0,
    'Byte': 1,
    'Ki': 1024,
    'Mi': 1048576,
    'Gi': 1073741824,
    'Ti': 1099511627776,
    'Pi': 1125899906842624,
    'Ei': 1152921504606846976
}

bitmath.ALL_UNIT_TYPES

An array of all combinations of known valid prefix units mixed with both bit and byte suffixes.

ALL_UNIT_TYPES = ['b', 'B', 'kb', 'kB', 'Mb', 'MB', 'Gb', 'GB',
   'Tb', 'TB', 'Pb', 'PB', 'Eb', 'EB', 'Kib', 'KiB', 'Mib',
   'MiB', 'Gib', 'GiB', 'Tib', 'TiB', 'Pib', 'PiB', 'Eib',
   'EiB']
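
The step tables and the unit list above follow a regular pattern, which the sketch below reconstructs in plain Python. It is not how bitmath defines these variables. The prefix lists and their ordering are assumptions inferred from the SI_STEPS and NIST_STEPS tables, and the lowercase names are local to this example.

```python
# Prefix order follows the SI_STEPS and NIST_STEPS tables above.
si_prefixes = ['k', 'M', 'G', 'T', 'P', 'E']
nist_prefixes = ['Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei']

# Each prefix is one more power of the system's base (1000 or 1024).
si_steps = dict((p, 1000 ** (i + 1)) for i, p in enumerate(si_prefixes))
nist_steps = dict((p, 1024 ** (i + 1)) for i, p in enumerate(nist_prefixes))

# Bare units first, then every prefix paired with a bit and a byte suffix.
all_unit_types = ['b', 'B']
for prefix in si_prefixes + nist_prefixes:
    all_unit_types.extend([prefix + 'b', prefix + 'B'])

print(nist_steps['Ei'])
print(all_unit_types)
```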