BitVector¶
BitVector is a class for bit-level manipulation of data. It is designed to mirror Python’s built-in bytearray class: it supports almost all of the same methods, but with bit-level granularity.
[2]:
from bytemaker.bitvector import BitVector
from bitarray import bitarray
Construction¶
BitVector supports almost all of the same construction options as bytearray. The one exception is construction from a string with an encoding; in that case, encode the string to bytes first, or use a constructed ByteArray or String bittype.
[202]:
# Default constructor
ba_empty = bytearray()
bv_empty = BitVector()
print(f"Default construction:\n{ba_empty}\n{bv_empty}")
print("------------------------------------------------")
# Construct from list
ba = bytearray([1, 2, 3, 2, 255])
bv = BitVector([0, 1, 1, 0, 1])
print(f"List construction:\n{ba}\n{bv}")
print("------------------------------------------------")
# Construct from bytes
ba = bytearray(b"Hello, World!")
bv = BitVector(b"Hello, World!")
print(f"Bytes construction:\n{ba}\n{bv}")
print("------------------------------------------------")
# Construct from int
ba = bytearray(10)
bv = BitVector(10)
print(f"Int construction:\n{ba}\n{bv}")
print("------------------------------------------------")
# Construct from string
ba = bytearray("Hello, World!", "utf-16")
# Unsupported for BitVector. However, the below is possible
bv = BitVector(bytes("Hello, World!", "utf-16"))
print(f"String construction:\n{ba}\n{bv}")
Default construction:
bytearray(b'')
BitVector('')
------------------------------------------------
List construction:
bytearray(b'\x01\x02\x03\x02\xff')
BitVector('01101')
------------------------------------------------
Bytes construction:
bytearray(b'Hello, World!')
BitVector('01001000 01100101 01101100 01101100 01101111 00101100 00100000 01010111 01101111 01110010 01101100 01100100 00100001')
------------------------------------------------
Int construction:
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
BitVector('00000000 00')
------------------------------------------------
String construction:
bytearray(b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00W\x00o\x00r\x00l\x00d\x00!\x00')
BitVector('11111111 11111110 01001000 00000000 01100101 00000000 01101100 00000000 01101100 00000000 01101111 00000000 00101100 00000000 00100000 00000000 01010111 00000000 01101111 00000000 01110010 00000000 01101100 00000000 01100100 00000000 00100001 00000000')
BitVectors can also be obtained by generating BitTypes objects from Pythonic types and taking the underlying BitVector. See BitTypes for more.
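To make the bit ordering in the bytes and int constructions concrete, here is a minimal pure-Python sketch that does not use bytemaker. The helper names `bits_from_bytes` and `zero_bits` are hypothetical and only mirror the output shown in this section: each byte expands to eight bits, most significant bit first, and an int argument yields that many zero bits (whereas bytearray(10) yields ten zero bytes).

```python
def bits_from_bytes(data: bytes) -> list[int]:
    """Expand each byte into eight bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def zero_bits(count: int) -> list[int]:
    """Return `count` zero bits, analogous to what BitVector(10) displays."""
    return [0] * count


# 'H' is 0x48, which matches the first group of the bytes construction output
print(bits_from_bytes(b"H"))  # [0, 1, 0, 0, 1, 0, 0, 0]
print(zero_bits(10))
```

This is only an illustration of the layout; it says nothing about how BitVector stores its bits internally.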
Presentation Options¶
BitVector objects support the same presentation options as bytearray, and then some. Bits can be shown in binary, octal, hex, or a user-specified base. Each of these options supports custom separators and bytes-per-sep spacers. Finally, a BitVector can be converted to a list of bits or decoded to text with to_chararray, as the cell below shows.
[13]:
the_bitvector = BitVector(bytes("Hello, Worlds!", "utf-16"))
print(bytes(the_bitvector))
print(the_bitvector.tobase(64))
print(the_bitvector.hex(sep=" ", bytes_per_sep=4))
print(the_bitvector.oct())
print(the_bitvector.tobase(4, sep="_", bytes_per_sep=2))
print(the_bitvector.bin())
print(list(the_bitvector))
print(the_bitvector.to_chararray(encoding="utf-16"))
b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00W\x00o\x00r\x00l\x00d\x00s\x00!\x00'
//5IAGUAbABsAG8ALAAgAFcAbwByAGwAZABzACEA
0xfffe4800 65006c00 6c006f00 2c002000 57006f00 72006c00 64007300 2100
0o77777110000624003300015400067400130000400005340033600162000660003100016300020400
33333332_10200000_12110000_12300000_12300000_12330000_02300000_02000000_11130000_12330000_13020000_12300000_12100000_13030000_02010000
0b111111111111111001001000000000000110010100000000011011000000000001101100000000000110111100000000001011000000000000100000000000000101011100000000011011110000000001110010000000000110110000000000011001000000000001110011000000000010000100000000
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
Hello, Worlds!
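For comparison, the following sketch reproduces a few of these presentations with the standard library only, working on the same bytes. It is not how BitVector implements them; it is a cross-check for this byte-aligned input. The BOM is written out explicitly so the result does not depend on the platform's native byte order, and the strings lack the `0x` and `0b` prefixes that BitVector prints.

```python
import base64

# Same payload as the cell above, with an explicit little-endian BOM
data = b"\xff\xfe" + "Hello, Worlds!".encode("utf-16-le")

# Hex grouped four bytes at a time; a negative bytes_per_sep counts from the left
hex_grouped = data.hex(" ", -4)
print(hex_grouped)

# Binary string, eight bits per byte, most significant bit first
binary = "".join(f"{byte:08b}" for byte in data)
print(binary)

# Standard Base64 of the same bytes
b64 = base64.b64encode(data).decode("ascii")
print(b64)

# Decode back to text; the "utf-16" codec consumes the BOM
print(data.decode("utf-16"))
```

Note that the built-in bytes.hex groups from the right when bytes_per_sep is positive, which is why the sketch passes -4 to match the left-to-right grouping in the BitVector output.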
__bytes__ and the buffer protocol¶
Additionally, BitVector supports __bytes__, meaning it can be readily cast to bytes objects. For BitVectors that are not byte-aligned, zeros are appended on the right to fill the final byte.
Warning¶
A BitVector can also be cast to a bytearray, but note that BitVector currently supports the buffer protocol, so a bytearray constructed from a BitVector is initialized from the BitVector's underlying buffer rather than from __bytes__. If the BitVector's length is not a multiple of 8 bits (it is not byte-aligned), the unused bits of the final byte are not guaranteed to be zero, which can lead to ambiguous results. To get around this, cast to bytes first (which prioritizes __bytes__).
[198]:
ba_1 = bytearray(b"Hello, World!")
bv_1_unaligned = BitVector("10101010 0001")
ba_2 = bytearray(bv_1_unaligned)
ba_2_with_alignment = bytearray(bytes(bv_1_unaligned))
bv_2 = BitVector(ba_1)
print(f"From bytearray to BitVector:\n{bv_2}")
print(f"From BitVector to bytearray:\nUnaligned: {ba_2}\nAligned: {ba_2_with_alignment}")
From bytearray to BitVector:
BitVector('01001000 01100101 01101100 01101100 01101111 00101100 00100000 01010111 01101111 01110010 01101100 01100100 00100001')
From BitVector to bytearray:
Unaligned: bytearray(b'\xaa\x12')
Aligned: bytearray(b'\xaa\x10')
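The right-padding rule behind the "Aligned" result can be sketched in plain Python. The function name `pack_bits_right_padded` is hypothetical and not part of bytemaker; it only shows the arithmetic of padding a bit string with zeros up to the next byte boundary.

```python
def pack_bits_right_padded(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, zero-padding on the right."""
    bits = bits.replace(" ", "")   # allow spaced groups such as "10101010 0001"
    padding = (-len(bits)) % 8     # bits needed to reach a byte boundary
    padded = bits + "0" * padding
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


# Twelve bits become two bytes; the last four bits of the second byte are zero
print(pack_bits_right_padded("10101010 0001"))  # b'\xaa\x10'
```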
Magic Methods¶
BitVector supports the following magic methods:
Unary operations: __invert__
Binary operations: __and__, __or__, __xor__, __eq__, __ne__, __lt__, __le__, __gt__, __ge__
Shifting: __lshift__, __rshift__
Indexing and iteration: __getitem__, __setitem__, __delitem__, __len__, __iter__
String representation: __str__, __repr__, __format__
Concatenation: __add__, __radd__, __iadd__, __mul__, __rmul__, __imul__
Right operations: __rand__, __ror__, __rxor__, __rlshift__, __rrshift__
Other: __contains__, __sizeof__, __copy__, __deepcopy__
[24]:
print("Initial BitVector")
bv = BitVector("111110000")
print(bv)
print("Shifting by 3 << then >>")
bv = bv << 3
print(bv)
bv = bv >> 3
print("----------------------")
print("Concatenation")
print("BitVector bv:", bv)
bv2 = bv + bv
print("bv2 = bv + bv:", bv2)
bv3 = bv2 * 2
print("bv3 = bv2 * 2:", bv3)
print("----------------------")
print("Bitwise operations")
bv1 = bv & bv << 1
bv2 = bv | bv << 1
print(bv, "&", bv<<1, ":", bv1)
print(bv, "|", bv<<1, ":", bv2)
print(f"~{bv}:", ~bv)
print("----------------------")
print("Indexing")
print("[i for i in bv]:", [i for i in bv])
print("[bv[i] for i in range(0, len(bv))]:", [bv[i] for i in range(0, len(bv))])
print("bv[0:4]", bv[0:4])
print("----------------------")
print("Comparisons")
print(f"{bv} == {bv}:", bv == bv)
print(f"{bv} == {bv >> 1}:", bv == bv >> 1)
print(f"{bv} < {bv << 1}:", bv < bv << 1)
Initial BitVector
BitVector('11111000 0')
Shifting by 3 << then >>
BitVector('11000000 0')
----------------------
Concatenation
BitVector bv: BitVector('00011000 0')
bv2 = bv + bv: BitVector('00011000 00001100 00')
bv3 = bv2 * 2: BitVector('00011000 00001100 00000110 00000011 0000')
----------------------
Bitwise operations
BitVector('00011000 0') & BitVector('00110000 0') : BitVector('00010000 0')
BitVector('00011000 0') | BitVector('00110000 0') : BitVector('00111000 0')
~BitVector('00011000 0'): BitVector('11100111 1')
----------------------
Indexing
[i for i in bv]: [0, 0, 0, 1, 1, 0, 0, 0, 0]
[bv[i] for i in range(0, len(bv))]: [0, 0, 0, 1, 1, 0, 0, 0, 0]
bv[0:4] BitVector('0001')
----------------------
Comparisons
BitVector('00011000 0') == BitVector('00011000 0'): True
BitVector('00011000 0') == BitVector('00001100 0'): False
BitVector('00011000 0') < BitVector('00110000 0'): True
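The shift and bitwise results above are consistent with fixed-length semantics: a shift keeps the number of bits constant, discards bits that fall off one end, and fills with zeros at the other. The sketch below models that on plain bit strings. The helper names are hypothetical, it assumes both operands of the bitwise helpers have equal length, and it does not describe how BitVector treats operands of different lengths.

```python
def shift_left(bits: str, n: int) -> str:
    """Drop n bits on the left and append n zeros on the right."""
    n = min(n, len(bits))
    return bits[n:] + "0" * n


def shift_right(bits: str, n: int) -> str:
    """Drop n bits on the right and prepend n zeros on the left."""
    n = min(n, len(bits))
    return "0" * n + bits[:len(bits) - n]


def bitwise_and(a: str, b: str) -> str:
    """AND two equal-length bit strings position by position."""
    return "".join("1" if x == y == "1" else "0" for x, y in zip(a, b))


def bitwise_or(a: str, b: str) -> str:
    """OR two equal-length bit strings position by position."""
    return "".join("1" if "1" in (x, y) else "0" for x, y in zip(a, b))


bv = shift_right(shift_left("111110000", 3), 3)
print(bv)                                  # 000110000
print(bitwise_and(bv, shift_left(bv, 1)))  # 000100000
print(bitwise_or(bv, shift_left(bv, 1)))   # 001110000
```

Because the leading ones are discarded by the left shift, shifting back to the right does not restore the original value.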
[ ]:
## Location and partitioning