Sequences in Python: The Type Bytes

A string is a sequence of characters, a sequence being defined as a collection within which order matters. Strings are commonly used for communication between computers and humans: to print headings and values on the screen, and to read objects in character string form. Humans deal with characters very well. The type bytes represents a sequence of integers, albeit small ones. A bytes object of length 1 holds a single 8-bit integer, that is, a value between 0 and 255. A bytes object of length greater than 1 is a sequence of such small integers. To be clear, if s is a string and b is a bytes object, then

s[i] is a character

b[i] is a small integer
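
The difference in indexing can be checked directly. The following is a minimal sketch; the variable names are chosen here for illustration and do not come from the text.

```python
s = "abc"     # a string
b = b"abc"    # a bytes object with the same ASCII content

first_char = s[0]        # indexing a string gives a string of length 1
first_byte = b[0]        # indexing a bytes object gives an integer (0 to 255)

byte_values = list(b)    # the whole object viewed as a list of small integers
one_byte_slice = b[0:1]  # slicing, unlike indexing, gives a bytes object

print(first_char, first_byte, byte_values, one_byte_slice)
```

Note that slicing a bytes object still yields bytes; only indexing with a single position yields an integer.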

A string constant (literal) is a sequence of characters enclosed in quotes. A bytes literal is a sequence of characters enclosed in quotes and preceded by the letter "b". Thus

'this is a string'

is a string, whereas

b'this is a string'

has type bytes. Many methods that apply to a string also apply to a bytes object, though not all of them (format(), for example, is defined only for strings), and bytes objects have some of their own. In particular, to convert a bytes object to a string, the decode() method is used, and a character encoding can be given as the parameter. If no parameter is given, then the encoding defaults to UTF-8. There are many possible encodings (e.g., utf-8, ascii, latin-1). To convert a bytes object b to a character string s, the following would work:

s = b.decode("utf-8")
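
A slightly fuller sketch of the round trip between strings and bytes may help. The sample text is an assumption made for illustration; it contains one non-ASCII character so that the choice of encoding matters. In Python 3, calling decode() with no argument uses UTF-8.

```python
text = "caf\u00e9"                  # "café"; the last character is not ASCII

encoded = text.encode("utf-8")      # str -> bytes; the accented letter takes 2 bytes
decoded = encoded.decode("utf-8")   # bytes -> str, naming the encoding explicitly
default_decoded = encoded.decode()  # no argument: UTF-8 is used

# Decoding with an encoding that cannot represent the data raises an error.
try:
    encoded.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print(len(text), len(encoded), decoded == text, ascii_failed)
```

The string has 4 characters but the bytes object has 5 elements, which shows that a character and a byte are not the same thing.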

A question remains: why is the bytes type needed? The bytes type implements the buffer interface, which gives other code direct access to the underlying raw bytes. Certain file operations require a buffer interface to accomplish their tasks. Anything read from a file opened in binary mode will be of the type bytes, for example, as it has that interface.
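
As an illustration, the sketch below writes a small file and reads it back twice, once in binary mode and once in text mode. The file name sample.bin and the temporary directory are hypothetical choices made only so that the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.bin")  # hypothetical file name

    # Binary mode ("wb") accepts bytes, not strings.
    with open(path, "wb") as f:
        f.write(b"hello\n")

    # Binary mode ("rb") returns bytes.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode decodes the same file contents into a string.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

# memoryview works on any object with the buffer interface, such as bytes.
view = memoryview(raw)
first_value = view[0]  # the integer stored in the first byte

print(type(raw).__name__, type(text).__name__, first_value)
```

The same data comes back as bytes or as a string depending only on the mode used to open the file.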

Source: Parker James R. (2021), Python: An Introduction to Programming, Mercury Learning and Information; Second edition.
