#Python's strings

A string is an immutable sequence of characters. Strings are among the most commonly used types in Python and come with a rich set of features.

We have already touched upon strings in the section Basic Syntax - Variables and Basic Types.

#Encoding and bytes

Computers store data in binary using electronic components, so characters must be mapped to binary values. This mapping is called encoding.
For example, the English letter A corresponds to the binary value 01000001 (decimal 65) in ASCII encoding.
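
The mapping for the letter A can be checked with the built-in functions ord() and chr(). This is a minimal sketch; the variable names are only illustrative:

```python
# Look up the number that the character 'A' maps to.
code_point = ord('A')
print(code_point)                    # 65

# Show the same number as 8 binary digits.
binary = format(code_point, '08b')
print(binary)                        # 01000001

# chr() is the inverse of ord(): number back to character.
print(chr(65))                       # A

# In ASCII, the character is stored as a single byte with that value.
print('A'.encode('ascii')[0])        # 65
```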

The smallest addressable unit of storage in a computer is usually the byte, which consists of 8 bits.
In Python, there is a type called bytes that stores a sequence of bytes.

A bytes literal looks like a string literal but is prefixed with b, such as b'hello world'.

Converting a string to bytes is called encoding, and converting bytes back to a string is called decoding. Both str.encode() and bytes.decode() use UTF-8 by default:

    data: bytes = b'hello world'
    print(data)
    text: str = data.decode()  # decode
    print(text)
    print(text.encode())  # encode

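The same text can map to different bytes under different encodings, so it can help to name the encoding explicitly. The sketch below assumes the 'gbk' codec is available, as it is in standard CPython builds; the variable names are illustrative:

```python
text = '你好'

# Encode the same two characters with two different encodings.
utf8_bytes = text.encode('utf-8')
gbk_bytes = text.encode('gbk')

print(len(utf8_bytes), len(gbk_bytes))   # 6 4

# Decoding must use the same encoding that produced the bytes.
print(utf8_bytes.decode('utf-8'))        # 你好
print(gbk_bytes.decode('gbk'))           # 你好
```

Decoding with the wrong encoding either raises UnicodeDecodeError or silently produces garbled text, depending on the bytes involved.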

Bytes may look like strings with a b prefix, but there are key differences:

  • The elements of a bytes object are bytes (integers from 0 to 255), while the elements of a string are characters (one character may occupy multiple bytes once encoded)
  • Bytes can store non-text data, such as images

    text: str = '你好世界'
    print(text)
    data: bytes = text.encode()  # encode (UTF-8 by default)
    print(data)
    print(len(text), len(data))  # lengths differ: 4 characters, 12 bytes
    # text[1] is the whole Chinese character '好', while data[1] is
    # the second byte of the character '你', given as an integer
    print(text[1], data[1])

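To illustrate the second point, here is a small sketch using the 8-byte signature that every PNG image file starts with. The data is not text, so trying to decode it as UTF-8 fails; the names png_signature and decoded_ok are illustrative:

```python
# The first 8 bytes of every PNG file (the PNG signature).
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
print(png_signature)

# Indexing a bytes object gives an int, not a one-character string.
print(png_signature[1])      # 80

# Arbitrary binary data is usually not valid text in any given encoding.
try:
    png_signature.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError as error:
    decoded_ok = False
    print('Not valid UTF-8 text:', error.reason)
```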

#Old-style string formatting

In programming, you often need to create strings based on variables. You can use the % operator for formatting, with the syntax:

"format string" % (value1, value2, ...) # values are in a tuple

If there is only one value and it is not itself a tuple, you can omit the tuple:

"format string" % value

For example:

    print("Pork price is %d yuan per jin" % 15)
    print("%d jin of pork costs %d yuan" % (3, 3*15))

Here %d is a decimal integer placeholder that will be replaced by the corresponding value in decimal form. Common placeholders include:

  • %%: literal percent sign
  • %d: decimal integer
  • %o: octal integer
  • %x: hexadecimal integer (lowercase)
  • %X: hexadecimal integer (uppercase)
  • %f: floating-point number
  • %s: string

This style is less common nowadays; see more details at printf-style String Formatting.
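
The placeholders listed above can be tried out as follows. This is a minimal sketch; the '%.2f' precision form and the variable names are additions for illustration:

```python
percent = '%d%%' % 50            # literal percent sign after a number
octal = '%o' % 255               # octal
hex_lower = '%x' % 255           # hexadecimal, lowercase
hex_upper = '%X' % 255           # hexadecimal, uppercase
default_float = '%f' % 3.14159   # six decimal places by default
short_float = '%.2f' % 3.14159   # precision of two decimal places
word = '%s' % 'pork'             # string

for sample in (percent, octal, hex_lower, hex_upper,
               default_float, short_float, word):
    print(sample)

# A value that is itself a tuple must be wrapped in a one-element tuple,
# otherwise % would treat its items as separate values.
point = (3, 4)
wrapped = '%s' % (point,)
print(wrapped)                   # (3, 4)
```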

#The format() method

The format method is more flexible than %. It uses curly braces {} as placeholders and supports formatting within the braces, for example:

    print("Name: {}, Age: {}".format("Jerry", 18))  # positional replacement
    print("Name: {1}, Age: {0}".format(19, "Tom"))  # positional index
    print("Name: {name}, Age: {age}".format(name="Tuffy", age=8))  # named replacement

You can specify width:

    # Print multiplication table
    for x in range(1, 10):
        for y in range(1, 10):
            print(' {:2} '.format(x * y), end='')  # min width 2 chars
        print('')

Width can be a variable:

    print("'{:{width}}'".format('txt', width=7))

You can specify alignment:

    print("'{:<5}'".format('txt'))  # left-align with width 5
    print("'{:>5}'".format('txt'))  # right-align with width 5
    print("'{:^5}'".format('txt'))  # center with width 5

Placeholders can index into a dict. Note that the keys are written without quotes inside the braces:

    score_list: dict[str, int] = {'Tom': 88, 'Jerry': 99, 'Spike': 66}
    print("Scores: Tom:{0[Tom]} Jerry:{0[Jerry]} Spike:{0[Spike]}".format(score_list))

To keep n decimal places, you can use the built-in format() function:

    print("Approximate value of pi is {}".format(format(3.1415926, '.2f')))  # two decimal places

See Format String Syntax.
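
The precision can also be written directly inside the braces, which avoids the nested call to format(). A minimal sketch, with illustrative variable names:

```python
pi = 3.1415926

# Precision inside the placeholder: two decimal places.
two_places = "{:.2f}".format(pi)
print(two_places)            # 3.14

# Width and precision combined: at least 8 characters, 3 decimal places.
padded = "{:8.3f}".format(pi)
print("'" + padded + "'")    # '   3.142'

# Same result as calling the built-in format() separately.
print(two_places == format(pi, '.2f'))   # True
```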

#Formatted string literals (f-strings)

Formatted string literals are written as f'xxxx' or f"xxxx". Expressions inside {} are evaluated at runtime and their results are inserted into the string:

    score_list: dict[str, int] = {'Tom': 88, 'Jerry': 99, 'Spike': 66}
    print(f"Scores: Tom:{score_list['Tom']} Jerry:{score_list['Jerry']} Spike:{score_list['Spike']}")

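f-strings accept the same width, alignment, and precision specifiers as format(), written after a colon inside the braces. A short sketch with illustrative names and values:

```python
name = 'Tom'
score = 88
total = 120

# Any expression can appear inside the braces.
next_score = f"{score + 1}"
print(next_score)            # 89

# Format specifiers go after a colon, as with str.format().
row = f"{name:<6}|{score:>4}|{score / total:.2f}"
print(row)                   # Tom   |  88|0.73
```

Before Python 3.12, quotes used inside the braces must differ from the quotes that delimit the f-string itself, which is why the example above uses ['Tom'] inside a double-quoted string.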

#Raw strings

Raw string literals are written as r'xxxx' or r"xxxx". Escape sequences in them are not processed, so \n is treated as two characters, \ and n:

    print(r'hello \n world')

Raw strings are useful for regular expressions or other scenarios requiring many backslashes.
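
A Windows-style file path is one such scenario. The sketch below compares a normal string with doubled backslashes to the equivalent raw string; the path itself is made up for illustration:

```python
# Without the r prefix, every backslash must be doubled,
# otherwise \n and \t would become a newline and a tab.
normal = 'C:\\new\\table.txt'

# With the r prefix, backslashes are kept as written.
raw = r'C:\new\table.txt'

print(raw)
print(normal == raw)            # True

# '\n' is one character (a newline); r'\n' is two characters.
print(len('\n'), len(r'\n'))    # 1 2
```

One limitation to keep in mind: a raw string literal cannot end with an odd number of backslashes, so r'C:\new\' is a syntax error.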

Regular expressions will be covered later.

#Multiline strings

Multiline strings use triple quotes (''' or """), for example:

    print('''
    ## Multiline strings

    Multiline strings use triple quotes (`\'\'\'` or `"""`).
    ''')

Multiline strings are also sometimes used as multiline comments:

    '''
    This string is not assigned to a variable or used,
    so it has no effect and acts as a comment.
    '''
    print("hello world")

Multiline strings can also be combined with prefixes b, f, or r for bytes, formatted strings, or raw strings.
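
A brief sketch of each combination, with illustrative variable names:

```python
name = 'world'

# f prefix: expressions are evaluated inside a multiline string.
formatted = f'''hello
{name}'''
print(formatted)

# r prefix: \n stays as two characters, but the real line break is kept.
raw = r'''a\nb
c'''
print(raw)
print(len(raw))          # 6

# b prefix: a multiline bytes literal.
data = b'''line 1
line 2'''
print(data)
```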

Created on 5/15/2025

Updated on 5/21/2025