Python String encode() Method

The str.encode() method encodes the string using an encoding scheme specified by the user and returns the encoded version of a string as a bytes object.

Note: By default, the string is encoded using UTF-8.

Example:

string = "Hello World!"

# Encode the string using utf-8
encoded_string = string.encode()

print(encoded_string)
# Output: b'Hello World!'

Output

b'Hello World!'

Syntax

The syntax of the str.encode() method is:

str.encode(encoding='utf-8', errors='strict')

Here, str is a string.

Parameters

The str.encode() method takes two optional parameters.

  • encoding (optional) - the encoding to use when converting the string. For the complete list of encoding schemes, see Standard Encodings in the Python documentation. The default is UTF-8.
  • errors (optional) - the error-handling scheme to apply when a character cannot be encoded. The default is strict, meaning that a failed encoding raises a UnicodeError (in practice, its subclass UnicodeEncodeError). The supported error handlers are:
    1. strict - Raises a UnicodeEncodeError on encoding failure. This is the default.
    2. ignore - Drops the unencodable characters from the result.
    3. replace - Replaces each unencodable character with a question mark (?).
    4. xmlcharrefreplace - Replaces each unencodable character with an XML character reference such as &#246;.
    5. backslashreplace - Replaces each unencodable character with a backslash escape sequence such as \xf6.
    6. namereplace - Replaces each unencodable character with a \N{...} escape sequence containing the character's Unicode name.

Return Value

The str.encode() method returns the encoded version of the string as a bytes object. If no encoding is passed, UTF-8 is used.
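As a quick check of the return type, the minimal sketch below encodes a string with the default arguments and decodes it back. The variable names are only illustrative.

```python
text = "Hello World!"

# With no arguments, encode() uses UTF-8 and returns a bytes object
encoded = text.encode()
print(type(encoded))   # <class 'bytes'>

# Decoding with the same encoding recovers the original string
decoded = encoded.decode("utf-8")
print(decoded == text)  # True
```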

Example 1: Encoding a string in various formats in Python

You can use any of the encodings supported by Python to encode a string. If you do not pass the encoding argument, it defaults to utf-8.

Some of the standard encodings used are:

  • utf-8: A variable-width character encoding that can handle any character in the Unicode standard.
  • ascii: A character encoding that represents each character as a single byte. It can only handle characters in the ASCII character set.
  • ISO-8859-1: A character encoding that represents each character as a single byte. It can handle any character in the ISO-8859-1 character set.

string = "Python Programming"

# Encode the string using the ISO-8859-1 encoding
encoded_string = string.encode("ISO-8859-1")

print(encoded_string)
# Output: b'Python Programming'

Output

b'Python Programming'
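For a pure ASCII string such as the one above, these encodings all produce the same bytes. The difference shows up with non-ASCII characters. The short sketch below (the sample character is chosen only for illustration) compares the bytes produced for "ö":

```python
sample = "ö"

# UTF-8 is variable-width: this character needs two bytes
utf8_bytes = sample.encode("utf-8")
print(utf8_bytes, len(utf8_bytes))      # b'\xc3\xb6' 2

# ISO-8859-1 is single-byte: the same character needs one byte
latin1_bytes = sample.encode("ISO-8859-1")
print(latin1_bytes, len(latin1_bytes))  # b'\xf6' 1
```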

Example 2: Handling UnicodeEncodeError in case of strict encoding

In this example, we have a string that contains the Unicode character "ö", which is not part of the ASCII character set. When we try to encode this string using the ascii encoding with strict error handling, we get a UnicodeEncodeError because ASCII cannot represent the character "ö". (UTF-8, by contrast, can encode it without any error.)

To handle this error, we use a try-except block to catch the UnicodeEncodeError and handle it in the except block. In this case, we just print the error message.

# Unicode string (named text so that the built-in str type is not shadowed)
text = "Pythön Programming"

try:
    # Encode the string using ASCII, raising an error if any unencodable characters are encountered
    encoded_string = text.encode("ascii", errors="strict")
    print(encoded_string)
except UnicodeEncodeError as e:
    # Handle the error
    print(e)

Output

'ascii' codec can't encode character '\xf6' in position 4: ordinal not in range(128)
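Besides printing the message, the exception object can be inspected directly. The sketch below reads the standard attributes of UnicodeEncodeError (encoding, object, start, end) to locate the offending character; the details variable is just an illustrative name.

```python
text = "Pythön Programming"
details = None

try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    # err.object is the string being encoded; start/end delimit the bad slice
    bad_chars = err.object[err.start:err.end]
    details = (err.encoding, err.start, err.end, bad_chars)

print(details)  # ('ascii', 4, 5, 'ö')
```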

Example 3: Handling unencodable characters while using encode() method

In this example, we start with a string that contains the character "ö", which is not part of the ASCII character set. When we try to encode this string using the ascii encoding with the default strict error handling, we get a UnicodeEncodeError because ASCII cannot represent the character "ö".

To handle this error, we pass the errors parameter to the encode() method with a different set of values, as shown below.

# Unicode string (named text so that the built-in str type is not shadowed)
text = "Pythön Programming"

# print string
print("The original string is:", text)

# Encode the string using ascii, ignoring any unencodable characters
print(text.encode("ascii", "ignore"))

# Encode the string using ascii, replacing any unencodable characters with the '?' character
print(text.encode("ascii", "replace"))

# Encode the string using ascii, replacing any unencodable characters with XML character references
print(text.encode("ascii", "xmlcharrefreplace"))

# Encode the string using ascii, replacing any unencodable characters with backslash escape sequences
print(text.encode("ascii", "backslashreplace"))

# Encode the string using ascii, replacing any unencodable characters with \N{...} name escapes
print(text.encode("ascii", "namereplace"))

Output

The original string is: Pythön Programming
b'Pythn Programming'
b'Pyth?n Programming'
b'Pyth&#246;n Programming'
b'Pyth\\xf6n Programming'
b'Pyth\\N{LATIN SMALL LETTER O WITH DIAERESIS}n Programming'
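Note that some of these handlers discard information while others keep it in escaped form. As a rough illustration, the sketch below uses html.unescape from the standard library to recover the original text from the xmlcharrefreplace output, whereas the ignore output cannot be restored. This is only one possible way to reverse the escaping, not something the encode() method does for you.

```python
import html

text = "Pythön Programming"

# xmlcharrefreplace keeps the character as a numeric reference
encoded = text.encode("ascii", "xmlcharrefreplace")
print(encoded)  # b'Pyth&#246;n Programming'

# Decoding the bytes and unescaping the reference restores the original
restored = html.unescape(encoded.decode("ascii"))
print(restored == text)  # True

# ignore simply drops the character, so the result is lossy
lossy = text.encode("ascii", "ignore").decode("ascii")
print(lossy)  # Pythn Programming
```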

Reference: Python Official Docs

© 2023 W3Basic. All rights reserved.
