Python String encode() Method
The str.encode() method encodes the string using an encoding scheme specified by the user and returns the encoded version of the string as a bytes object.
Note: By default, it encodes the string using the utf-8 encoding.
Example:
string = "Hello World!"
# Encode the string using utf-8
encoded_string = string.encode()
print(encoded_string)
# Output: b'Hello World!'
Output
b'Hello World!'
Syntax
The syntax of the str.encode() method is:
str.encode(encoding='utf-8', errors='strict')
Here, str is a string.
Parameters
The str.encode() method can take two parameters:
- encoding (optional) - the encoding to use when converting the string. For the complete list of encoding schemes, see the Python Standard Encodings. The default is UTF-8.
- errors (optional) - how encoding errors are handled. The default is strict, meaning that if encoding fails, a UnicodeEncodeError (a subclass of UnicodeError) is raised. The supported error handlers are:
  - strict - The default. Raises a UnicodeEncodeError in case of encoding failure.
  - ignore - Ignores the unencodable Unicode characters.
  - replace - Replaces each unencodable Unicode character with a question mark (?).
  - xmlcharrefreplace - Replaces each unencodable Unicode character with an XML character reference, such as &#246;.
  - backslashreplace - Replaces each unencodable Unicode character with a backslash escape sequence, such as \xf6.
  - namereplace - Replaces each unencodable Unicode character with a \N{...} escape sequence containing the character's Unicode name.
Return Value
The str.encode() method returns the encoded version of the string as a bytes object. If no encoding is passed, UTF-8 is used.
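To make the return type concrete, here is a minimal sketch (the variable names are illustrative and not part of the original example). It checks that the result is a bytes object and that decoding with the same encoding recovers the original string.

```python
text = "Hello World!"

# encode() returns a bytes object, not a str
encoded = text.encode()  # no argument, so UTF-8 is used
print(type(encoded))

# Decoding with the same encoding gives back the original string
decoded = encoded.decode("utf-8")
print(decoded == text)
```

Decoding with a different encoding than the one used for encoding may raise an error or produce different text, so it is usually safest to keep the two in sync.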
Example 1: Encoding a string in various formats in Python
You can use any of the encodings supported by Python to encode a string. If you do not pass the encoding argument, it defaults to utf-8.
Some of the standard encodings used are:
- utf-8: A variable-width character encoding that can handle any character in the Unicode standard.
- ascii: A character encoding that represents each character as a single byte. It can only handle characters in the ASCII character set.
- ISO-8859-1: A character encoding that represents each character as a single byte. It can handle any character in the ISO-8859-1 character set.
string = "Python Programming"
# Encode the string using the ISO-8859-1 encoding
encoded_string = string.encode("ISO-8859-1")
print(encoded_string)
# Output: b'Python Programming'
Output
b'Python Programming'
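For a pure ASCII string such as the one above, utf-8, ascii, and ISO-8859-1 all produce the same bytes. The difference only shows up with non-ASCII characters. The following sketch, which uses an illustrative string chosen for this comparison, encodes "ö" with two encodings and compares the results.

```python
text = "Pythön"

# UTF-8 is variable-width: "ö" takes two bytes
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)    # b'Pyth\xc3\xb6n'

# ISO-8859-1 is single-byte: "ö" takes one byte
latin1_bytes = text.encode("ISO-8859-1")
print(latin1_bytes)  # b'Pyth\xf6n'

# Number of characters vs. number of bytes in each encoding
print(len(text), len(utf8_bytes), len(latin1_bytes))  # 6 7 6
```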
Example 2: Handling UnicodeEncodeError in case of strict encoding
In this example, we have a string that contains the Unicode character "ö", which is not part of the ASCII character set. When we try to encode this string using the ascii encoding with strict error handling, we get a UnicodeEncodeError because ascii cannot encode the character "ö".
To handle this error, we use a try-except block to catch the UnicodeEncodeError and handle it in the except block. In this case, we just print the error message.
# Unicode string (named text to avoid shadowing the built-in str type)
text = "Pythön Programming"
try:
    # Encode the string using ASCII, raising an error if any unencodable characters are encountered
    encoded_string = text.encode("ascii", errors="strict")
    print(encoded_string)
except UnicodeEncodeError as e:
    # Handle the error
    print(e)
Output
'ascii' codec can't encode character '\xf6' in position 4: ordinal not in range(128)
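Printing the message is only one way to react to the error. As a hedged sketch of an alternative, the helper below (encode_with_fallback is a hypothetical name, not a built-in) tries a preferred encoding first and falls back to a second one if a UnicodeEncodeError is raised. It assumes the fallback encoding, utf-8 by default, can represent the text.

```python
def encode_with_fallback(text, preferred="ascii", fallback="utf-8"):
    """Hypothetical helper: try `preferred`, use `fallback` if encoding fails."""
    try:
        return text.encode(preferred), preferred
    except UnicodeEncodeError:
        # The preferred codec cannot represent some character in the text
        return text.encode(fallback), fallback


print(encode_with_fallback("Python"))  # (b'Python', 'ascii')
print(encode_with_fallback("Pythön"))  # (b'Pyth\xc3\xb6n', 'utf-8')
```

Returning the encoding name alongside the bytes lets the caller know which encoding to use when decoding later.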
Example 3: Handling unencodable characters while using encode() method
In this example, we start with a string that contains the character "ö", which is not part of the ASCII character set. With the default strict error handling, encoding this string using the ascii encoding raises a UnicodeEncodeError because ascii cannot encode the character "ö".
To avoid this error, we pass different values for the errors parameter to the encode() method, as shown below.
# Unicode string (named text to avoid shadowing the built-in str type)
text = "Pythön Programming"
# print string
print("The original string is:", text)
# Encode the string using ascii, ignoring any unencodable characters
print(text.encode("ascii", "ignore"))
# Encode the string using ascii, replacing any unencodable characters with the '?' character
print(text.encode("ascii", "replace"))
# Encode the string using ascii, replacing any unencodable characters with XML character references
print(text.encode("ascii", "xmlcharrefreplace"))
# Encode the string using ascii, replacing any unencodable characters with backslash escape sequences
print(text.encode("ascii", "backslashreplace"))
# Encode the string using ascii, replacing any unencodable characters with \N{...} character names
print(text.encode("ascii", "namereplace"))
Output
The original string is: Pythön Programming
b'Pythn Programming'
b'Pyth?n Programming'
b'Pyth&#246;n Programming'
b'Pyth\\xf6n Programming'
b'Pyth\\N{LATIN SMALL LETTER O WITH DIAERESIS}n Programming'
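The error handlers differ in whether the original text can be recovered afterwards. The sketch below illustrates this under one assumption: the input contains no literal backslashes, since those would also be interpreted as escapes by the unicode_escape codec used for the round trip.

```python
text = "Pythön Programming"

# "ignore" and "replace" discard information, so decoding cannot restore "ö"
ignored = text.encode("ascii", "ignore").decode("ascii")
replaced = text.encode("ascii", "replace").decode("ascii")
print(ignored)   # Pythn Programming
print(replaced)  # Pyth?n Programming

# "backslashreplace" keeps the code point as an escape sequence,
# which the "unicode_escape" codec can turn back into the character
escaped = text.encode("ascii", "backslashreplace")
restored = escaped.decode("unicode_escape")
print(restored == text)  # True
```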
Reference: Python Official Docs