String Methods

The str data type is a built-in class whose instances include more than 30 methods for parsing, transforming, splitting and joining their content. Here is a quick reference to the most commonly used string methods.

Parsing

The count() method returns the number of non-overlapping times the specified substring appears within the string.

>>> s = "Hello world"
>>> s.count("Hello")
1
>>> s.count("o")
2
>>> s.count("x")
0
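As a small sketch of two details worth knowing: occurrences are counted without overlapping, and count() accepts optional start and end indexes that restrict the search. The sample strings and variable names below are only illustrative.

```python
text = "banana"

# Occurrences are counted without overlapping: "aaaa" holds two "aa", not three.
non_overlapping = "aaaa".count("aa")

# Optional start and end indexes restrict the search to text[start:end].
from_index_2 = text.count("a", 2)      # searches "nana"
first_three = text.count("an", 0, 3)   # searches "ban"

print(non_overlapping, from_index_2, first_three)
```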

The find() and index() methods return the index (starting at 0) of the first occurrence of the specified substring.

>>> s.find("world")
6
>>> s.index("world")
6

They differ in that the latter raises ValueError when the argument is not found, while the former returns -1.

>>> s.find("mundo")
-1
>>> s.index("mundo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found

In both methods the search occurs from left to right. To search for a substring starting from the end of the string, use rfind() and rindex() in the same way.

>>> s = "C:/path/to/something"
>>> s.find("/")  # Returns the first occurrence.
2
>>> s.rfind("/")  # Returns the last one.
10
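One possible use of rfind() is extracting the last component of a path. The helper below, last_component(), is a hypothetical name chosen for this sketch; it checks for the -1 result explicitly so that a string without any slash is returned unchanged.

```python
def last_component(path):
    """Return the text after the last "/" (hypothetical helper for illustration)."""
    pos = path.rfind("/")
    if pos == -1:
        # No separator found: the whole string is the last component.
        return path
    return path[pos + 1:]


print(last_component("C:/path/to/something"))
print(last_component("file.txt"))
```

For real file paths, the standard library's os.path and pathlib modules are usually a better fit; the sketch only illustrates how the returned index can be used.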

startswith() and endswith() tell whether the string begins or ends with the substring passed as argument.

>>> s = "Hello world"
>>> s.startswith("Hello")
True
>>> s.endswith("world")
True
>>> s.endswith("mundo")
False

These methods are preferred over slicing, which is easy to get wrong because the slice length must match the length of the prefix:

# startswith() is preferred.
>>> s[:5] == "Hello"
True
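Both methods also accept a tuple of candidates, which avoids chaining several comparisons with or. A minimal sketch, with made-up file names:

```python
filenames = ["report.pdf", "photo.jpg", "notes.txt", "scan.jpeg"]

# endswith() returns True if the string ends with any element of the tuple.
images = [name for name in filenames if name.endswith((".jpg", ".jpeg"))]

# startswith() works the same way.
greeting = "Hello world".startswith(("Hi", "Hello"))

print(images, greeting)
```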

The isdigit(), isnumeric(), and isdecimal() methods determine whether all characters in the string are digits, numeric characters, or decimal characters, respectively (decimal means base ten, as opposed to, for example, base 16 or hexadecimal).

>>> "1234".isnumeric()
True
>>> "1234".isdecimal()
True
>>> "abc123".isdigit()
False

isdigit() accepts any character that Unicode classifies as a digit, which includes decimal digits from other writing systems as well as characters such as superscript digits. isnumeric() is broader, as it also includes characters with a numeric value that are not digits (for example, a fraction). Thus:

>>> "⅕".isnumeric()
True
>>> "⅕".isdigit()
False

The isdecimal() method is the most restrictive, as it only accepts characters that can be used to form numbers in base ten: the digits 0 to 9 and their equivalents in other scripts.
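To compare the three methods side by side, the following sketch applies them to four sample characters. The characters are written as Unicode escapes to avoid encoding problems; the choice of samples is only illustrative.

```python
samples = {
    "5": "ASCII digit",
    "\u0663": "Arabic-Indic digit three",
    "\u00b2": "superscript two",
    "\u2155": "fraction one fifth",
}

# Map each character to (isdecimal, isdigit, isnumeric).
results = {
    ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    for ch in samples
}

for ch, description in samples.items():
    print(description.ljust(26), results[ch])
```

Each method accepts everything the previous one accepts: a decimal character is also a digit, and a digit is also numeric.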

The following six parsing methods are pretty self-explanatory:

# Determines if all characters are alphanumeric.
>>> "abc123".isalnum()
True
# Determines if all characters are alphabetic.
>>> "abcdef".isalpha()
True
>>> "abc123".isalpha()
False
# Determines if all letters are lowercase.
>>> "abcdef".islower()
True
# Determines if all letters are uppercase.
>>> "ABCDEF".isupper()
True
# Determines if all characters are printable (a tab is not).
>>> "Hello \t world!".isprintable()
False
# Determines if the string contains only whitespace.
>>> "Hello world".isspace()
False
>>> "    ".isspace()
True

Transforming

Remember that strings in Python are immutable. Therefore, none of the methods below modify the original object; each returns a new one.

capitalize() returns the string with its first letter capitalized and the rest in lowercase.

>>> "hello world".capitalize()
'Hello world'

encode() encodes the string with the specified character encoding and returns a bytes instance.

>>> "Hello world".encode("utf-8")
b'Hello world'

The original string can be recovered by using the bytes.decode() method.

>>> b'Hello world'.decode("utf-8")
'Hello world'
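With plain ASCII text the encoded bytes look just like the string, so the difference is easier to see with a non-ASCII character. The sketch below uses an arbitrary sample word and also shows the errors argument, which controls what happens when a character cannot be represented in the target encoding.

```python
text = "caf\u00e9"  # "café"; the last character is outside ASCII

# In UTF-8 the accented letter takes two bytes, so lengths differ.
data = text.encode("utf-8")
print(len(text), len(data), data)

# Decoding with the same encoding restores the original string.
round_trip = data.decode("utf-8")

# ASCII cannot represent the accented letter; errors="replace" substitutes "?".
ascii_data = text.encode("ascii", errors="replace")
print(ascii_data)
```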

The center(), ljust(), and rjust() methods align a string to the center, left, or right, respectively. They take an argument that specifies the width of the returned string. For example, if the original string has 5 characters and the passed width is 11, the remaining 6 characters are filled with spaces.

>>> "Hello".center(11)
'   Hello   '
>>> "Hello".ljust(11)
'Hello      '
>>> "Hello".rjust(11)
'      Hello'

These methods are especially useful when printing in table format in order to align each value. An optional second argument indicates the character to be used when filling the remaining space (" " by default).

>>> "Hello".center(11, "*")
'***Hello***'
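As a sketch of the table use case, the snippet below left-aligns a name column and right-aligns a quantity column. The rows and the column widths (8 and 4) are arbitrary values picked for the example.

```python
rows = [("apple", 3), ("kiwi", 12), ("banana", 7)]

lines = []
for name, quantity in rows:
    # Names are padded on the right, quantities on the left.
    lines.append(name.ljust(8) + str(quantity).rjust(4))

print("\n".join(lines))
```

Every line ends up 12 characters wide, so the numbers line up on their last digit. If a value is longer than the requested width, these methods return it unchanged rather than truncating it.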

lower() and upper() return a copy of the string with all its letters in lowercase or uppercase, respectively.

>>> "HeLlO wOrLd!".lower()
'hello world!'
>>> "HeLlO wOrLd!".upper()
'HELLO WORLD!'

swapcase(), in turn, changes uppercase to lowercase and vice versa:

>>> "HeLlO wOrLd!".swapcase()
'hElLo WoRlD!'

The strip(), lstrip() and rstrip() methods remove the whitespace that precedes and/or follows the string, if any.

>>> s = "  Hello world!  "
>>> s.strip()
'Hello world!'
>>> s.rstrip()  # Only right spaces.
'  Hello world!'
>>> s.lstrip()  # Left spaces.
'Hello world!  '

New lines (\n), tabs (\t), and carriage returns (\r) are also removed by these stripping methods.
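The three methods also accept an optional argument listing the characters to strip instead of whitespace. It is treated as a set of characters, not as a prefix or suffix, which can give surprising results. The sketch below contrasts this with str.removeprefix(), assuming Python 3.9 or later is available.

```python
# The argument is a set of characters to remove from the ends.
title = "--==title==--".strip("-=")

# Pitfall: every leading "t", "e", "s" or "_" is removed, not the prefix "test_".
stripped_chars = "test_string".lstrip("test_")

# removeprefix() (Python 3.9+) removes the exact prefix only.
stripped_prefix = "test_string".removeprefix("test_")

print(title, stripped_chars, stripped_prefix)
```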

Finally, the widely used replace() method replaces every occurrence of one substring with another.

>>> s = "Hello world!"
>>> s.replace("world", "mundo")
'Hello mundo!'
>>> s.replace("o", "_")  # Each occurrence is replaced.
'Hell_ w_rld!'
>>> s  # Remember! The original string remains the same.
'Hello world!'
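replace() also takes an optional third argument that limits how many occurrences are replaced, counting from the left. A short sketch with an arbitrary sample string:

```python
s = "one, two, three"

# Replace only the first occurrence.
first_only = s.replace(", ", " | ", 1)

# Without the third argument, every occurrence is replaced.
all_of_them = s.replace(", ", " | ")

print(first_only)
print(all_of_them)
```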

Splitting and Joining

The most used method for splitting a string is split(). When called without arguments, it splits on any run of whitespace: spaces, new lines (\n), tabs (\t), and carriage returns (\r).

>>> "Hello world!\nFrom\tPython!".split()
['Hello', 'world!', 'From', 'Python!']

The separator can be specified as an argument.

>>> "Hello world!\nFrom\tPython!".split(" ")
['Hello', 'world!\nFrom\tPython!']

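There is also splitlines(), which splits at line boundaries. It resembles split("\n"), but it also recognizes other line endings such as \r\n and does not produce an empty string after a trailing line break: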

>>> "Hello world!\nFrom\tPython!".splitlines()
['Hello world!', 'From\tPython!']
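The difference between splitlines() and split("\n") shows up with Windows-style line endings or a trailing line break. A minimal comparison on a made-up sample text:

```python
text = "first\r\nsecond\nthird\n"

# splitlines() understands both "\r\n" and "\n" and ignores the final break.
with_splitlines = text.splitlines()

# split("\n") leaves the "\r" behind and yields a trailing empty string.
with_split = text.split("\n")

# keepends=True keeps the line endings in the result.
with_endings = text.splitlines(keepends=True)

print(with_splitlines)
print(with_split)
print(with_endings)
```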

A second argument to split() indicates the maximum number of splits that can take place (the default -1 means no limit).

# Only the first two spaces are split on.
>>> "Hello world from Python!".split(" ", 2)
['Hello', 'world', 'from Python!']

The less used splitting method partition() returns a tuple of three elements: the character block before the first occurrence of the separator, the separator itself, and the block after it.

>>> "Hello world from Python!".partition(" ")
('Hello', ' ', 'world from Python!')

rpartition() works in a similar way, but searching from right to left.

>>> "Hello world from Python!".rpartition(" ")
('Hello world from', ' ', 'Python!')
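partition() is handy for parsing "key=value" lines, because it always returns three elements and the middle one is empty when the separator is missing. The function parse_setting() below is a hypothetical helper written for this sketch.

```python
def parse_setting(line):
    """Split 'key=value' at the first '='; return None if there is no '='."""
    key, sep, value = line.partition("=")
    if not sep:
        # Separator not found: partition() returned (line, "", "").
        return None
    return key.strip(), value.strip()


print(parse_setting("name = Ada=Lovelace"))
print(parse_setting("no separator here"))
```

Only the first "=" is used as the boundary, so any later "=" remains part of the value.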

Finally, the extremely useful join() method must be called on the string that acts as a separator; it joins the elements of a list or any other iterable of strings.

>>> " ".join(["Hello", "world"])
'Hello world'
>>> ", ".join(["C", "C++", "Python", "Java"])
'C, C++, Python, Java'

As can be seen, split() and join() are inverse operations when the same separator is passed to both:

>>> sep = " "
>>> sep.join("Hello world!".split(sep))
'Hello world!'
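Two practical notes, sketched below with sample values: join() only accepts strings, so other types must be converted first, and combining split() without arguments with join() normalizes whitespace rather than reproducing the original string.

```python
numbers = [1, 2, 3]

# join() raises TypeError if any element is not a string.
try:
    ", ".join(numbers)
    raised = False
except TypeError:
    raised = True

# Convert each element with str() before joining.
joined = ", ".join(map(str, numbers))

# split() with no arguments collapses runs of whitespace, so this
# round trip cleans up the spacing instead of restoring it.
normalized = " ".join("Hello   world!\n".split())

print(raised, joined, normalized)
```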