The str data type is a built-in class whose instances provide more than 30 methods for parsing, transforming, splitting, and joining their content. Here is a quick reference to the most commonly used string methods.
The count() method returns the number of non-overlapping times the specified substring appears within the string.

>>> s = "Hello world"
>>> s.count("Hello")
1
>>> s.count("o")
2
>>> s.count("x")
0
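Two details of count() are easy to miss, and the following minimal sketch illustrates them: matches never overlap, and optional start and end positions restrict the search. The variable names are only illustrative.

```python
# count() scans left to right and never reuses a character,
# so overlapping matches are not counted twice.
overlapping = "aaaa".count("aa")

# Optional start and end arguments restrict the search to s[start:end].
s = "Hello world"
after_hello = s.count("o", 5)   # only looks at s[5:]
in_hello = s.count("o", 0, 5)   # only looks at s[0:5]

print(overlapping, after_hello, in_hello)  # 2 1 1
```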
The find() and index() methods return the position (starting at 0) where the specified argument is first found.

>>> s.find("world")
6
>>> s.index("world")
6

They differ in that the latter raises ValueError when the argument is not found, while the former returns -1.

>>> s.find("mundo")
-1
>>> s.index("mundo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
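Which of the two to use is mostly a matter of how a missing substring should be handled. As a rough sketch, the hypothetical helper position_or_none() below wraps index() so that callers get None instead of an exception.

```python
def position_or_none(text, sub):
    """Return the index of sub in text, or None when it is absent.

    Hypothetical helper, written only to illustrate handling ValueError.
    """
    try:
        return text.index(sub)
    except ValueError:
        return None


s = "Hello world"
found = position_or_none(s, "world")
missing = position_or_none(s, "mundo")
print(found, missing)  # 6 None

# When only presence matters, the "in" operator is usually clearer.
print("world" in s)  # True
```

Returning None avoids a subtle trap of find(): its -1 result is a valid index (the last character), so using it unchecked in a slice or subscript fails silently.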
In both methods the search proceeds from left to right. To search for a substring starting from the end of the string, use rfind() and rindex() in the same way.

>>> s = "C:/path/to/something"
>>> s.find("/")  # Returns the first occurrence.
2
>>> s.rfind("/")  # Returns the last one.
10
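A typical use of rfind() is cutting a string at its last separator. The sketch below assumes a hypothetical helper named split_last(); for real file system paths, the standard os.path and pathlib modules are the safer choice.

```python
def split_last(path, sep="/"):
    """Split path at the last occurrence of sep.

    Hypothetical helper for illustration; returns ("", path)
    when sep does not occur at all.
    """
    cut = path.rfind(sep)
    if cut == -1:
        return "", path
    return path[:cut], path[cut + len(sep):]


print(split_last("C:/path/to/something"))  # ('C:/path/to', 'something')
print(split_last("something"))             # ('', 'something')
```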
startswith() and endswith() tell whether the string begins or ends with the substring passed as argument.

>>> s = "Hello world"
>>> s.startswith("Hello")
True
>>> s.endswith("world")
True
>>> s.endswith("mundo")
False

These methods are preferred over slicing, which requires keeping the slice length in sync with the substring:

>>> s[:5] == "Hello"  # startswith() is preferred.
True
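Both methods also accept a tuple of candidates, which slicing cannot express in a single comparison. A small sketch, with made-up file names:

```python
# A tuple argument matches if ANY of its items matches.
filenames = ["report.csv", "photo.PNG", "notes.txt", "data.tsv"]
tabular = [name for name in filenames if name.lower().endswith((".csv", ".tsv"))]
print(tabular)  # ['report.csv', 'data.tsv']

# The slicing equivalent must compute the length by hand.
s = "Hello world"
prefix = "Hello"
by_slice = s[:len(prefix)] == prefix
by_method = s.startswith(prefix)
print(by_slice, by_method)  # True True
```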
The isdigit(), isnumeric(), and isdecimal() methods determine whether all characters in the string are digits, numeric characters, or decimal characters (those used to write base ten numbers, as opposed to base 16 or hexadecimal numbers), respectively.

>>> "1234".isnumeric()
True
>>> "1234".isdecimal()
True
>>> "abc123".isdigit()
False

isdigit() accepts any character classified as a digit, including digits from non-Latin scripts and superscript digits such as ².
isnumeric() is broader, as it also includes characters with a numeric value that are not necessarily digits (for example, a fraction). Thus:

>>> "⅕".isnumeric()
True
>>> "⅕".isdigit()
False

The isdecimal() method is the most restrictive, as it only accepts characters that can be used to write base ten numbers: the digits 0 to 9 and their equivalents in other scripts.
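The relationship between the three checks is easiest to see side by side. The sketch below runs them on four sample characters (chosen here for illustration and written as Unicode escapes) and collects the results as (isdecimal, isdigit, isnumeric) tuples.

```python
# Sample characters, keyed by the character itself.
samples = {
    "5": "ASCII digit five",
    "\u0663": "Arabic-Indic digit three",
    "\u00b2": "superscript two",
    "\u2155": "vulgar fraction one fifth",
}

# Each result is (isdecimal, isdigit, isnumeric).
results = {
    char: (char.isdecimal(), char.isdigit(), char.isnumeric())
    for char in samples
}

for char, description in samples.items():
    print(description.ljust(28), results[char])
```

Every character accepted by isdecimal() is accepted by isdigit(), and every character accepted by isdigit() is accepted by isnumeric(), so the tuples only ever move from False to True when read left to right.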
The following six parsing functions are pretty self-explanatory:
# Determines if all characters are alphanumeric.
>>> "abc123".isalnum()
True
# Determines if all characters are alphabetic.
>>> "abcdef".isalpha()
True
>>> "abc123".isalpha()
False
# Determines if all letters are lowercase.
>>> "abcdef".islower()
True
# Capital letters.
>>> "ABCDEF".isupper()
True
# Determines if all characters in the string are printable.
>>> "Hello \t world!".isprintable()
False
# Determines if the string contains only whitespace.
>>> "Hello world".isspace()
False
>>> " ".isspace()
True
Remember that strings in Python are immutable. Therefore, none of the methods below modify the original object; each returns a new one.
capitalize() returns the string with its first character capitalized and the rest in lowercase.

>>> "hello world".capitalize()
'Hello world'
encode() encodes the string with the specified character encoding and returns a bytes instance.

>>> "Hello world".encode("utf-8")
b'Hello world'

The original string can be recovered with the decode() method of the bytes object.

>>> b'Hello world'.decode("utf-8")
'Hello world'
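With plain ASCII text the encoded bytes look just like the string, which hides what encoding actually does. The following sketch uses a word with one non-ASCII character (an arbitrary sample) to show that characters and bytes are not one to one, and how the errors argument behaves when the target encoding cannot represent a character.

```python
text = "A\u00f1o"  # the Spanish word "Año"

encoded = text.encode("utf-8")
print(encoded)                  # b'A\xc3\xb1o'
print(len(text), len(encoded))  # 3 4

# ASCII cannot represent "ñ"; errors="replace" substitutes "?"
# instead of raising UnicodeEncodeError.
lossy = text.encode("ascii", errors="replace")
print(lossy)  # b'A?o'

# Decoding with the same encoding restores the original string.
round_trip = encoded.decode("utf-8")
print(round_trip == text)  # True
```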
The center(), ljust(), and rjust() methods align a string to the center, left, or right, respectively. They take an argument that specifies the width of the returned string. For example, if the original string has 5 characters and the passed width is 11, the remaining 6 characters are filled with spaces.

>>> "Hello".center(11)
'   Hello   '
>>> "Hello".ljust(11)
'Hello      '
>>> "Hello".rjust(11)
'      Hello'

These methods are especially useful when printing in table format in order to align each value. An optional second argument indicates the character to be used when filling the remaining space (" " by default).

>>> "Hello".center(11, "*")
'***Hello***'
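As a minimal sketch of the table use case, the snippet below pads a name column to the left and a quantity column to the right. The rows and column widths are made up for the example.

```python
rows = [("apple", 3), ("kiwi", 12), ("banana", 7)]

# Names are left-aligned and padded with dots to 10 characters;
# quantities are right-aligned in a 4-character column.
lines = [name.ljust(10, ".") + str(qty).rjust(4) for name, qty in rows]

for line in lines:
    print(line)
```

Because every piece is padded to a fixed width, all lines end up with the same length (14 characters here) and the columns line up in a monospaced font.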
lower() and upper() return a copy of the string with all its letters in lowercase or uppercase, respectively.

>>> "HeLlO wOrLd!".lower()
'hello world!'
>>> "HeLlO wOrLd!".upper()
'HELLO WORLD!'

swapcase() changes uppercase to lowercase and vice versa:

>>> "HeLlO wOrLd!".swapcase()
'hElLo WoRlD!'
The strip(), lstrip(), and rstrip() methods remove the whitespace that precedes and/or follows the string, if any.

>>> s = " Hello world! "
>>> s.strip()
'Hello world!'
>>> s.rstrip()  # Only right spaces.
' Hello world!'
>>> s.lstrip()  # Left spaces.
'Hello world! '

New lines (\n), tabs (\t), and carriage returns (\r) are also removed by these stripping methods.
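The stripping methods also accept an optional argument listing the characters to remove instead of whitespace. The sketch below shows that this argument is treated as a set of characters, not as a prefix or suffix, which is a common source of surprises. The sample strings are arbitrary.

```python
# Remove a specific character from both ends.
starred = "***Hello***".strip("*")
print(starred)  # Hello

# The argument is a SET of characters: every leading "w" or "." is removed.
host = "www.example.com".lstrip("w.")
print(host)  # example.com

# Pitfall: the same call also eats the "w" of an unrelated word.
surprise = "west.example.com".lstrip("w.")
print(surprise)  # est.example.com
```

To remove an exact prefix or suffix, Python 3.9 and later provide str.removeprefix() and str.removesuffix().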
Finally, the widely used replace() method replaces one string with another.

>>> s = "Hello world!"
>>> s.replace("world", "mundo")
'Hello mundo!'
>>> s.replace("o", "_")  # Each occurrence is replaced.
'Hell_ w_rld!'
>>> s  # Remember! The original string remains the same.
'Hello world!'
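replace() also takes an optional third argument that caps the number of replacements, and since it returns a new string, calls can be chained. A brief sketch:

```python
s = "Hello world!"

# The third argument limits how many occurrences are replaced.
first_only = s.replace("o", "_", 1)
print(first_only)  # Hell_ world!

# Each call returns a new string, so calls can be chained.
chained = s.replace("Hello", "Hola").replace("world", "mundo")
print(chained)  # Hola mundo!
```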
Splitting and Joining
The most commonly used method for splitting a string on a separator is split(). By default, it splits on runs of whitespace: spaces, new lines (\n), tabs (\t), and carriage returns (\r).

>>> "Hello world!\nFrom\tPython!".split()
['Hello', 'world!', 'From', 'Python!']
The separator can be specified as an argument.

>>> "Hello world!\nFrom\tPython!".split(" ")
['Hello', 'world!\nFrom\tPython!']
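Passing " " explicitly is not the same as relying on the default. As the sketch below suggests, the default treats any run of whitespace as one separator, while an explicit separator splits at every single occurrence and can produce empty strings.

```python
# Two consecutive spaces between the words.
text = "one" + " " * 2 + "two"

by_default = text.split()     # runs of whitespace count as one separator
by_space = text.split(" ")    # every single space is a separator

print(by_default)  # ['one', 'two']
print(by_space)    # ['one', '', 'two']
```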
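There is also splitlines(), which is roughly equivalent to split("\n"), except that it also recognizes other line endings such as \r\n and does not return a trailing empty string when the text ends with a line break.

>>> "Hello world!\nFrom\tPython!".splitlines()
['Hello world!', 'From\tPython!']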
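The difference between splitlines() and splitting on "\n" shows up with mixed line endings and a trailing line break. A small sketch with an arbitrary sample text:

```python
text = "first\r\nsecond\nthird\n"

lines = text.splitlines()
naive = text.split("\n")
kept = text.splitlines(keepends=True)

print(lines)  # ['first', 'second', 'third']
print(naive)  # ['first\r', 'second', 'third', '']
print(kept)   # ['first\r\n', 'second\n', 'third\n']
```

Note how split("\n") leaves the carriage return attached to the first line and adds an empty final item, while keepends=True preserves the original line endings so the pieces can be joined back exactly.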
A second argument to split() indicates the maximum number of splits that can take place (the default -1 means no limit).

# Only the first two spaces are used as split points.
>>> "Hello world from Python!".split(" ", 2)
['Hello', 'world', 'from Python!']
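The same limit can be applied from the right with rsplit(), which is handy when only the last piece is of interest. A short sketch with a made-up file name:

```python
filename = "archive.tar.gz"

# split() counts splits from the left, rsplit() from the right.
from_left = filename.split(".", 1)
from_right = filename.rsplit(".", 1)

print(from_left)   # ['archive', 'tar.gz']
print(from_right)  # ['archive.tar', 'gz']
```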
The less used splitting method partition() returns a tuple of three elements: the character block before the first occurrence of the separator, the separator itself, and the block after it.

>>> "Hello world from Python!".partition(" ")
('Hello', ' ', 'world from Python!')

rpartition() works in a similar way, but searching from right to left.

>>> "Hello world from Python!".rpartition(" ")
('Hello world from', ' ', 'Python!')
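Because partition() always returns three elements, it suits parsing "key=value" style text without a length check. The sketch below assumes a hypothetical helper named parse_setting(); when the separator is missing, partition() returns the whole string followed by two empty strings, which the helper uses to detect a bare key.

```python
def parse_setting(line):
    """Split 'key = value' at the first '='.

    Hypothetical helper for illustration; returns (key, None)
    when the line contains no '=' at all.
    """
    key, sep, value = line.partition("=")
    if not sep:  # separator not found: sep and value are both ""
        return key.strip(), None
    return key.strip(), value.strip()


print(parse_setting("name = Ada=Lovelace"))  # ('name', 'Ada=Lovelace')
print(parse_setting("verbose"))              # ('verbose', None)
```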
Finally, there is the extremely useful join() method, which is called on the string that acts as a separator and joins the elements of a list or any other iterable of strings.

>>> " ".join(["Hello", "world"])
'Hello world'
>>> ", ".join(["C", "C++", "Python", "Java"])
'C, C++, Python, Java'
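join() only accepts strings, so other values must be converted first. The following sketch shows the error raised for a list of integers and one common way around it.

```python
numbers = [1, 2, 3]

# Joining non-string items raises TypeError.
try:
    ", ".join(numbers)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message is not None)  # True

# Convert each item to str before joining.
joined = ", ".join(str(n) for n in numbers)
print(joined)  # 1, 2, 3

# A string is itself an iterable of characters.
dashed = "-".join("abc")
print(dashed)  # a-b-c
```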
As can be seen, split() and join() are inverse operations when given the same separator:

>>> sep = " "
>>> sep.join("Hello world!".split(sep))
'Hello world!'
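The round trip is only exact when split() receives the separator explicitly. As a sketch, with three spaces between the words, the default whitespace splitting discards the extra spaces and the original text is not recovered.

```python
# Three consecutive spaces between the words.
text = "Hello" + " " * 3 + "world!"

explicit = " ".join(text.split(" "))  # round trip preserves the text
collapsed = " ".join(text.split())    # runs of whitespace are collapsed

print(explicit == text)  # True
print(collapsed)         # Hello world!
```

The second form is a common idiom for normalizing whitespace, precisely because it is not an exact inverse.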