The str data type is a built-in class whose instances provide more than 30 methods for parsing, transforming, splitting, and joining their content. Here is a quick reference to the most used string methods.
The count() method returns the number of non-overlapping times the specified substring appears within the string.
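As a small illustration of count(), the sketch below uses a made-up sample string. It also uses the optional start argument, which restricts the search to part of the string.

```python
# Hypothetical sample text, used only for illustration.
text = "banana"

# Number of times the substring "an" appears.
an_count = text.count("an")

# Matches do not overlap, so "ana" is counted once, not twice.
ana_count = text.count("ana")

# Optional start index: only text[3:] ("ana") is searched.
late_a_count = text.count("a", 3)

print(an_count, ana_count, late_a_count)
```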
The find() and index() methods return the position (starting at 0) where the specified argument is first found. They differ in that the latter raises ValueError when the argument is not found, while the former returns -1.
>>> s = "Hello world"
>>> s.find("mundo")
-1
>>> s.index("mundo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
In both methods the search proceeds from left to right. To search for a substring starting from the end of the string, use rfind() and rindex() in the same way.
>>> s = "C:/path/to/something"
>>> s.find("/")   # Returns the first occurrence.
2
>>> s.rfind("/")  # Returns the last one.
10
The startswith() and endswith() methods tell whether the string begins or ends with the substring passed as argument.
>>> s = "Hello world"
>>> s.startswith("Hello")
True
>>> s.endswith("world")
True
>>> s.endswith("mundo")
False
These methods are preferred over slicing, because they do not require hard-coding the length of the prefix or suffix.
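The comparison with slicing might look like the following sketch. The file name is a hypothetical value chosen for the example, and the tuple form of endswith() is an extra not covered in the text above.

```python
# Hypothetical file name, used only for illustration.
filename = "report_2024.csv"

# With slicing, the slice bounds must match the length of the text by hand.
is_report_slice = filename[:7] == "report_"
is_csv_slice = filename[-4:] == ".csv"

# startswith() and endswith() express the same checks without magic numbers.
is_report = filename.startswith("report_")
is_csv = filename.endswith(".csv")

# Both methods also accept a tuple of candidates.
is_data_file = filename.endswith((".csv", ".tsv"))

print(is_report, is_csv, is_data_file)
```

If the prefix changes, only the string literal needs updating in the method version, whereas the slicing version also needs a new index.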
The isdigit(), isnumeric(), and isdecimal() methods determine whether all characters in the string are digits, numeric characters, or decimal characters (= base-ten digits, usually opposed to base 16 or hexadecimal), respectively.
isdigit() accepts any character classified as a digit, including digits from non-Latin scripts and special forms such as superscripts.
isnumeric() is broader, as it also includes characters with a numeric value that are not necessarily digits (for example, a fraction such as ½).
The isdecimal() method is the most restrictive, as it only accepts characters that can form base-ten numbers, i.e., the digits 0 to 9 and their equivalents in other scripts.
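The difference between the three checks is easiest to see side by side. The sample characters below are chosen for this sketch and are not taken from the original article.

```python
# Sample strings chosen for illustration.
samples = {
    "plain digits": "123",
    "superscript two": "\u00b2",    # the character ²
    "fraction one half": "\u00bd",  # the character ½
    "decimal point": "12.5",        # "." is not a digit of any kind
}

# For each sample, store (isdecimal, isdigit, isnumeric).
results = {
    name: (value.isdecimal(), value.isdigit(), value.isnumeric())
    for name, value in samples.items()
}

for name, flags in results.items():
    print(name, flags)
```

Each row is at least as permissive as the previous column: a string accepted by isdecimal() is also accepted by isdigit(), and one accepted by isdigit() is also accepted by isnumeric().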
The following six parsing functions are pretty self-explanatory:
# Determines if all characters are alphanumeric.
>>> "abc123".isalnum()
True
# Determines if all characters are alphabetic.
>>> "abcdef".isalpha()
True
>>> "abc123".isalpha()
False
# Determines if all letters are lowercase.
>>> "abcdef".islower()
True
# Capital letters.
>>> "ABCDEF".isupper()
True
# Determines if the string contains all printable characters.
>>> "Hello \t world!".isprintable()
False
# Determines if the string contains only spaces.
>>> "Hello world".isspace()
False
>>> " ".isspace()
True
Remember that strings in Python are immutable. Therefore, none of the methods below modifies the original object; each one returns a new string.
The capitalize() method returns a copy of the string with its first character capitalized and the rest lowercased.
The encode() method encodes the string with the specified character encoding and returns a bytes instance. The original string can be recovered by using the bytes.decode() method.
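A minimal round trip through encode() and decode() could look like this. The sample word and the choice of UTF-8 are assumptions made for the sketch.

```python
# Sample text with one non-ASCII character (é), written as an escape
# so the example does not depend on the source file encoding.
text = "caf\u00e9"

# Encode the string into bytes using UTF-8.
data = text.encode("utf-8")

# Decode the bytes back into a string with the same encoding.
restored = data.decode("utf-8")

print(type(data).__name__, len(text), len(data))
print(restored == text)
```

The encoded form is one byte longer than the string because é takes two bytes in UTF-8. Decoding with a different encoding than the one used for encoding may raise UnicodeDecodeError or produce different text.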
The center(), ljust(), and rjust() methods align a string to the center, left, or right, respectively. They take one argument that specifies the width of the returned string. For example, if the original string has 4 characters and the passed width is 11, the remaining 7 positions are filled with spaces.
These methods are especially useful when printing in table format in order to align each value. An optional second argument indicates the character used to fill the remaining space (" " by default).
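The table use case can be sketched as follows. The rows, column widths, and fill characters are all hypothetical values picked for the example.

```python
# A 4-character string padded to width 11 gains 7 fill characters.
padded = "abcd".ljust(11)

# Hypothetical table data: (item name, quantity).
rows = [("apple", 3), ("kiwi", 12), ("banana", 7)]

# Title centered in 11 columns, filled with dashes.
title = "STOCK".center(11, "-")

# Names are left-aligned in 7 columns (filled with dots),
# quantities are right-aligned in 4 columns (filled with spaces).
lines = [name.ljust(7, ".") + str(qty).rjust(4) for name, qty in rows]

print(title)
for line in lines:
    print(line)
```

Because every line has the same total width, the quantity column stays aligned regardless of the length of each name.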
The lower() and upper() methods return a copy of the string with all its letters in lowercase or uppercase, respectively.
The swapcase() method changes uppercase letters to lowercase and vice versa.
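The case-changing methods can be compared on a single sample string (the strings here are made up for the sketch):

```python
# Sample text, used only for illustration.
title = "Hello World"

lowered = title.lower()        # all letters in lowercase
uppered = title.upper()        # all letters in uppercase
swapped = title.swapcase()     # case of every letter inverted

# capitalize() uppercases the first character and lowercases the rest.
capitalized = "hello WORLD".capitalize()

print(lowered, uppered, swapped, capitalized, sep=" | ")

# The original string is left untouched, since strings are immutable.
print(title)
```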
The strip(), lstrip(), and rstrip() methods remove the whitespace that precedes and/or follows the string, if any.
>>> s = " Hello world! "
>>> s.strip()
'Hello world!'
>>> s.rstrip()  # Only right spaces.
' Hello world!'
>>> s.lstrip()  # Left spaces.
'Hello world! '
New lines (\n), tabs (\t), and carriage returns (\r) are also removed by these stripping methods.
Finally, the widely used replace() method replaces every occurrence of one substring with another.
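A short sketch of replace() on a made-up sentence follows. The optional third argument, which limits the number of replacements, is not mentioned above and is shown here as an extra.

```python
# Sample sentence, used only for illustration.
sentence = "one fish, two fish, red fish"

# Every occurrence of "fish" is replaced.
all_replaced = sentence.replace("fish", "cat")

# An optional third argument limits how many occurrences are replaced.
first_only = sentence.replace("fish", "cat", 1)

print(all_replaced)
print(first_only)
```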
Splitting and Joining
The most used method for splitting a string according to a separator is split(). By default, the separator is any run of whitespace: spaces, new lines (\n), tabs (\t), and carriage returns (\r). A different separator can be specified as an argument.
There is also splitlines(), which splits the string at line boundaries and is roughly equivalent to split("\n").
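The following sketch contrasts the default separator, an explicit separator, and splitlines(). All sample strings are made up for the example.

```python
# Default separator: any run of whitespace counts as one separator.
words = "alpha  beta\tgamma\ndelta".split()

# Explicit separator: consecutive separators yield empty strings.
fields = "a,b,,c".split(",")

# Text with mixed line endings and a trailing newline.
text = "line one\nline two\r\nline three\n"

# splitlines() recognizes \r\n and produces no trailing empty string.
by_splitlines = text.splitlines()

# split("\n") leaves the \r in place and adds a final empty string.
by_split = text.split("\n")

print(words)
print(fields)
print(by_splitlines)
print(by_split)
```

For text that uses only \n and has no trailing newline, both approaches give the same result.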
A second argument to split() indicates the maximum number of splits that can take place (the default -1 means no limit).
# The string is split only at the first two spaces.
>>> "Hello world from Python!".split(" ", 2)
['Hello', 'world', 'from Python!']
The less used splitting method partition() returns a tuple of three elements: the block of characters before the first occurrence of the separator, the separator itself, and the block after it.
The rpartition() method works in a similar way, but searches from right to left.
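As an illustration, the sketch below applies both methods to the path used earlier in this reference. The variable names and the string without a separator are made up for the example.

```python
path = "C:/path/to/something"

# partition() splits at the first "/" found from the left.
drive, sep, rest = path.partition("/")

# rpartition() splits at the last "/", searching from the right.
parent, last_sep, name = path.rpartition("/")

# When the separator is missing, the whole string lands in the
# first element for partition() and in the last for rpartition().
missing_left = "no separator".partition("/")
missing_right = "no separator".rpartition("/")

print(drive, rest)
print(parent, name)
print(missing_left, missing_right)
```

Unlike split(), both methods always return exactly three elements, which makes tuple unpacking safe.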
Finally, there is the extremely useful join() method, which must be called on the string that acts as the separator and joins the elements of a list or any other iterable of strings.
>>> " ".join(["Hello", "world"])
'Hello world'
>>> ", ".join(["C", "C++", "Python", "Java"])
'C, C++, Python, Java'
As can be seen, split() and join() are inverse operations when they use the same separator.
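A round trip through join() and split() might look like this sketch. It assumes that no element contains the separator; otherwise the original list is not recovered.

```python
languages = ["C", "C++", "Python", "Java"]
separator = ", "

# List -> string.
joined = separator.join(languages)

# String -> list, using the same separator.
recovered = joined.split(separator)

print(joined)
print(recovered == languages)

# join() only accepts strings, so other types must be converted first.
numbers = [1, 2, 3]
joined_numbers = separator.join(str(n) for n in numbers)
print(joined_numbers)
```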