Lists and Tuples

Python Assets

2022-02-13

In this post we introduce lists and tuples for those who are new to programming or are migrating from another language, as well as for those who have used Python for a while and want to expand their knowledge and improve their code. As a first approximation, lists and tuples are what other languages call vectors or arrays. However, they differ in several ways that we'll cover throughout this tutorial.

A list is not the same as a tuple. Both are ordered collections of values. Values can be any Python object: numbers, strings, functions, classes, instances, etc., and even other lists and tuples. The difference is that lists have a series of additional methods for adding, removing, and changing the values they contain. In contrast, the content of a tuple cannot be changed once it is created. Thus we can say that lists are dynamic or mutable, while tuples are static or immutable. Let's start with lists.
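
As a quick sketch of what mutable versus immutable means in practice (the variable names here are only illustrative), the same assignment that works on a list is rejected by a tuple:

```python
# A list can be changed in place after it is created.
languages = ["C", "Python"]
languages[0] = "C++"
print(languages)  # ['C++', 'Python']

# A tuple rejects the same operation with a TypeError.
frozen = ("C", "Python")
try:
    frozen[0] = "C++"
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The tuple is left untouched after the failed assignment.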

Creating a List

There are two ways to create a list: either by using square brackets or by calling the built-in list() function. There is rarely a reason to use the latter, but it is worth knowing that it exists.

	`>>> a = []`
	`>>> b = list()`
	`>>> a == b`
	`True`

Two square brackets create an empty list. Values can be specified during creation by placing them inside the square brackets, separated by commas.

	`a = [1, 2, 3, 4]`

Each of the values is generally called an element. Note that it is not necessary to indicate the number of elements that the object has or will have. In this case, the list has four numeric elements. However, both lists and tuples can contain elements of different types, for example:

	`b = [5, "Hello world!", (1, 2), True, -1.5]`

The third element of this b list is a tuple, as we shall see now.

Creating a Tuple

As with lists, there are two ways to create a tuple:

	`>>> a = ()`
	`>>> b = tuple()`
	`>>> a == b`
	`True`

Since tuples are immutable, you must specify their elements during creation:

	`a = (5, "Hello world!", True, -1.5)`

If you want to create a tuple with a single element, you must add a comma before the closing parenthesis, since parentheses are also used to group expressions:

	`>>> b = (5,) # This is a tuple.`
	`>>> type(b)`
	`<class 'tuple'>`
	`>>> c = (5) # This is a number.`
	`>>> type(c)`
	`<class 'int'>`

Reading and Writing Elements

You can access the different elements of a list or tuple by indicating the index (starting from 0) between square brackets.

	`>>> a = ["Hello", "world", "!"]`
	`>>> a[0]`
	`'Hello'`
	`>>> a[1]`
	`'world'`
	`>>> a[2]`
	`'!'`

If the index is out of range, you will get an exception.

	`>>> a[3]`
	`Traceback (most recent call last):`
	`  File "<stdin>", line 1, in <module>`
	`IndexError: list index out of range`

This happens because a contains only three elements identified by the 0, 1, and 2 indexes.

Once a list is created, you can add as many values as you want.

	`>>> a = [1, 3, 5, 7]`
	`>>> a.append(9)`
	`>>> a`
	`[1, 3, 5, 7, 9]`

The append() method will add the specified element to the end of the list.

You can also alter existing elements by combining indexing with assignment:

	`>>> a[2] = -10`
	`>>> a[0] = "This is a list."`
	`>>> a`
	`['This is a list.', 3, -10, 7, 9]`

Notice how the values in indexes 2 and 0 have been modified.

You can insert an element at any position using the insert() method. For example:

	`>>> a = [1, 3, 4, 5, 6]`
	`>>> a.insert(1, 2)`
	`>>> a`
	`[1, 2, 3, 4, 5, 6]`

The insert() method takes the index as the first argument and the element as the second. If the index is beyond the end of the list, the element is simply appended. Other examples:

	`>>> b = ["Hello", "Assets"]`
	`>>> b.insert(1, "Python")`
	`>>> b`
	`['Hello', 'Python', 'Assets']`

	`>>> c = [9, 10]`
	`>>> c.insert(3, 11)`
	`>>> c.insert(0, 8)`
	`>>> c`
	`[8, 9, 10, 11]`

Removing Elements

The remove() method of a list removes the first element that is equal to the value passed as an argument.

	`>>> a = ["Dog", "Cat", "Horse"]`
	`>>> a.remove("Cat")`
	`>>> a`
	`['Dog', 'Horse']`

If the element is not found in the list, ValueError is raised.

	`>>> a.remove("Fish")`
	`Traceback (most recent call last):`
	`  File "<stdin>", line 1, in <module>`
	`ValueError: list.remove(x): x not in list`

To remove the element at a given position, use the del reserved keyword followed by list[index].

	`>>> a = ["Apple", "Pear", "Banana", "Orange"]`
	`>>> del a[1]`
	`>>> a`
	`['Apple', 'Banana', 'Orange']`
	`>>> del a[2]`
	`>>> a`
	`['Apple', 'Banana']`

And to remove all elements:

	`>>> del a[:]`
	`>>> a`
	`[]`

Later we will see how the : character works, and you will understand why this form is used. It should not be confused with del a, which removes the name a (the reference to the list) rather than emptying the list.
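
The difference between the two forms shows up when a second name refers to the same list. A minimal sketch, with names chosen only for illustration:

```python
a = ["Apple", "Banana"]
alias = a            # Both names refer to the same list object.

del a[:]             # Empties the list object itself...
print(alias)         # []  ...so the change is visible through every name.

b = ["Apple", "Banana"]
other = b
del b                # Removes only the name b; the list object survives.
print(other)         # ['Apple', 'Banana']
```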

Counting Elements

The len() built-in function allows you to count the number of elements that a list or tuple contains.

	`>>> a = ("Dog", "Cat", "Horse")`
	`>>> len(a)`
	`3`

The len() function doesn't actually count the elements every time it is called; it simply reads a length value stored in the list or tuple object. So feel free to use len() inside loops without caching the result.

Since the tuple has three elements, the last one will be found at len(a) - 1 (because indexes start from 0).

	`>>> a[len(a) - 1]`
	`'Horse'`

Casting

You can convert lists to tuples and vice-versa using the built-in functions we saw above: list() and tuple(). For example:

	`>>> a = [1, 2, 3]`
	`>>> a = tuple(a)`
	`>>> a`
	`(1, 2, 3)`
	`>>> a = list(a)`
	`>>> a`
	`[1, 2, 3]`

Slicing

To slice a list or a tuple means to extract a portion of its elements.

	`>>> a = ["C", "C++", "Python", "Perl", "PHP", "Pascal"]`
	`>>> b = a[2:5]`
	`>>> b`
	`['Python', 'Perl', 'PHP']`

As you can see, a list or tuple can take a starting index and an ending index in order to take a slice from the sequence. The slicing expression creates a new list or tuple and copies the elements within the specified range from the source sequence. So if you replace elements in b, nothing happens in a, and vice versa.

In this case, a[2:5] takes the values at indexes 2, 3, and 4 (note that 5 is not included). The syntax is object[start:end]. If start is not specified, it defaults to 0. If end is not specified, it defaults to the number of elements contained in the object, i.e., len(object).

	`>>> a[:4] # Same as a[0:4]`
	`['C', 'C++', 'Python', 'Perl']`
	`>>> a[4:] # Same as a[4:len(a)]`
	`['PHP', 'Pascal']`
	`>>> a[:] # Same as a[0:len(a)]`
	`['C', 'C++', 'Python', 'Perl', 'PHP', 'Pascal']`
	`>>> a[0:len(a)] # Same as a[:]`
	`['C', 'C++', 'Python', 'Perl', 'PHP', 'Pascal']`
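
To illustrate the point that a slice is a new object, here is a small sketch (the names full_copy, nested and shallow are made up for the example). Note that the copy is shallow: the slice holds references to the same element objects, so a mutable element such as an inner list is still shared.

```python
a = ["C", "C++", "Python", "Perl", "PHP", "Pascal"]
b = a[2:5]
b[0] = "Ruby"              # Replacing an element in the slice...
print(a[2])                # Python  ...leaves the source untouched.
print(b)                   # ['Ruby', 'Perl', 'PHP']

full_copy = a[:]           # A common idiom to copy a whole list.
print(full_copy == a)      # True  (same contents)
print(full_copy is a)      # False (different object)

# The copy is shallow: nested mutable objects are shared.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
shallow[0].append(99)
print(nested[0])           # [1, 2, 99]
```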

Negative values can also be used as indexes to count from the end. For example, to get a list containing only the last item:

	`>>> a[-1:]`
	`['Pascal']`

Which is equivalent to:

	`>>> a[len(a) - 1:]`
	`['Pascal']`

And to get all elements but the last one:

	`>>> a[:-1]`
	`['C', 'C++', 'Python', 'Perl', 'PHP']`

Or all elements but the last three:

	`>>> a[:-3]`
	`['C', 'C++', 'Python']`
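
Negative numbers also work for plain indexing, not only for slices. A short sketch of the difference, reusing the same list of languages:

```python
a = ["C", "C++", "Python", "Perl", "PHP", "Pascal"]

print(a[-1])                    # Pascal      (a single element)
print(a[-1:])                   # ['Pascal']  (a list with one element)
print(a[-2])                    # PHP
print(a[-1] == a[len(a) - 1])   # True
```

So a[-1] is a shorter way of writing a[len(a) - 1].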

A third number, called the step, can be added when slicing. It indicates the distance between the elements taken from the original list or tuple. For example:

	`>>> a = list(range(20))`
	`>>> a`
	`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]`
	`>>> a[5:15:2]`
	`[5, 7, 9, 11, 13]`

In this case, in a list of 20 elements (from 0 to 19), the elements at indexes 5 to 14 are selected (since the last number is exclusive), but instead of returning all of them, only every second one is returned (because the step is 2). Other examples:

	`>>> a[::5]`
	`[0, 5, 10, 15]`
	`>>> a[5::3]`
	`[5, 8, 11, 14, 17]`
	`>>> a[::]`
	`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]`
	`>>> a[::1]`
	`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]`
	`>>> a[5:16:1]`
	`[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]`

So the full slicing syntax is object[start:end:step], where step is the amount by which the index increases while taking elements from the sequence.
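
The step can also be negative, in which case the sequence is traversed backwards. A brief sketch, assuming a shorter list than the one above to keep the output readable:

```python
a = list(range(10))

print(a[::-1])     # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]  (reversed copy)
print(a[8:2:-2])   # [8, 6, 4]  (from index 8 down to, but excluding, 2)
print(a[2:8:-1])   # []  (start must be greater than end when stepping back)
```

Unlike the reverse() method, a[::-1] returns a new list and also works on tuples.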

Finally, slicing can also be used to modify values as a whole. For example:

	`>>> a = ["C", "C++", "Python", "Perl", "PHP"]`
	`>>> a[1:4] = "C++11", "Python 2.7", "Perl 5"`
	`>>> a`
	`['C', 'C++11', 'Python 2.7', 'Perl 5', 'PHP']`

This does not apply to tuples since they do not allow assignment of new values.
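
With a step of 1, the replacement does not need to have the same length as the slice it replaces, so slice assignment can also shrink or grow a list. The following sketch shows that, along with the error raised when the same is attempted on a tuple:

```python
a = ["C", "C++", "Python", "Perl", "PHP"]

a[1:3] = ["Rust"]            # Two elements replaced by one.
print(a)                     # ['C', 'Rust', 'Perl', 'PHP']

a[1:1] = ["Go", "Java"]      # An empty slice acts as an insertion point.
print(a)                     # ['C', 'Go', 'Java', 'Rust', 'Perl', 'PHP']

t = ("C", "C++", "Python")
try:
    t[0:2] = ("Rust",)
except TypeError as exc:
    tuple_error = str(exc)
    print(tuple_error)
```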

Membership and Iteration

The clearest way to determine whether a value is inside a list or a tuple is to use the in operator.

	`>>> a = (1, 2, 3, 4)`
	`>>> 3 in a`
	`True`
	`>>> 5 in a`
	`False`
	`>>> "CPython" in ["PyPy", "Jython", "IronPython"]`
	`False`

It can also be combined with the not keyword:

	`>>> 0.5 not in (1, 2, 3, 4)`
	`True`
	`>>> 7 not in (5, 6, 7, 8)`
	`False`

There are several methods to iterate or traverse the elements of a tuple or list. In other languages, a numeric variable is usually created that stores the value of the current index. For example, in C:

	`int i;`
	`int a[5] = {1, 2, 3, 4, 5};`

	`for (i = 0; i < 5; i++)`
	`    printf("%d\n", a[i]);`

This translates to the Python language as:

	`a = (1, 2, 3, 4, 5)`

	`for i in range(len(a)):`
	`    print(a[i])`

And produces the following output:

1
2
3
4
5

However, there is a nicer, cleaner method, in which the language assigns each element in turn to a variable during iteration.

	`for num in a:`
	`    print(num)`

The indented part of the code (the one that starts with 4 spaces) will be executed len(a) times, that is, 5 times. In the first iteration, num will be 1; in the second, 2; in the third, 3; and so on. Let's look at another example:

	`a = ("Hello", "Python", "Assets", 2022, True)`

	`for item in a:`
	`    print(item)`

Which prints:

Hello
Python
Assets
2022
True

Sometimes it might be necessary to keep the index of each element. The enumerate() built-in function can help us with this. To understand how it works, let's look at the following example.

	`>>> list(enumerate(a))`
	`[(0, 'Hello'), (1, 'Python'), (2, 'Assets'), (3, 2022), (4, True)]`

This creates an enumerate object which, when converted to a list, yields a list of (index, value) tuples, one for each element. In addition, you can pass the number from which the count should start (0 by default).

	`>>> list(enumerate(a, 1))`
	`[(1, 'Hello'), (2, 'Python'), (3, 'Assets'), (4, 2022), (5, True)]`

When using it in a loop, there is no need to cast it to a list.

	`>>> for index, item in enumerate(a, 1):`
	`...     print(f"Item {index}: {item}")`
	`...`
	`Item 1: Hello`
	`Item 2: Python`
	`Item 3: Assets`
	`Item 4: 2022`
	`Item 5: True`

This piece of code is equivalent to the following index-based loop, which mirrors the style used in other languages:

	`for i in range(len(a)):`
	`    print(f"Item {i + 1}: {a[i]}")`

Unpacking

To unpack a sequence in Python is to assign the values it contains to a set of names in a single statement.

	`obj1, obj2, obj3 = seq`

This expression is equivalent to:

	`obj1 = seq[0]`
	`obj2 = seq[1]`
	`obj3 = seq[2]`

This holds as long as seq contains exactly three elements. If it has more or fewer, an exception (ValueError) is raised. Some examples:

	`>>> data = ("Jorge", "Gonzalez", 30)`
	`>>> first_name, last_name, age = data`
	`>>> first_name`
	`'Jorge'`
	`>>> last_name`
	`'Gonzalez'`
	`>>> age`
	`30`
	`>>> implementations = ("CPython", "PyPy", "Jython", "IronPython")`
	`>>> first, second = implementations`
	`Traceback (most recent call last):`
	`  File "<stdin>", line 1, in <module>`
	`ValueError: too many values to unpack (expected 2)`
	`>>> first, second = implementations[:2]`
	`>>> first`
	`'CPython'`
	`>>> second`
	`'PyPy'`
	`>>> penultimate, last = implementations[-2:]`
	`>>> penultimate`
	`'Jython'`
	`>>> last`
	`'IronPython'`
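
When the number of elements is not known in advance, Python 3 also accepts a starred name that collects whatever is left over into a list. A minimal sketch, with the names rest, head and middle chosen only for the example:

```python
implementations = ("CPython", "PyPy", "Jython", "IronPython")

# The starred name receives all remaining elements as a list.
first, *rest = implementations
print(first)    # CPython
print(rest)     # ['PyPy', 'Jython', 'IronPython']

# The star can also sit in the middle.
head, *middle, last = implementations
print(middle)   # ['PyPy', 'Jython']
```

Only one starred name is allowed per unpacking expression.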

More Methods and Utilities

The following methods have very descriptive names, so we will go straight to the examples.

	`# Add the elements to the end of the list.`
	`>>> a = [1, 2, 3, 4]`
	`>>> a.extend([5, 6, 7, 8, 9])`
	`>>> a`
	`[1, 2, 3, 4, 5, 6, 7, 8, 9]`

	`# Remove an item by index and get the removed value.`
	`>>> a = ["CPython", "PyPy", "Jython"]`
	`>>> removed = a.pop(1)`
	`>>> removed`
	`'PyPy'`
	`>>> a`
	`['CPython', 'Jython']`

	`# Get the index of an element by its value.`
	`>>> a = ["CPython", "PyPy", "Jython", "PyPy"]`
	`>>> a.index("Jython")`
	`2`
	`>>> a.index("CPython")`
	`0`
	`>>> a.index("PyPy") # Returns the index of the first match.`
	`1`
	`>>> a.index("IronPython") # If not found, raises an exception.`
	`Traceback (most recent call last):`
	`  File "<stdin>", line 1, in <module>`
	`ValueError: 'IronPython' is not in list`

	`# How many times does an item appear in the list?`
	`>>> a = ["Dog", "Cat", "Dog", "Horse", "Dog", "Cat"]`
	`>>> a.count("Dog")`
	`3`
	`>>> a.count("Horse")`
	`1`
	`>>> a.count("Cat")`
	`2`

	`# Sort elements from smallest to greatest.`
	`>>> a = [4, 7, 5, 6, 2, 1, 3]`
	`>>> a.sort()`
	`>>> a`
	`[1, 2, 3, 4, 5, 6, 7]`

	`# Strings are compared character by character,`
	`# using the Unicode code point of each character.`
	`>>> a = ["Hello", "Python", "Assets"]`
	`>>> a.sort()`
	`>>> a`
	`['Assets', 'Hello', 'Python']`
	`>>> ord("H")`
	`72`
	`>>> ord("P")`
	`80`
	`>>> ord("A")`
	`65`

	`# Reverse the order of all elements.`
	`>>> a = [1, 2, 3, 4, 5]`
	`>>> a.reverse()`
	`>>> a`
	`[5, 4, 3, 2, 1]`

	`# Can be used in conjunction with sort() to sort from greatest to smallest.`
	`>>> a = [4, 7, 5, 6, 2, 1, 3]`
	`>>> a.sort()`
	`>>> a.reverse()`
	`>>> a`
	`[7, 6, 5, 4, 3, 2, 1]`
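
As an alternative to calling sort() and then reverse(), the sort() method accepts a reverse argument. The sketch below also shows the sorted() built-in, which returns a new list and therefore works on tuples too, and the key argument, used here for a case-insensitive sort. The sample data is made up for the example.

```python
a = [4, 7, 5, 6, 2, 1, 3]
a.sort(reverse=True)                  # Sort in place, greatest first.
print(a)                              # [7, 6, 5, 4, 3, 2, 1]

t = (4, 7, 5)
print(sorted(t))                      # [4, 5, 7]  (a new list; t is unchanged)

words = ["python", "Hello", "assets"]
print(sorted(words))                  # ['Hello', 'assets', 'python']
print(sorted(words, key=str.lower))   # ['assets', 'Hello', 'python']
```

Uppercase letters have lower code points than lowercase ones, which is why "Hello" comes first in the plain sort.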

The zip() built-in function is very useful when we need to traverse two or more sequences at the same time. For example:

	`>>> headers = ("URL", "Title", "Subject")`
	`>>> values = ("pythonassets.com", "Python Assets", "The Python Programming Language")`
	`>>> zipped = zip(headers, values)`
	`>>> for name, value in zipped:`
	`...     print(f"{name}: {value}")`
	`...`
	`URL: pythonassets.com`
	`Title: Python Assets`
	`Subject: The Python Programming Language`

Another example:

	`>>> first_names = "Jorge", "Ricardo", "Carlos"`
	`>>> last_names = "González", "Medina", "Perez"`
	`>>> ages = 30, 25, 41`
	`>>>`
	`>>> for first_name, last_name, age in zip(first_names, last_names, ages):`
	`...     print(f"{first_name} {last_name} is {age} years old.")`
	`...`
	`Jorge González is 30 years old.`
	`Ricardo Medina is 25 years old.`
	`Carlos Perez is 41 years old.`
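
One detail worth keeping in mind is that zip() stops as soon as the shortest sequence is exhausted. The sketch below shows this, using itertools.zip_longest from the standard library as one possible way to keep the unmatched elements (the shortened ages tuple is made up for the example):

```python
from itertools import zip_longest

names = ("Jorge", "Ricardo", "Carlos")
ages = (30, 25)                       # One value short on purpose.

pairs = list(zip(names, ages))
print(pairs)                          # [('Jorge', 30), ('Ricardo', 25)]

# zip_longest() pads the shorter sequences with fillvalue instead.
padded = list(zip_longest(names, ages, fillvalue=None))
print(padded)                         # [('Jorge', 30), ('Ricardo', 25), ('Carlos', None)]
```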

Sets

Finally, a set is a collection of unique, unordered values, which means its elements cannot be repeated. Its elements cannot be accessed through an index either, as in lists or tuples, since it is an unordered collection of values.

A set is created by putting its elements between braces (like a dictionary, but without colons):

	`>>> a = {1, 2, 3}`
	`>>> type(a)`
	`<class 'set'>`

It can also be created from an iterable object (such as a list or tuple). For example:

	`>>> a = (1, 2, 3, 4)`
	`>>> b = set(a)`
	`>>> b`
	`{1, 2, 3, 4}`

If the iterable object has repeated values, they will be removed.

	`>>> a = (1, 1, 2, 3, 4, 3, 5, 6, 7, 7, 6)`
	`>>> b = set(a)`
	`>>> b`
	`{1, 2, 3, 4, 5, 6, 7}`

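Taking this into account, a simple way to remove repeated values from a list or tuple is the following. Note that, since sets are unordered, the original order of the elements is not necessarily preserved: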

	`b = list(set(a))`
	`d = tuple(set(c))`

Examples:

	`>>> a = (1, 1, 2, 2, 3, 3)`
	`>>> tuple(set(a))`
	`(1, 2, 3)`
	`>>> b = ["CPython", "IronPython", "Jython", "Jython", "CPython"]`
	`>>> list(set(b))`
	`['IronPython', 'Jython', 'CPython']`
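
If the original order matters, one possible alternative is dict.fromkeys(), which relies on dictionaries keeping insertion order (guaranteed since Python 3.7). This is a sketch of that idea, not the only way to do it:

```python
b = ["CPython", "IronPython", "Jython", "Jython", "CPython"]

# Dictionary keys are unique and keep the order of first appearance.
unique = list(dict.fromkeys(b))
print(unique)   # ['CPython', 'IronPython', 'Jython']
```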

Other set methods:

	`>>> a = {1, 2, 3, 4, 5}`
	`>>> a.add(6)`
	`>>> a`
	`{1, 2, 3, 4, 5, 6}`
	`>>> a.remove(2)`
	`>>> a`
	`{1, 3, 4, 5, 6}`
	`>>> 4 in a`
	`True`
	`>>> 2 in a`
	`False`
	`>>> 0 not in a`
	`True`
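
Sets also support the usual mathematical operations through operators. The following sketch uses two made-up sets and wraps the results in sorted() only so that the printed order is predictable. It also shows discard(), which, unlike remove(), does not raise an exception when the element is missing.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(sorted(a | b))   # [1, 2, 3, 4, 5]  (union)
print(sorted(a & b))   # [3, 4]           (intersection)
print(sorted(a - b))   # [1, 2]           (difference)

a.discard(10)          # Missing element: nothing happens.
try:
    a.remove(10)       # Missing element: KeyError.
except KeyError:
    remove_failed = True
    print("10 is not in the set")
```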