3.2. Itertools

itertools is a module in the Python standard library that provides iterator building blocks for efficient looping. This section shows some of its most useful functions.

3.2.1. itertools.combinations: A Better Way to Iterate Through Pairs of Values in a Python List

If you want to iterate over every pair of values in a list and the order does not matter ((a, b) is the same as (b, a)), a naive approach is to use two for-loops.

num_list = [1, 2, 3]
for i in num_list:
    for j in num_list:
        if i < j:
            print((i, j))
(1, 2)
(1, 3)
(2, 3)

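However, the two for-loops visit every ordered pair and throw away the ones that fail the i < j test, and each additional element in the combination needs another level of nesting. itertools.combinations yields each unordered pair exactly once: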

from itertools import combinations

# combinations returns an iterator, so loop over it directly
for pair in combinations(num_list, 2):
    print(pair)
(1, 2)
(1, 3)
(2, 3)
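
One difference worth knowing is that combinations pairs elements by position, not by value, so it behaves differently from the i < j loop when the list contains repeated values. The sketch below illustrates this with a hypothetical list named values that is not part of the example above:

```python
from itertools import combinations

values = [1, 1, 2]  # hypothetical list with a repeated value

# Value-based nested loops: the i < j test silently drops the (1, 1) pair.
loop_pairs = [(i, j) for i in values for j in values if i < j]

# combinations pairs elements by position, so the repeated value is kept.
comb_pairs = list(combinations(values, 2))

print(loop_pairs)  # [(1, 2), (1, 2)]
print(comb_pairs)  # [(1, 1), (1, 2), (1, 2)]
```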

3.2.2. itertools.product: Nested For-Loops in a Generator Expression

Are you using nested for-loops to experiment with different combinations of parameters?

params = {
    "learning_rate": [1e-1, 1e-2, 1e-3],
    "batch_size": [16, 32, 64],
}
for learning_rate in params["learning_rate"]:
    for batch_size in params["batch_size"]:
        combination = (learning_rate, batch_size)
        print(combination)
(0.1, 16)
(0.1, 32)
(0.1, 64)
(0.01, 16)
(0.01, 32)
(0.01, 64)
(0.001, 16)
(0.001, 32)
(0.001, 64)

If so, use itertools.product instead.

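product(A, B) yields the same tuples as the generator expression ((x, y) for x in A for y in B), so a single loop replaces the nested ones, and adding another parameter does not require another level of nesting. Unpacking params.values() with * passes each list of values to product as a separate argument.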

from itertools import product

params = {
    "learning_rate": [1e-1, 1e-2, 1e-3],
    "batch_size": [16, 32, 64],
}

for combination in product(*params.values()):
    print(combination)
(0.1, 16)
(0.1, 32)
(0.1, 64)
(0.01, 16)
(0.01, 32)
(0.01, 64)
(0.001, 16)
(0.001, 32)
(0.001, 64)
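
The tuples above carry only the values, not the parameter names. If named configurations are more convenient, one possible approach is to zip each tuple back onto the dictionary keys. This is a minimal sketch; the name configs is introduced here for illustration only:

```python
from itertools import product

params = {
    "learning_rate": [1e-1, 1e-2, 1e-3],
    "batch_size": [16, 32, 64],
}

# Pair every value tuple with the dictionary keys.
# Dictionaries preserve insertion order (Python 3.7+), so keys and values line up.
configs = [dict(zip(params.keys(), values)) for values in product(*params.values())]

print(len(configs))  # 9
print(configs[0])    # {'learning_rate': 0.1, 'batch_size': 16}
```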

3.2.3. itertools.starmap: Apply a Function With More Than One Argument to Elements in a List

map is a built-in function that applies a function to each element of an iterable. However, when each element is a tuple of arguments, map passes the whole tuple as a single argument, so a function that takes more than one argument fails.

def multiply(x: float, y: float):
    return x * y
nums = [(1, 2), (4, 2), (2, 5)]
list(map(multiply, nums))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_38110/240000324.py in <module>
      1 nums = [(1, 2), (4, 2), (2, 5)]
----> 2 list(map(multiply, nums))

TypeError: multiply() missing 1 required positional argument: 'y'

To apply a function with more than one argument to the elements of a list, use itertools.starmap. With starmap, each tuple in the list nums is unpacked and its elements are passed as separate arguments to the function multiply.

from itertools import starmap

list(starmap(multiply, nums))
[2, 8, 10]
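
To see where starmap fits, it may help to compare it with two equivalent spellings. The sketch below assumes the arguments could also be stored as two separate lists, xs and ys, which are hypothetical names not used elsewhere in this section:

```python
from itertools import starmap


def multiply(x: float, y: float) -> float:
    return x * y


nums = [(1, 2), (4, 2), (2, 5)]  # arguments packed as tuples
xs = [1, 4, 2]                   # hypothetical: the same arguments
ys = [2, 2, 5]                   # stored as separate lists

from_tuples = list(starmap(multiply, nums))     # unpacks each tuple
from_columns = list(map(multiply, xs, ys))      # map accepts several iterables
unpacked = [multiply(*pair) for pair in nums]   # manual unpacking

print(from_tuples, from_columns, unpacked)  # [2, 8, 10] [2, 8, 10] [2, 8, 10]
```

starmap is the natural choice when the arguments already arrive grouped as tuples; map with several iterables suits arguments stored in parallel lists.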

3.2.4. itertools.compress: Filter a List Using Booleans

A Python list cannot be indexed with another list of flags. Trying to do so raises a TypeError.

fruits = ["apple", "orange", "banana", "grape", "lemon"]
chosen = [1, 0, 0, 1, 1]
fruits[chosen]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_40588/2755098589.py in <module>
      1 fruits = ['apple', 'orange', 'banana', 'grape', 'lemon']
      2 chosen = [1, 0, 0, 1, 1]
----> 3 fruits[chosen]

TypeError: list indices must be integers or slices, not list

To filter a list using a list of booleans, use itertools.compress instead. It keeps each element whose corresponding selector is truthy:

from itertools import compress

list(compress(fruits, chosen))
['apple', 'grape', 'lemon']
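
As a rough mental model, compress behaves like a list comprehension over zip. The selectors can also be any iterable, including a generator expression; the length-based condition below is a made-up example, not something from the text above:

```python
from itertools import compress

fruits = ["apple", "orange", "banana", "grape", "lemon"]
chosen = [1, 0, 0, 1, 1]

with_compress = list(compress(fruits, chosen))

# Equivalent spelling without itertools
with_zip = [fruit for fruit, keep in zip(fruits, chosen) if keep]

# Selectors computed on the fly (hypothetical condition: names longer than 5 letters)
long_names = list(compress(fruits, (len(fruit) > 5 for fruit in fruits)))

print(with_compress)  # ['apple', 'grape', 'lemon']
print(with_zip)       # ['apple', 'grape', 'lemon']
print(long_names)     # ['orange', 'banana']
```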

3.2.5. itertools.groupby: Group Elements in an Iterable by a Key

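If you want to group elements in a list by a key, use itertools.groupby. groupby only groups consecutive elements that share the same key, so the list should be sorted by that key first. In the example below, I group the elements in the list by the first element of each tuple.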

from itertools import groupby

prices = [("apple", 3), ("orange", 2), ("apple", 4), ("orange", 1), ("grape", 3)]

key_func = lambda x: x[0]

# groupby only merges consecutive elements, so sort the list by the key first
prices.sort(key=key_func)

# Group elements in the list by the key
for key, group in groupby(prices, key_func):
    print(key, ":", list(group))
apple : [('apple', 3), ('apple', 4)]
grape : [('grape', 3)]
orange : [('orange', 2), ('orange', 1)]
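
The sorting step matters because groupby starts a new group every time the key changes. A small sketch of what happens when the sort is skipped, using the same data (the variable names unsorted_keys and sorted_keys are introduced here for illustration):

```python
from itertools import groupby

prices = [("apple", 3), ("orange", 2), ("apple", 4), ("orange", 1), ("grape", 3)]

key_func = lambda x: x[0]

# Without sorting, the same key appears in several separate groups.
unsorted_keys = [key for key, _ in groupby(prices, key_func)]

# sorted() returns a new list, so prices itself is left unchanged.
sorted_keys = [key for key, _ in groupby(sorted(prices, key=key_func), key_func)]

print(unsorted_keys)  # ['apple', 'orange', 'apple', 'orange', 'grape']
print(sorted_keys)    # ['apple', 'grape', 'orange']
```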

3.2.6. itertools.zip_longest: Zip Iterables of Different Lengths

zip aggregates elements from each of the iterables into tuples. However, zip stops at the shortest iterable, so the leftover elements of longer iterables are silently dropped.

fruits = ["apple", "orange", "grape"]
prices = [1, 2]
list(zip(fruits, prices))
[('apple', 1), ('orange', 2)]

To aggregate iterables of different lengths, use itertools.zip_longest. This function continues until the longest iterable is exhausted and pads the missing values with fillvalue, which defaults to None.

from itertools import zip_longest
list(zip_longest(fruits, prices, fillvalue="-"))
[('apple', 1), ('orange', 2), ('grape', '-')]
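
If fillvalue is omitted, the padding value is None. A short sketch with the same two lists:

```python
from itertools import zip_longest

fruits = ["apple", "orange", "grape"]
prices = [1, 2]

# No fillvalue given: missing entries are padded with None
pairs = list(zip_longest(fruits, prices))

print(pairs)  # [('apple', 1), ('orange', 2), ('grape', None)]
```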

3.2.7. itertools.dropwhile: Drop Elements in an Iterable Until a Condition Is False

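If you want to drop elements from the start of an iterable until a condition is false, use itertools.dropwhile. Once the condition returns false for an element, that element and every element after it are kept, and the condition is not checked again.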

from itertools import dropwhile

nums = [1, 2, 5, 2, 4]

# Drop every number until a number >= 5
list(dropwhile(lambda n: n < 5, nums))
[5, 2, 4]
word = 'abcNice!'

# Drop every char until a char is in upper case
chars = dropwhile(lambda char: char.islower(), word)
''.join(chars)
'Nice!'
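
Because dropwhile only trims the leading run of elements, it is not the same as filtering. The sketch below contrasts the two on the same nums list; the names dropped and filtered are introduced here for illustration:

```python
from itertools import dropwhile

nums = [1, 2, 5, 2, 4]

# dropwhile stops testing after the first element that fails the condition,
# so the later 2 and 4 are kept even though they are below 5.
dropped = list(dropwhile(lambda n: n < 5, nums))

# A filter tests every element and removes all values below 5.
filtered = [n for n in nums if n >= 5]

print(dropped)   # [5, 2, 4]
print(filtered)  # [5]
```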