3.1. Collections
collections is a built-in Python module that provides specialized container data types, such as Counter, namedtuple, defaultdict, OrderedDict, and ChainMap, as alternatives to the general-purpose dict, list, and tuple. This section shows some useful tools from this module.
3.1.1. collections.Counter: Count the Occurrences of Items in a List
Counting the occurrences of each item in a list with a hand-written for-loop is verbose and slower than it needs to be.
char_list = ["a", "b", "c", "a", "d", "b", "b"]
def custom_counter(list_: list):
    char_counter = {}
    for char in list_:
        if char not in char_counter:
            char_counter[char] = 1
        else:
            char_counter[char] += 1
    return char_counter
custom_counter(char_list)
{'a': 2, 'b': 3, 'c': 1, 'd': 1}
Using collections.Counter is more efficient, and all it takes is one line of code!
from collections import Counter
Counter(char_list)
Counter({'a': 2, 'b': 3, 'c': 1, 'd': 1})
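Beyond counting, a Counter comes with a few helpers that the hand-written version lacks. The snippet below is a minimal sketch that reuses the char_list from above; the variable names top_two and missing are only for illustration.

```python
from collections import Counter

char_list = ["a", "b", "c", "a", "d", "b", "b"]
counts = Counter(char_list)

# The two most frequent items, as (item, count) pairs
top_two = counts.most_common(2)
print(top_two)  # [('b', 3), ('a', 2)]

# Looking up an item that was never seen returns 0 instead of raising KeyError
missing = counts["z"]
print(missing)  # 0

# update() adds the counts from another iterable to the existing ones
counts.update(["a", "z"])
print(counts["a"], counts["z"])  # 3 1
```

Because a missing item counts as 0, there is no need to check whether a key exists before reading it.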
In my experiment, using Counter is more than 2 times faster than using a custom counter.
from timeit import timeit
import random
random.seed(0)
num_list = [random.randint(0, 22) for _ in range(1000)]
num_exp = 100  # number of times each statement is executed
custom_time = timeit("custom_counter(num_list)", globals=globals(), number=num_exp)
counter_time = timeit("Counter(num_list)", globals=globals(), number=num_exp)
print(custom_time / counter_time)
2.6199148843686806
3.1.2. namedtuple: A Lightweight Python Structure to Manage Your Data
If you need a small class to manage data in your project, consider using namedtuple.
A namedtuple object is like a tuple, but its fields can also be accessed by name, like the attributes of a regular Python class.
In the code below, I use namedtuple to create a Person type with the fields name and gender.
from collections import namedtuple
Person = namedtuple("Person", "name gender")
oliver = Person("Oliver", "male")
khuyen = Person("Khuyen", "female")
oliver
Person(name='Oliver', gender='male')
khuyen
Person(name='Khuyen', gender='female')
Just like with a regular Python class, you can access the fields of a namedtuple using obj.attr.
oliver.name
'Oliver'
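Since a namedtuple is still a tuple, it also supports indexing and unpacking, and its instances are immutable. The sketch below illustrates this with the same Person type; the names olivia, as_dict, and mutated are only for illustration.

```python
from collections import namedtuple

Person = namedtuple("Person", "name gender")
oliver = Person("Oliver", "male")

# Tuple behavior: unpacking and indexing both work
name, gender = oliver
first_field = oliver[0]
print(name, gender, first_field)  # Oliver male Oliver

# Instances are immutable, so assigning to a field raises AttributeError
try:
    oliver.name = "Olly"
    mutated = True
except AttributeError:
    mutated = False
print(mutated)  # False

# _replace() returns a new instance with the given fields changed
olivia = oliver._replace(name="Olivia", gender="female")
print(olivia)  # Person(name='Olivia', gender='female')

# _asdict() converts the instance to a dictionary keyed by field name
as_dict = oliver._asdict()
print(as_dict["name"])  # Oliver
```

To change a value, create a new instance with _replace rather than mutating the existing one.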
3.1.3. defaultdict: Return a Default Value When a Key Is Not Available
If you want a Python dictionary that supplies a default value for missing keys, use defaultdict. When you look up a key that is not in the dictionary using square brackets, the default factory is called, and its result is stored under that key and returned.
from collections import defaultdict
classes = defaultdict(lambda: "Outside")
classes["Math"] = "B23"
classes["Physics"] = "D24"
classes["Spanish"]
'Outside'
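One detail that is easy to miss: only square-bracket access triggers the default, and doing so adds the key to the dictionary. The following sketch shows the difference, using the same classes example; the variable names are only for illustration.

```python
from collections import defaultdict

classes = defaultdict(lambda: "Outside")
classes["Math"] = "B23"

# .get() and `in` do not call the default factory
via_get = classes.get("Spanish")
present_before = "Spanish" in classes
print(via_get, present_before)  # None False

# Square-bracket access calls the factory and stores the result
room = classes["Spanish"]
present_after = "Spanish" in classes
print(room, present_after)  # Outside True

print(len(classes))  # 2
```

If you only want to read a value without adding the key, use .get() with an explicit fallback instead.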
3.1.4. defaultdict: Create a Dictionary with Lists as Values
If you want to create a dictionary whose values are lists, the cleanest way is to pass the list class to defaultdict.
from collections import defaultdict
# Instead of this
food_price = {"apple": [], "orange": []}
# Use this
food_price = defaultdict(list)
for i in range(1, 4):
    food_price["apple"].append(i)
    food_price["orange"].append(i)
print(food_price.items())
dict_items([('apple', [1, 2, 3]), ('orange', [1, 2, 3])])
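This pattern is most useful when the keys are not known in advance, for example when grouping records. Here is a minimal sketch; the records list is made-up data used only for illustration.

```python
from collections import defaultdict

# Hypothetical (food, price) records, not from a real dataset
records = [("apple", 1), ("orange", 3), ("apple", 2), ("banana", 5)]

prices_by_food = defaultdict(list)
for food, price in records:
    # No need to check whether the key exists: a missing key starts as []
    prices_by_food[food].append(price)

# Convert to a plain dict once grouping is done
grouped = dict(prices_by_food)
print(grouped)  # {'apple': [1, 2], 'orange': [3], 'banana': [5]}
```

Converting back to a regular dict at the end prevents later lookups from silently adding new keys.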
3.1.5. OrderedDict: Create an Ordered Python Dictionary
Comparing two Python dictionaries ignores the order of items.
unordered1 = {'a': 1, 'b': 2, 'c': 3}
unordered2 = {'b': 2, 'a': 1, 'c': 3}
unordered1 == unordered2
True
If you want the comparison to take the order of items into account, use OrderedDict instead.
from collections import OrderedDict
ordered1 = OrderedDict({'a': 1, 'b': 2, 'c': 3})
ordered2 = OrderedDict({'b': 2, 'a': 1, 'c': 3})
ordered1 == ordered2
False
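Note that the order-sensitive comparison applies only when both sides are OrderedDict objects. The sketch below shows this, along with two methods for reordering; the variable names are only for illustration.

```python
from collections import OrderedDict

ordered = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
plain = {"b": 2, "a": 1, "c": 3}

# Comparing an OrderedDict with a regular dict still ignores order
same_as_plain = ordered == plain
print(same_as_plain)  # True

# move_to_end() moves an existing key to the last position
ordered.move_to_end("a")
keys_after_move = list(ordered)
print(keys_after_move)  # ['b', 'c', 'a']

# popitem(last=False) removes and returns the first (oldest) item
oldest = ordered.popitem(last=False)
print(oldest)  # ('b', 2)
```

These reordering methods are the main reason to reach for OrderedDict, since regular dictionaries also preserve insertion order in Python 3.7 and later.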
3.1.6. ChainMap: Combine Multiple Dictionaries into One Unit
If you want to combine multiple dictionaries into one unit, collections.ChainMap is a good option. ChainMap lets you look up keys and values across several dictionaries through a single view, without copying them.
from collections import ChainMap
fruits = {'apple': 2, 'tomato': 1}
veggies = {'carrot': 3, 'tomato': 1}
food = ChainMap(fruits, veggies)
food.maps # get all contents
[{'apple': 2, 'tomato': 1}, {'carrot': 3, 'tomato': 1}]
list(food.keys()) # Get keys
['carrot', 'tomato', 'apple']
list(food.values()) # Get values
[3, 1, 2]
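When the same key appears in more than one dictionary, the first dictionary in the chain wins. The sketch below illustrates lookups, writes, and layering; note that I changed the tomato value in veggies to 5 so the priority is visible, and the name overrides is only for illustration.

```python
from collections import ChainMap

fruits = {"apple": 2, "tomato": 1}
veggies = {"carrot": 3, "tomato": 5}  # tomato value altered for illustration
food = ChainMap(fruits, veggies)

# Lookups search the dictionaries left to right; the first match wins
tomato = food["tomato"]
print(tomato)  # 1

# Writes go to the first dictionary only
food["carrot"] = 10
print(fruits["carrot"], veggies["carrot"])  # 10 3

# ChainMap keeps references, so later changes to a dictionary are visible
veggies["pea"] = 4
pea = food["pea"]
print(pea)  # 4

# new_child() returns a new ChainMap with an extra dictionary in front
overrides = food.new_child({"apple": 0})
print(overrides["apple"], food["apple"])  # 0 2
```

This makes ChainMap a handy fit for layered settings, where overrides sit in front of defaults.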