7.1. Alternative Approach

This section covers some alternative approaches to working with Python.

7.1.1. Simplify Null Checks in Python with the Maybe Container

!pip install returns
from typing import Optional


# Ticket is defined first so that the Ticket annotations in Event resolve
class Ticket:
    def __init__(self, price: float) -> None:
        self._price = price

    def get_price(self) -> float:
        return self._price


class Event:
    def __init__(self, ticket: Ticket) -> None:
        self._ticket = ticket

    def get_ticket(self) -> Ticket:
        return self._ticket


class Discount:
    def __init__(self, discount_amount: float) -> None:
        self.discount_amount = discount_amount

    def apply_discount(self, price: float) -> float:
        return price - self.discount_amount

Having multiple if x is not None: conditions can make the code deeply nested and unreadable.

def calculate_discounted_price(
    event: Optional[Event] = None, discount: Optional[Discount] = None
) -> Optional[float]:
    if event is not None:
        ticket = event.get_ticket()
        if ticket is not None:
            price = ticket.get_price()
            if discount is not None:
                return discount.apply_discount(price)
    return None


ticket = Ticket(100)
concert = Event(ticket)
discount = Discount(20)
calculate_discounted_price(concert, discount)
80
calculate_discounted_price()

The Maybe container from the returns library enhances code clarity through the bind_optional method, which applies a function to the result of the previous step only when that result is not None.

from returns.maybe import Maybe


def calculate_discounted_price(
    event: Optional[Event] = None, discount: Optional[Discount] = None
) -> Maybe[float]:
    return (
        Maybe.from_optional(event)
        .bind_optional(lambda event: event.get_ticket()) # called only when event exists
        .bind_optional(lambda ticket: ticket.get_price()) # called only when ticket exists
        .bind_optional(lambda price: discount.apply_discount(price)) # called only when price exists
    )

ticket = Ticket(100)
concert = Event(ticket)
discount = Discount(20)
calculate_discounted_price(concert, discount)
<Some: 80>
calculate_discounted_price()
<Nothing>

Link to returns.
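
To make the short-circuiting behavior described above more concrete, here is a minimal sketch of the idea behind bind_optional that uses only the standard library. SimpleMaybe and half_if_even are hypothetical names made up for this illustration. This is not how the returns library is implemented, only a small model of the behavior under that assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class SimpleMaybe:
    """Hypothetical stand-in for a Maybe container (not the returns API)."""

    value: Optional[Any] = None

    def bind_optional(self, func: Callable[[Any], Optional[Any]]) -> "SimpleMaybe":
        # Skip the function entirely when there is nothing to apply it to
        if self.value is None:
            return self
        # Otherwise wrap whatever the function returns (possibly None)
        return SimpleMaybe(func(self.value))


def half_if_even(n: int) -> Optional[int]:
    # Returns None for odd numbers to simulate a missing result
    return n // 2 if n % 2 == 0 else None


chained = SimpleMaybe(8).bind_optional(half_if_even).bind_optional(half_if_even)
stopped = SimpleMaybe(6).bind_optional(half_if_even).bind_optional(half_if_even)
print(chained.value)  # 2
print(stopped.value)  # None
```

In the second chain the first step yields 3, the second step yields None, and any further bind_optional call would be skipped without an explicit if check.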

7.1.2. Box: Using Dot Notation to Access Keys in a Python Dictionary

!pip install python-box[all]

Do you wish to use dict.key instead of dict['key'] to access the values inside a Python dictionary? If so, try Box.

Box is like a Python dictionary except that it allows you to access keys using dot notation. This makes the code cleaner when you want to access a key inside a nested dictionary like below.

from box import Box

food_box = Box({"food": {"fruit": {"name": "apple", "flavor": "sweet"}}})
print(food_box)
{'food': {'fruit': {'name': 'apple', 'flavor': 'sweet'}}}
print(food_box.food.fruit.name)
apple
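
For intuition about how dot access on a dictionary can work, the sketch below builds a tiny dict subclass with the standard library. DotDict is a hypothetical name for this illustration and is not Box's implementation; Box handles many more cases (conversion, defaults, serialization) than this sketch attempts.

```python
class DotDict(dict):
    """Hypothetical dict subclass with attribute-style access (not Box itself)."""

    def __getattr__(self, key):
        # __getattr__ runs only when normal attribute lookup fails
        try:
            value = self[key]
        except KeyError:
            raise AttributeError(key) from None
        # Wrap nested dictionaries so that chained dot access keeps working
        return DotDict(value) if isinstance(value, dict) else value


food = DotDict({"food": {"fruit": {"name": "apple", "flavor": "sweet"}}})
print(food.food.fruit.name)  # apple
```

One limitation of this sketch: keys that share a name with dict methods, such as items or keys, still resolve to the method, so they remain reachable only through bracket access.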

Link to Box.

7.1.3. decorator module: Write Shorter Python Decorators without Nested Functions

!pip install decorator

Have you ever wished to write a Python decorator with only one function instead of nested functions like below?

from time import time, sleep


def time_func_complex(func):
    def wrapper(*args, **kwargs):
        start_time = time()
        func(*args, **kwargs)
        end_time = time()
        print(
            f"""It takes {round(end_time - start_time, 3)} seconds to execute the function"""
        )

    return wrapper


@time_func_complex
def test_func_complex():
    sleep(1)


test_func_complex()
It takes 1.001 seconds to execute the function

If so, try decorator. In the code below, time_func_simple produces the same results as time_func_complex, but it is shorter and easier to write.

from decorator import decorator


@decorator
def time_func_simple(func, *args, **kwargs):
    start_time = time()
    func(*args, **kwargs)
    end_time = time()
    print(
        f"""It takes {round(end_time - start_time, 3)} seconds to execute the function"""
    )


@time_func_simple
def test_func_simple():
    sleep(1)


test_func_simple()
It takes 1.001 seconds to execute the function
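
Both timing decorators above discard the return value of the function they wrap. If the result is needed, a nested-function variant built on the standard library's functools.wraps might look like the sketch below. The names time_func and slow_add are illustrative and do not come from the decorator library.

```python
from functools import wraps
from time import perf_counter, sleep


def time_func(func):
    @wraps(func)  # keep the wrapped function's __name__ and docstring
    def wrapper(*args, **kwargs):
        start = perf_counter()
        result = func(*args, **kwargs)
        elapsed = perf_counter() - start
        print(f"It takes {round(elapsed, 3)} seconds to execute the function")
        return result  # pass the original return value back to the caller

    return wrapper


@time_func
def slow_add(a, b):
    sleep(0.1)
    return a + b


total = slow_add(1, 2)
print(total)  # 3
```

perf_counter is used here instead of time because it is intended for measuring short durations.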

Check out other things the decorator library can do.

7.1.4. Pipe: Use Infix Notation in Python

!pip install pipe

Normally, you might use nested parentheses like below to combine multiple functions.

nums = [1, 2, 3, 4, 5, 6]
list(
    filter(lambda x: x % 2 == 0, 
            map(lambda x: x ** 2, nums)
          )
)
[4, 16, 36]

If you want to increase the readability of your code by using pipes, try the library pipe. Below is an example using this library.

from pipe import select, where
list(
    nums
    | select(lambda x: x ** 2)
    | where(lambda x: x % 2 == 0)
)
[4, 16, 36]
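
The | syntax relies on Python's operator overloading. As a rough sketch of the mechanism, assuming nothing about pipe's actual internals, a small class that defines __ror__ is enough to reproduce the example. The Pipe, select, and where definitions below are hypothetical re-creations for illustration only.

```python
class Pipe:
    """Hypothetical helper that lets a function be applied with `|`."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Called for `left | self` when `left` does not know how to handle `|`
        return self.func(left)


def select(func):
    # Lazily apply func to every item, like map
    return Pipe(lambda iterable: map(func, iterable))


def where(predicate):
    # Lazily keep only the items for which predicate is true, like filter
    return Pipe(lambda iterable: filter(predicate, iterable))


nums = [1, 2, 3, 4, 5, 6]
result = list(nums | select(lambda x: x ** 2) | where(lambda x: x % 2 == 0))
print(result)  # [4, 16, 36]
```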

Link to my article on pipe.

Link to pipe.

7.1.5. PRegEx: Write Human-Readable Regular Expressions

!pip install pregex

RegEx is useful for extracting text that matches a pattern. However, regular expressions can be difficult to read and write. PRegEx allows you to build a RegEx pattern from more human-readable pieces.

In the code below, I use PRegEx to extract URLs from text.

from pregex.core.classes import AnyButWhitespace
from pregex.core.quantifiers import OneOrMore, Optional
from pregex.core.operators import Either


text = "You can find me through my website mathdatasimplified.com/ or GitHub https://github.com/khuyentran1401"

any_but_space = OneOrMore(AnyButWhitespace())
optional_scheme = Optional("https://")
domain = Either(".com", ".org")

pre = (
    optional_scheme
    + any_but_space
    + domain
    + any_but_space
)

pre.get_pattern()
'(?:https:\\/\\/)?\\S+(?:\\.com|\\.org)\\S+'
pre.get_matches(text)  
['mathdatasimplified.com/', 'https://github.com/khuyentran1401']
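
Because get_pattern returns an ordinary regular expression string, the generated pattern can also be passed to the standard re module. The snippet below is a small cross-check that copies the pattern printed above; the variable names are illustrative.

```python
import re

text = (
    "You can find me through my website mathdatasimplified.com/ "
    "or GitHub https://github.com/khuyentran1401"
)

# Same pattern string that pre.get_pattern() printed above
pattern = r"(?:https:\/\/)?\S+(?:\.com|\.org)\S+"
matches = re.findall(pattern, text)
print(matches)
# ['mathdatasimplified.com/', 'https://github.com/khuyentran1401']
```

Note that the trailing \S+ requires at least one character after the domain suffix, so a bare mathdatasimplified.com without the slash would not match this pattern.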

Full article about PRegEx.

Link to PRegEx.

7.1.6. parse: Extract Strings Using Brackets

!pip install parse

If you want to extract substrings from a string, but find it challenging to do so with RegEx, try parse. parse makes it easy to extract strings that are inside brackets.

from parse import parse 

# Get strings in the brackets
parse("I'll get some {} from {}", "I'll get some apples from Aldi")
<Result ('apples', 'Aldi') {}>

You can also make the brackets more readable by adding the field name to them.

# Specify the field names for the brackets
parse("I'll get some {items} from {store}", "I'll get some shirts from Walmart")
<Result () {'items': 'shirts', 'store': 'Walmart'}>

parse also allows you to get the string with a certain format.

# Get a digit and a word
r = parse("I saw {number:d} {animal:w}s", "I saw 3 deers")
r
<Result () {'number': 3, 'animal': 'deer'}>
r['number']
3
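
For comparison, here is roughly what the last example looks like with the standard re module. This is a hand-written equivalent for illustration, not the pattern that parse generates internally.

```python
import re

# Named groups play the role of {number:d} and {animal:w}
pattern = re.compile(r"I saw (?P<number>\d+) (?P<animal>\w+)s")
found = pattern.fullmatch("I saw 3 deers")

number = int(found.group("number"))  # the regex returns text, so convert by hand
animal = found.group("animal")
print(number, animal)  # 3 deer
```

The parse version states the same intent with less syntax and converts the digit to an int automatically.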

Link to parse.

7.1.7. Simplify Pattern Matching and Transformation in Python with Pampy

!pip install pampy

To simplify extracting and modifying complex Python objects, use Pampy. Pampy enables pattern matching across a variety of Python objects, including lists, dictionaries, tuples, and classes.

from pampy import match, HEAD, TAIL, _

nums = [1, 2, 3]
match(nums, [1, 2, _], lambda num: f"It's {num}")
"It's 3"
match(nums, [1, TAIL], lambda t: t)
[2, 3]
nums = [1, [2, 3], 4]

match(nums, [1, [_, 3], _], lambda a, b: [1, a, 3, b])
[1, 2, 3, 4]
pet = {"type": "dog", "details": {"age": 3}}

match(pet, {"details": {"age": _}}, lambda age: age)
3

Link to Pampy.

7.1.8. Dictdiffer: Find the Differences Between Two Dictionaries

!pip install dictdiffer

When comparing two complicated dictionaries, it is useful to have a tool that finds the differences between the two. Dictdiffer allows you to do exactly that.

from dictdiffer import diff, swap

user1 = {
    "name": "Ben", 
    "age": 25, 
    "fav_foods": ["ice cream"],
}

user2 = {
    "name": "Josh",
    "age": 25,
    "fav_foods": ["ice cream", "chicken"],
}
# find the difference between two dictionaries
result = diff(user1, user2)
list(result)
[('change', 'name', ('Ben', 'Josh')), ('add', 'fav_foods', [(1, 'chicken')])]
# swap the diff result
result = diff(user1, user2)
swapped = swap(result)
list(swapped)
[('change', 'name', ('Josh', 'Ben')),
 ('remove', 'fav_foods', [(1, 'chicken')])]
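
To see what a diff has to do at its simplest, the sketch below compares only the top-level keys of two dictionaries using the standard library. shallow_diff is a hypothetical helper for illustration; it does not reproduce Dictdiffer's output format, and it reports a changed list as a single change instead of descending into it the way Dictdiffer does.

```python
def shallow_diff(first: dict, second: dict) -> list:
    """Hypothetical helper: compares top-level keys only."""
    changes = []
    for key in sorted(first.keys() | second.keys()):
        if key not in second:
            changes.append(("remove", key, first[key]))
        elif key not in first:
            changes.append(("add", key, second[key]))
        elif first[key] != second[key]:
            changes.append(("change", key, (first[key], second[key])))
    return changes


user1 = {"name": "Ben", "age": 25, "fav_foods": ["ice cream"]}
user2 = {"name": "Josh", "age": 25, "fav_foods": ["ice cream", "chicken"]}
changes = shallow_diff(user1, user2)
print(changes)
# [('change', 'fav_foods', (['ice cream'], ['ice cream', 'chicken'])),
#  ('change', 'name', ('Ben', 'Josh'))]
```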

Link to Dictdiffer.

7.1.9. unyt: Manipulate and Convert Units in NumPy Arrays

!pip install unyt 

Working with NumPy arrays that have units can be difficult, as it is not immediately clear what the units are, which can lead to errors.

The unyt package solves this by providing a subclass of NumPy’s ndarray class that keeps track of units.

import numpy as np

temps = np.array([25, 30, 35, 40])

temps_f = (temps * 9/5) + 32
print(temps_f)
[ 77.  86.  95. 104.]
from unyt import degC, degF

# Create an array of temperatures in Celsius
temps = np.array([25, 30, 35, 40]) * degC

# Convert the temperatures to Fahrenheit
temps_f = temps.to(degF)
print(temps_f)
[ 77.  86.  95. 104.] °F

unyt arrays support standard NumPy array operations and functions while also preserving the units associated with the data.

temps_f.reshape(2, 2)
unyt_array([[ 77.,  86.],
            [ 95., 104.]], 'degF')
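
The core idea of storing a unit next to each value can be sketched in plain Python. The Temperature class below is a hypothetical, much-simplified illustration that supports only two units; it is not how unyt works internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Temperature:
    """Hypothetical value-plus-unit pair (not part of unyt)."""

    value: float
    unit: str  # either "degC" or "degF" in this sketch

    def to(self, unit: str) -> "Temperature":
        if unit == self.unit:
            return self
        if (self.unit, unit) == ("degC", "degF"):
            return Temperature(self.value * 9 / 5 + 32, "degF")
        if (self.unit, unit) == ("degF", "degC"):
            return Temperature((self.value - 32) * 5 / 9, "degC")
        # Refuse conversions this sketch does not know about
        raise ValueError(f"cannot convert {self.unit} to {unit}")


warm = Temperature(30, "degC").to("degF")
print(warm)  # Temperature(value=86.0, unit='degF')
```

Because the unit travels with the value, an unsupported conversion fails loudly instead of silently producing a wrong number.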

Link to unyt.

7.1.10. Map a Function Asynchronously with Prefect

!pip install -U prefect 

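Python's built-in map, like the list comprehension below, runs a function on each item of an iterable synchronously, so each call waits for the previous one to finish.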

from time import sleep


def add_one(x):
    sleep(2)
    return x + 1

def sync_map():
    b = [add_one(item) for item in [1, 2, 3]]

sync_map()

To speed up the execution, map a function asynchronously with Prefect.

from prefect import flow, task
from time import sleep
import warnings

warnings.simplefilter("ignore", UserWarning)

# Create a task
@task
def add_one(x):
    sleep(2)
    return x + 1

# Create a flow
@flow
def async_map():
    # Run a task for each element in the iterable
    b = add_one.map([1, 2, 3])


async_map()
09:04:49.018 | INFO    | prefect.engine - Created flow run 'copper-sponge' for flow 'async-map'
09:04:51.144 | INFO    | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-1' for task 'add_one'
09:04:51.148 | INFO    | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-1' for execution.
09:04:51.191 | INFO    | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-0' for task 'add_one'
09:04:51.192 | INFO    | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-0' for execution.
09:04:51.208 | INFO    | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-2' for task 'add_one'
09:04:51.210 | INFO    | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-2' for execution.
09:04:54.321 | INFO    | Task run 'add_one-3c3112ef-0' - Finished in state Completed()
09:04:54.362 | INFO    | Task run 'add_one-3c3112ef-2' - Finished in state Completed()
09:04:54.380 | INFO    | Task run 'add_one-3c3112ef-1' - Finished in state Completed()
09:04:54.685 | INFO    | Flow run 'copper-sponge' - Finished in state Completed('All states completed.')
[Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='ad7912161ab44a6d8359f8089a16202d')),
 Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='fe83574cd0df4fc5838ef902beb34f6b')),
 Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='ba18fe9c568845ecbad03c25df353655'))]
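
For a sense of what concurrent mapping buys without any orchestration layer, the same idea can be sketched with the standard library's ThreadPoolExecutor. This is a simplified analogy and not Prefect's execution model; the sleep is shortened to 0.2 seconds to keep the demo quick.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def add_one(x):
    sleep(0.2)  # shorter than the 2 seconds above to keep the demo quick
    return x + 1


start = perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # Executor.map submits every call up front and yields results in input order
    results = list(pool.map(add_one, [1, 2, 3])
    )
elapsed = perf_counter() - start

print(results)  # [2, 3, 4]
print(f"{elapsed:.2f} seconds")
```

With three workers the three calls overlap, so the total time should be close to one sleep interval rather than three. Prefect adds logging, state tracking, and retries on top of this kind of concurrency, as the log output above shows.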

Prefect is an open-source library that allows you to orchestrate and observe your data pipelines defined in Python. Check out the getting started tutorials for basic concepts of Prefect.