7.1. Alternative Approach

This section covers some alternative approaches to working with Python.

7.1.1. Box: Using Dot Notation to Access Keys in a Python Dictionary

!pip install python-box[all]

Do you wish to use dict.key instead of dict['key'] to access the values inside a Python dictionary? If so, try Box.

Box is like a Python dictionary except that it allows you to access keys using dot notation. This makes the code cleaner when you want to access a key inside a nested dictionary like below.

from box import Box

food_box = Box({"food": {"fruit": {"name": "apple", "flavor": "sweet"}}})
print(food_box)
{'food': {'fruit': {'name': 'apple', 'flavor': 'sweet'}}}
print(food_box.food.fruit.name)
apple
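
To see why dot notation is possible at all, here is a minimal sketch that uses only the standard library. DotDict is a hypothetical class made up for this illustration; it is not how Box is implemented and it lacks most of Box's features.

```python
class DotDict(dict):
    """Hypothetical helper: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so dict methods such as .keys() keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dictionaries so that chained dot access works.
        return DotDict(value) if isinstance(value, dict) else value


food = DotDict({"food": {"fruit": {"name": "apple", "flavor": "sweet"}}})
print(food.food.fruit.name)  # apple
print(food["food"]["fruit"]["flavor"])  # sweet
```

Regular key access still works because DotDict is an ordinary dict underneath.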

Link to Box.

7.1.2. decorator module: Write Shorter Python Decorators without Nested Functions

!pip install decorator

Have you ever wished to write a Python decorator with only one function instead of nested functions like below?

from time import time, sleep


def time_func_complex(func):
    def wrapper(*args, **kwargs):
        start_time = time()
        result = func(*args, **kwargs)
        end_time = time()
        print(
            f"""It takes {round(end_time - start_time, 3)} seconds to execute the function"""
        )
        # Pass the wrapped function's return value back to the caller
        return result

    return wrapper


@time_func_complex
def test_func_complex():
    sleep(1)


test_func_complex()
It takes 1.001 seconds to execute the function

If so, try decorator. In the code below, time_func_simple produces exactly the same result as time_func_complex, but it is shorter and easier to write.

from decorator import decorator


@decorator
def time_func_simple(func, *args, **kwargs):
    start_time = time()
    result = func(*args, **kwargs)
    end_time = time()
    print(
        f"""It takes {round(end_time - start_time, 3)} seconds to execute the function"""
    )
    # Pass the wrapped function's return value back to the caller
    return result


@time_func_simple
def test_func_simple():
    sleep(1)


test_func_simple()
It takes 1.001 seconds to execute the function
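
The idea of writing a decorator as a single function that receives func as its first argument can be approximated with the standard library. The sketch below is only an illustration: single_function_decorator is a hypothetical helper, not the decorator library's code, and the real library does more, such as preserving the decorated function's signature.

```python
from functools import wraps
from time import perf_counter, sleep


def single_function_decorator(caller):
    """Hypothetical helper: turn caller(func, *args, **kwargs) into a decorator."""

    def decorate(func):
        @wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            return caller(func, *args, **kwargs)

        return wrapper

    return decorate


@single_function_decorator
def time_func(func, *args, **kwargs):
    start = perf_counter()
    result = func(*args, **kwargs)
    print(f"It takes {perf_counter() - start:.3f} seconds to execute the function")
    return result


@time_func
def add(a, b):
    sleep(0.1)
    return a + b


print(add(1, 2))
```

The nested functions have not disappeared; they have moved into the helper, which is written once and reused for every decorator.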

Check out other things the decorator library can do.

7.1.3. Pipe: Use Infix Notation in Python

!pip install pipe

Normally, you might use nested parentheses like below to combine multiple functions.

nums = [1, 2, 3, 4, 5, 6]
list(
    filter(
        lambda x: x % 2 == 0,
        map(lambda x: x ** 2, nums),
    )
)
[4, 16, 36]

If you want to increase the readability of your code by using pipes, try the library pipe. Below is an example using this library.

from pipe import select, where
list(
    nums
    | select(lambda x: x ** 2)
    | where(lambda x: x % 2 == 0)
)
[4, 16, 36]
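
If you are curious how the | syntax can work on a plain list, the sketch below reproduces the idea with Python's reflected-operator method __ror__. The Pipe class and the select and where functions here are simplified stand-ins written for this example; they are assumptions about the general technique, not the pipe library's source code.

```python
class Pipe:
    """Hypothetical helper: wrap a function so it can appear on the right of `|`."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, iterable):
        # `iterable | pipe` falls back to pipe.__ror__(iterable) because
        # lists and map objects do not define `|` themselves.
        return self.func(iterable)


def select(func):
    return Pipe(lambda iterable: map(func, iterable))


def where(predicate):
    return Pipe(lambda iterable: filter(predicate, iterable))


nums = [1, 2, 3, 4, 5, 6]
result = list(nums | select(lambda x: x ** 2) | where(lambda x: x % 2 == 0))
print(result)  # [4, 16, 36]
```

Each step receives the output of the previous one, so the code reads left to right in the order the operations are applied.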

Link to my article on pipe.

Link to pipe.

7.1.4. PRegEx: Write Human-Readable Regular Expressions

!pip install pregex

Regular expressions (RegEx) are useful for extracting text that matches a pattern, but they can be difficult to read and write. PRegEx lets you build a RegEx from human-readable components.

In the code below, I use PRegEx to extract URLs from text.

from pregex.core.classes import AnyButWhitespace
from pregex.core.quantifiers import OneOrMore, Optional
from pregex.core.operators import Either


text = "You can find me through my website mathdatasimplified.com/ or GitHub https://github.com/khuyentran1401"

any_but_space = OneOrMore(AnyButWhitespace())
optional_scheme = Optional("https://")
domain = Either(".com", ".org")

pre = (
    optional_scheme
    + any_but_space
    + domain
    + any_but_space
)

pre.get_pattern()
'(?:https:\\/\\/)?\\S+(?:\\.com|\\.org)\\S+'
pre.get_matches(text)  
['mathdatasimplified.com/', 'https://github.com/khuyentran1401']
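
Because the generated pattern is an ordinary regular expression, you can check it with the standard re module. The snippet below copies the pattern string printed by pre.get_pattern() above and applies it to the same text.

```python
import re

text = (
    "You can find me through my website mathdatasimplified.com/ "
    "or GitHub https://github.com/khuyentran1401"
)

# The pattern string printed by pre.get_pattern() above.
pattern = r"(?:https:\/\/)?\S+(?:\.com|\.org)\S+"

matches = re.findall(pattern, text)
print(matches)
# ['mathdatasimplified.com/', 'https://github.com/khuyentran1401']
```

Note that the trailing \S+ requires at least one character after .com or .org, so a bare domain such as example.com followed by a space would not match this particular pattern.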

Full article about PRegEx.

Link to PRegEx.

7.1.5. parse: Extract Strings Using Brackets

!pip install parse

If you want to extract substrings from a string but find it challenging to do so with RegEx, try parse. parse works like str.format in reverse: you write a format string with curly brackets as placeholders, and parse returns the text that fills each placeholder.

from parse import parse 

# Get strings in the brackets
parse("I'll get some {} from {}", "I'll get some apples from Aldi")
<Result ('apples', 'Aldi') {}>

You can also make the brackets more readable by adding the field name to them.

# Specify the field names for the brackets
parse("I'll get some {items} from {store}", "I'll get some shirts from Walmart")
<Result () {'items': 'shirts', 'store': 'Walmart'}>

parse also allows you to specify the format of each field. In the example below, :d matches digits and returns them as an integer, while :w matches a word.

# Get a digit and a word
r = parse("I saw {number:d} {animal:w}s", "I saw 3 deers")
r
<Result () {'number': 3, 'animal': 'deer'}>
r['number']
3
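
For comparison, here is roughly what the last example looks like with the standard re module. The regular expression is a hand-written approximation of the format string, not the pattern parse generates internally, and the conversion to int has to be done by hand.

```python
import re

# Hand-written regex that plays the role of "I saw {number:d} {animal:w}s".
pattern = re.compile(r"I saw (?P<number>\d+) (?P<animal>\w+)s")

match = pattern.fullmatch("I saw 3 deers")
fields = match.groupdict()
fields["number"] = int(fields["number"])  # re returns strings, so convert manually
print(fields)  # {'number': 3, 'animal': 'deer'}
```

The format-string version is shorter and states the expected types next to the field names.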

Link to parse.

7.1.6. Dictdiffer: Find the Differences Between Two Dictionaries

!pip install dictdiffer

When comparing two complicated dictionaries, it is useful to have a tool that finds the differences between the two. Dictdiffer allows you to do exactly that.

from dictdiffer import diff, swap

user1 = {
    "name": "Ben", 
    "age": 25, 
    "fav_foods": ["ice cream"],
}

user2 = {
    "name": "Josh",
    "age": 25,
    "fav_foods": ["ice cream", "chicken"],
}
# find the difference between two dictionaries
result = diff(user1, user2)
list(result)
[('change', 'name', ('Ben', 'Josh')), ('add', 'fav_foods', [(1, 'chicken')])]
# swap the diff result
result = diff(user1, user2)
swapped = swap(result)
list(swapped)
[('change', 'name', ('Josh', 'Ben')),
 ('remove', 'fav_foods', [(1, 'chicken')])]
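
To get a feel for what such a tool has to do, here is a deliberately simplified sketch that compares only top-level keys. flat_diff is a hypothetical function written for this example; it borrows the ('change', key, (old, new)) tuple shape from the output above but is not Dictdiffer's algorithm.

```python
def flat_diff(first, second):
    """Hypothetical helper: compare top-level keys only, without recursion."""
    changes = []
    for key in first.keys() | second.keys():
        if key not in second:
            changes.append(("remove", key, first[key]))
        elif key not in first:
            changes.append(("add", key, second[key]))
        elif first[key] != second[key]:
            changes.append(("change", key, (first[key], second[key])))
    # Sort by key so the output order is deterministic.
    return sorted(changes, key=lambda change: change[1])


user1 = {"name": "Ben", "age": 25, "fav_foods": ["ice cream"]}
user2 = {"name": "Josh", "age": 25, "fav_foods": ["ice cream", "chicken"]}

for change in flat_diff(user1, user2):
    print(change)
# ('change', 'fav_foods', (['ice cream'], ['ice cream', 'chicken']))
# ('change', 'name', ('Ben', 'Josh'))
```

Unlike Dictdiffer, this sketch reports the whole fav_foods list as changed instead of pointing to the single element added at index 1, which is why a dedicated library is more useful for nested data.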

Link to Dictdiffer.

7.1.7. Map a Function Asynchronously with Prefect

!pip install -U prefect 

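Mapping a function over an iterable with a list comprehension (or the built-in map) runs it synchronously: each call has to finish before the next one starts.

from time import sleep


def add_one(x):
    sleep(2)
    return x + 1

def sync_map():
    # Three calls run one after another
    b = [add_one(item) for item in [1, 2, 3]]

sync_map()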

To speed up the execution, map a function asynchronously with Prefect.

from prefect import flow, task
from time import sleep
import warnings

warnings.simplefilter("ignore", UserWarning)

# Create a task
@task
def add_one(x):
    sleep(2)
    return x + 1

# Create a flow
@flow
def async_map():
    # Run a task for each element in the iterable
    b = add_one.map([1, 2, 3])


async_map()
09:04:49.018 | INFO    | prefect.engine - Created flow run 'copper-sponge' for flow 'async-map'
09:04:51.144 | INFO    | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-1' for task 'add_one'
09:04:51.148 | INFO    | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-1' for execution.
09:04:51.191 | INFO    | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-0' for task 'add_one'
09:04:51.192 | INFO    | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-0' for execution.
09:04:51.208 | INFO    | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-2' for task 'add_one'
09:04:51.210 | INFO    | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-2' for execution.
09:04:54.321 | INFO    | Task run 'add_one-3c3112ef-0' - Finished in state Completed()
09:04:54.362 | INFO    | Task run 'add_one-3c3112ef-2' - Finished in state Completed()
09:04:54.380 | INFO    | Task run 'add_one-3c3112ef-1' - Finished in state Completed()
09:04:54.685 | INFO    | Flow run 'copper-sponge' - Finished in state Completed('All states completed.')
[Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='ad7912161ab44a6d8359f8089a16202d')),
 Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='fe83574cd0df4fc5838ef902beb34f6b')),
 Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='ba18fe9c568845ecbad03c25df353655'))]
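
For comparison, the same idea of running the calls concurrently instead of one after another can be sketched with the standard library's ThreadPoolExecutor. This is not what Prefect does internally, and it gives you none of Prefect's logging or state tracking; the sleep is shortened to 0.2 seconds only to keep the demo quick.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def add_one(x):
    sleep(0.2)  # shortened from 2 seconds to keep the demo quick
    return x + 1


start = perf_counter()
with ThreadPoolExecutor(max_workers=3) as executor:
    # executor.map submits all calls up front and yields results in input order
    results = list(executor.map(add_one, [1, 2, 3]))
elapsed = perf_counter() - start

print(results)  # [2, 3, 4]
print(f"{elapsed:.2f} seconds")
```

With three workers the calls overlap, so the total time should be close to one sleep rather than the sum of all three.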

Prefect is an open-source library that allows you to orchestrate and observe your data pipelines defined in Python. Check out the getting started tutorials for basic concepts of Prefect.