7.1. Alternative Approach
This section covers some alternative approaches to working with Python.
7.1.1. Box: Using Dot Notation to Access Keys in a Python Dictionary
!pip install python-box[all]
Do you wish to use dict.key instead of dict['key'] to access the values inside a Python dictionary? If so, try Box.
Box is like a Python dictionary except that it allows you to access keys using dot notation. This makes the code cleaner when you want to access a key inside a nested dictionary like below.
from box import Box
food_box = Box({"food": {"fruit": {"name": "apple", "flavor": "sweet"}}})
print(food_box)
{'food': {'fruit': {'name': 'apple', 'flavor': 'sweet'}}}
print(food_box.food.fruit.name)
apple
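To see why dot access on a dictionary is possible at all, here is a minimal sketch that uses only the standard library. DotDict is a hypothetical class written for this illustration, not Box's actual implementation. It assumes keys are valid Python identifiers and only wraps nested dictionaries when they are read.

```python
class DotDict(dict):
    """Hypothetical dict subclass that exposes keys as attributes."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so dict methods such as .keys() keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dictionaries so chained dot access works.
        return DotDict(value) if isinstance(value, dict) else value


food = DotDict({"food": {"fruit": {"name": "apple", "flavor": "sweet"}}})
print(food.food.fruit.name)  # apple
```

A library such as Box covers cases this sketch ignores, for example assigning values through attributes.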
7.1.2. decorator module: Write Shorter Python Decorators without Nested Functions
!pip install decorator
Have you ever wished to write a Python decorator with only one function instead of nested functions like below?
from time import time, sleep

def time_func_complex(func):
    def wrapper(*args, **kwargs):
        start_time = time()
        func(*args, **kwargs)
        end_time = time()
        print(
            f"""It takes {round(end_time - start_time, 3)} seconds to execute the function"""
        )
    return wrapper

@time_func_complex
def test_func_complex():
    sleep(1)

test_func_complex()
It takes 1.001 seconds to execute the function
If so, try decorator. In the code below, time_func_simple produces the exact same results as time_func_complex, but time_func_simple is shorter and easier to write.
from decorator import decorator

@decorator
def time_func_simple(func, *args, **kwargs):
    start_time = time()
    func(*args, **kwargs)
    end_time = time()
    print(
        f"""It takes {round(end_time - start_time, 3)} seconds to execute the function"""
    )

@time_func_simple
def test_func_simple():
    sleep(1)

test_func_simple()
It takes 1.001 seconds to execute the function
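Note that both timing decorators above discard the return value of the function they wrap. If the decorated function returns something you need, a variant like the following sketch passes it through. It uses only the standard library; the names time_func_returning and slow_add are made up for this illustration, and functools.wraps is used so the decorated function keeps its original name.

```python
from functools import wraps
from time import perf_counter, sleep


def time_func_returning(func):
    """Hypothetical timing decorator that passes the result through."""

    @wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        start_time = perf_counter()
        result = func(*args, **kwargs)
        elapsed = perf_counter() - start_time
        print(f"It takes {round(elapsed, 3)} seconds to execute the function")
        return result

    return wrapper


@time_func_returning
def slow_add(a, b):
    sleep(0.1)
    return a + b


total = slow_add(1, 2)
print(total)  # 3
```

The same idea should carry over to the decorator-based version: store the value of func(*args, **kwargs) and return it at the end of time_func_simple.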
7.1.3. Pipe: Use Infix Notation in Python
!pip install pipe
Normally, you might use nested parentheses like below to combine multiple functions.
nums = [1, 2, 3, 4, 5, 6]
list(
    filter(
        lambda x: x % 2 == 0,
        map(lambda x: x ** 2, nums)
    )
)
[4, 16, 36]
If you want to increase the readability of your code by using pipes, try the library pipe. Below is an example using this library.
from pipe import select, where

list(
    nums
    | select(lambda x: x ** 2)
    | where(lambda x: x % 2 == 0)
)
[4, 16, 36]
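The infix style above relies on Python's operator overloading. The following is a minimal standard-library sketch of the idea, assuming a hypothetical Pipe class with simplified select and where helpers; it is not the pipe library's actual source.

```python
class Pipe:
    """Hypothetical wrapper that lets a function be applied with `|`."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, iterable):
        # Python calls this for `iterable | pipe_obj` when the left
        # operand does not implement `|` for a Pipe on its own.
        return self.func(iterable)


def select(func):
    # Apply func to every item (lazy, like map).
    return Pipe(lambda iterable: map(func, iterable))


def where(predicate):
    # Keep only the items for which predicate is true (lazy, like filter).
    return Pipe(lambda iterable: filter(predicate, iterable))


nums = [1, 2, 3, 4, 5, 6]
result = list(nums | select(lambda x: x ** 2) | where(lambda x: x % 2 == 0))
print(result)  # [4, 16, 36]
```

Each `|` hands the iterable on its left to the function stored on its right, so the steps read top to bottom in the order they run.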
7.1.4. PRegEx: Write Human-Readable Regular Expressions
!pip install pregex
Regular expressions (RegEx) are useful for extracting text that matches a pattern, but they can be difficult to read and write. PRegEx lets you build a RegEx from human-readable components.
In the code below, I use PRegEx to extract URLs from text.
from pregex.core.classes import AnyButWhitespace
from pregex.core.quantifiers import OneOrMore, Optional
from pregex.core.operators import Either
text = "You can find me through my website mathdatasimplified.com/ or GitHub https://github.com/khuyentran1401"
any_but_space = OneOrMore(AnyButWhitespace())
optional_scheme = Optional("https://")
domain = Either(".com", ".org")
pre = (
    optional_scheme
    + any_but_space
    + domain
    + any_but_space
)
pre.get_pattern()
'(?:https:\\/\\/)?\\S+(?:\\.com|\\.org)\\S+'
pre.get_matches(text)
['mathdatasimplified.com/', 'https://github.com/khuyentran1401']
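For comparison, the pattern string printed by get_pattern() can be passed directly to the standard library's re module. The sketch below assumes nothing beyond that printed pattern; the variable names are chosen for illustration.

```python
import re

text = (
    "You can find me through my website mathdatasimplified.com/ "
    "or GitHub https://github.com/khuyentran1401"
)

# Same pattern that pre.get_pattern() printed above, written as a raw string.
pattern = re.compile(r"(?:https:\/\/)?\S+(?:\.com|\.org)\S+")

matches = pattern.findall(text)
print(matches)

# The trailing \S+ requires at least one character after the domain,
# so a bare "example.com" followed by a space is not matched.
bare = pattern.findall("visit example.com today")
print(bare)
```

The second call shows a limitation of this particular pattern: a URL that ends right after .com or .org is skipped, which is worth keeping in mind before reusing it.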
7.1.5. parse: Extract Strings Using Brackets
!pip install parse
If you want to extract substrings from a string, but find it challenging to do so with RegEx, try parse. parse lets you write a template with curly-bracket placeholders, in the style of str.format, and returns the substrings that fall where the brackets are.
from parse import parse
# Get strings in the brackets
parse("I'll get some {} from {}", "I'll get some apples from Aldi")
<Result ('apples', 'Aldi') {}>
You can also make the brackets more readable by adding the field name to them.
# Specify the field names for the brackets
parse("I'll get some {items} from {store}", "I'll get some shirts from Walmart")
<Result () {'items': 'shirts', 'store': 'Walmart'}>
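parse also allows you to restrict a field with a format specifier. For example, :d matches digits and converts them to an integer, and :w matches word characters.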
# Get a digit and a word
r = parse("I saw {number:d} {animal:w}s", "I saw 3 deers")
r
<Result () {'number': 3, 'animal': 'deer'}>
r['number']
3
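To make the comparison with RegEx concrete, here is a rough standard-library equivalent of the last example. It is only a sketch: the mapping of {number:d} to \d+ and {animal:w} to \w+ is an approximation chosen for this illustration, not a statement about how parse works internally.

```python
import re

# {number:d} is approximated by a run of digits, {animal:w} by word characters.
pattern = re.compile(r"I saw (?P<number>\d+) (?P<animal>\w+)s")

match = pattern.fullmatch("I saw 3 deers")
fields = match.groupdict()

# re always returns strings, so the integer conversion is done by hand.
fields["number"] = int(fields["number"])
print(fields)  # {'number': 3, 'animal': 'deer'}
```

With parse, the named groups, the escaping, and the type conversion are all expressed by the template itself.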
7.1.6. Dictdiffer: Find the Differences Between Two Dictionaries
!pip install dictdiffer
When comparing two complicated dictionaries, it is useful to have a tool that finds the differences between the two. Dictdiffer allows you to do exactly that.
from dictdiffer import diff, swap
user1 = {
    "name": "Ben",
    "age": 25,
    "fav_foods": ["ice cream"],
}
user2 = {
    "name": "Josh",
    "age": 25,
    "fav_foods": ["ice cream", "chicken"],
}
# find the difference between two dictionaries
result = diff(user1, user2)
list(result)
[('change', 'name', ('Ben', 'Josh')), ('add', 'fav_foods', [(1, 'chicken')])]
# swap the diff result
result = diff(user1, user2)
swapped = swap(result)
list(swapped)
[('change', 'name', ('Josh', 'Ben')),
('remove', 'fav_foods', [(1, 'chicken')])]
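To illustrate what such a tool saves you from writing, here is a simplified top-level comparison using only the standard library. shallow_diff is a hypothetical helper written for this illustration; it is not Dictdiffer's algorithm, and unlike the output above it treats nested values such as lists as a whole instead of reporting the changed index.

```python
def shallow_diff(first, second):
    """Hypothetical top-level diff; nested values are compared as a whole."""
    changes = []
    # Sort the keys so the order of the result is deterministic.
    for key in sorted(first.keys() | second.keys()):
        if key not in second:
            changes.append(("remove", key, first[key]))
        elif key not in first:
            changes.append(("add", key, second[key]))
        elif first[key] != second[key]:
            changes.append(("change", key, (first[key], second[key])))
    return changes


user1 = {"name": "Ben", "age": 25, "fav_foods": ["ice cream"]}
user2 = {"name": "Josh", "age": 25, "fav_foods": ["ice cream", "chicken"]}

changes = shallow_diff(user1, user2)
for change in changes:
    print(change)
```

Here fav_foods is reported as one changed value containing both lists, whereas the Dictdiffer result above pinpoints the single added element.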
7.1.7. Map a Function Asynchronously with Prefect
!pip install -U prefect
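map runs a function for each item in an iterable synchronously, one item at a time. The list comprehension below behaves the same way, so three 2-second calls take about 6 seconds in total.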
from time import sleep

def add_one(x):
    sleep(2)
    return x + 1

def sync_map():
    b = [add_one(item) for item in [1, 2, 3]]

sync_map()
To speed up the execution, map a function asynchronously with Prefect.
from prefect import flow, task
from time import sleep
import warnings

warnings.simplefilter("ignore", UserWarning)

# Create a task
@task
def add_one(x):
    sleep(2)
    return x + 1

# Create a flow
@flow
def async_map():
    # Run a task for each element in the iterable
    b = add_one.map([1, 2, 3])

async_map()
09:04:49.018 | INFO | prefect.engine - Created flow run 'copper-sponge' for flow 'async-map'
09:04:51.144 | INFO | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-1' for task 'add_one'
09:04:51.148 | INFO | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-1' for execution.
09:04:51.191 | INFO | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-0' for task 'add_one'
09:04:51.192 | INFO | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-0' for execution.
09:04:51.208 | INFO | Flow run 'copper-sponge' - Created task run 'add_one-3c3112ef-2' for task 'add_one'
09:04:51.210 | INFO | Flow run 'copper-sponge' - Submitted task run 'add_one-3c3112ef-2' for execution.
09:04:54.321 | INFO | Task run 'add_one-3c3112ef-0' - Finished in state Completed()
09:04:54.362 | INFO | Task run 'add_one-3c3112ef-2' - Finished in state Completed()
09:04:54.380 | INFO | Task run 'add_one-3c3112ef-1' - Finished in state Completed()
09:04:54.685 | INFO | Flow run 'copper-sponge' - Finished in state Completed('All states completed.')
[Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='ad7912161ab44a6d8359f8089a16202d')),
Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='fe83574cd0df4fc5838ef902beb34f6b')),
Completed(message=None, type=COMPLETED, result=PersistedResult(type='reference', serializer_type='pickle', storage_block_id=UUID('45e1a1fc-bdc8-4f8d-8945-287d12b46d33'), storage_key='ba18fe9c568845ecbad03c25df353655'))]
Prefect is an open-source library that allows you to orchestrate and observe your data pipelines defined in Python. Check out the getting started tutorials for basic concepts of Prefect.
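For readers who want to see the speed-up from mapping concurrently without installing anything, here is a small standard-library sketch. It uses a thread pool, which is not how the Prefect example above is implemented, and it shortens the sleep to 0.2 seconds so it runs quickly; it provides none of Prefect's orchestration or logging.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep


def add_one(x):
    sleep(0.2)  # stand-in for slow, I/O-like work
    return x + 1


start = perf_counter()
with ThreadPoolExecutor(max_workers=3) as executor:
    # executor.map submits all calls up front and yields results in input order.
    results = list(executor.map(add_one, [1, 2, 3]))
elapsed = perf_counter() - start

print(results)  # [2, 3, 4]
print(f"{elapsed:.2f} seconds")
```

Because the three calls sleep at the same time, the total should be close to one sleep interval instead of three.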