6.10. Visualization

This section covers some tools to visualize your data and model.

6.10.1. Graphviz: Create a Flowchart to Capture Your Ideas in Python

A flowchart is helpful for summarizing and visualizing your workflow. This also helps your team understand your workflow. Wouldn’t it be nice if you could create a flowchart using Python?

Graphviz makes it easy to create a flowchart like below.

!pip install graphviz
from graphviz import Graph 

# Instantiate a new Graph object
dot = Graph('Data Science Process', format='png')

# Add nodes
dot.node('A', 'Get Data')
dot.node('B', 'Clean, Prepare, & Manipulate Data')
dot.node('C', 'Train Model')
dot.node('D', 'Test Data')
dot.node('E', 'Improve')

# Connect these nodes
dot.edges(['AB', 'BC', 'CD', 'DE'])

# Save chart
dot.render('data_science_flowchart', view=True)
'data_science_flowchart.png'
dot 
../_images/visualization_6_0.svg

Link to graphviz

6.10.2. folium: Create an Interactive Map in Python

!pip install folium

If you want to create a map provided the location in a few lines of code, try folium. Folium is a Python library that allows you to create an interactive map.

import folium
m = folium.Map(location=[45.5236, -122.6750])

tooltip = 'Click me!'
folium.Marker([45.3288, -121.6625], popup='<i>Mt. Hood Meadows</i>',
              tooltip=tooltip).add_to(m)
m 
Make this Notebook Trusted to load map: File -> Trust Notebook

View the document of folium here.

I used this library to view the locations of the owners of top machine learning repositories. Pretty cool to see their locations through an interactive map.

6.10.3. dtreeviz: Visualize and Interpret a Decision Tree Model

!pip install dtreeviz

If you want to find an easy way to visualize and interpret a decision tree model, use dtreeviz.

from dtreeviz.trees import dtreeviz
from sklearn import tree
from sklearn.datasets import load_wine

wine = load_wine()
classifier = tree.DecisionTreeClassifier(max_depth=2)
classifier.fit(wine.data, wine.target)

vis = dtreeviz(
    classifier,
    wine.data,
    wine.target,
    target_name="wine_type",
    feature_names=wine.feature_names,
)

vis.view()
findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.
findfont: Font family ['Arial'] not found. Falling back to DejaVu Sans.

The image below shows the output of dtreeviz when applying it on DecisionTreeClassifier.

image

Link to dtreeviz.

6.10.4. HiPlot - High Dimensional Interactive Plotting

!pip install hiplot 

If you are tuning hyperparameters of your machine learning model, it can be difficult to understand the relationships between different combinations of hyperparameters and a specific metric.

That is when HiPlot comes in handy. HiPlot allows you to discover patterns in high-dimensional data using parallel plots like below.

import hiplot as hip
data = [{'lr': 0.001, 'loss': 10.0, 'r2': 0.8, 'optimizer': 'SGD'},
        {'lr': 0.01, 'loss': 2.5, 'r2': 0.9, 'optimizer': 'Adam'},
        {'lr': 0.1, 'loss': 4, 'r2': 0.86, 'optimizer': 'Adam'}]
hip.Experiment.from_iterable(data).display()