4.9. Sort Rows or Columns of a DataFrame

4.9.1. set_categories in Pandas: Sort Categorical Column by a Specific Ordering

To sort a pandas DataFrame's categorical column in a specific order, such as small, medium, large, use the df.col.cat.set_categories() method with ordered=True.

import pandas as pd 

df = pd.DataFrame(
    {"col1": ["large", "small", "mini", "medium", "mini"], "col2": [1, 2, 3, 4, 5]}
)
# Desired sort order, from first to last
ordered_sizes = ["large", "medium", "small", "mini"]

df["col1"] = df["col1"].astype("category")
# set_categories returns a new Series; the `inplace` parameter was removed in pandas 2.0
df["col1"] = df["col1"].cat.set_categories(ordered_sizes, ordered=True)
df.sort_values(by="col1")
     col1  col2
0   large     1
3  medium     4
1   small     2
2    mini     3
4    mini     5
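
As a variation on the example above, the ordering can also be attached in a single step with pd.CategoricalDtype. This is only a minimal sketch, not the approach used above: the names size_dtype and sorted_df are chosen here for illustration, and kind="stable" is added only so that the two mini rows keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame(
    {"col1": ["large", "small", "mini", "medium", "mini"], "col2": [1, 2, 3, 4, 5]}
)

# An ordered categorical dtype: the position in the list defines the sort order
size_dtype = pd.CategoricalDtype(
    categories=["large", "medium", "small", "mini"], ordered=True
)

# Convert the column and apply the custom order in one step
df["col1"] = df["col1"].astype(size_dtype)

# A stable sort keeps tied rows in their original order
sorted_df = df.sort_values(by="col1", kind="stable")
print(sorted_df)
```

Values that are missing from the categories list become NaN after the conversion, so the list should cover every value that appears in the column.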