4.9. Sort Rows or Columns of a DataFrame
4.9.1. set_categories in Pandas: Sort Categorical Column by a Specific Ordering
To sort a pandas DataFrame's categorical column in a custom order, such as small, medium, large, use the df.col.cat.set_categories() method with ordered=True.
import pandas as pd
df = pd.DataFrame(
    {"col1": ["large", "small", "mini", "medium", "mini"], "col2": [1, 2, 3, 4, 5]}
)
# Categories listed in the order they should sort
ordered_sizes = ["large", "medium", "small", "mini"]
df["col1"] = df["col1"].astype("category")
# The inplace parameter of set_categories is deprecated, so assign the result back
df["col1"] = df["col1"].cat.set_categories(ordered_sizes, ordered=True)
df.sort_values(by="col1")
| | col1 | col2 |
|---|---|---|
| 0 | large | 1 |
| 3 | medium | 4 |
| 1 | small | 2 |
| 2 | mini | 3 |
| 4 | mini | 5 |
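As a possible alternative, the same ordering can be declared once with `CategoricalDtype` and applied in a single `astype` call. This is a minimal sketch, not part of the original example: the name `size_dtype` is an assumption chosen for illustration, and the data is the same as above.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame(
    {"col1": ["large", "small", "mini", "medium", "mini"], "col2": [1, 2, 3, 4, 5]}
)

# size_dtype is an illustrative name: an ordered categorical type whose
# sort order follows the position of each label in the list
size_dtype = CategoricalDtype(
    categories=["large", "medium", "small", "mini"], ordered=True
)

# Convert the column and set the ordering in one step
df["col1"] = df["col1"].astype(size_dtype)

sorted_df = df.sort_values(by="col1")
print(sorted_df)
```

Because the column is ordered, comparisons such as `df["col1"] < "small"` also follow the declared order rather than alphabetical order.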