TABLE OF CONTENTS
1. Core Categorical Properties (via .cat accessor)
df['cat_series'].cat.codes → Returns the integer code of each value's category
df['cat_series'].cat.categories → Returns the index of categories
df['cat_series'].cat.ordered → Returns True if the categories have a defined order
2. Category Management
df['cat_series'].cat.rename_categories(new_names) → Rename categories
df['cat_series'].cat.reorder_categories(new_order) → Reorder categories (new_order must contain exactly the existing categories)
df['cat_series'].cat.add_categories(new_cats) → Add new categories
df['cat_series'].cat.remove_categories(to_remove) → Remove specific categories (values in them become NaN)
df['cat_series'].cat.remove_unused_categories() → Remove categories that no value uses
df['cat_series'].cat.set_categories(new_cats) → Set new categories (values outside the new list become NaN)
3. Order Control
df['cat_series'].cat.as_ordered() → Set categories to be ordered
df['cat_series'].cat.as_unordered() → Remove ordering
Working with Categorical Data in Pandas
Let's create a sample dataset to demonstrate categorical operations:
python
import pandas as pd
data = {
'product_id': [101, 102, 103, 104, 105, 106, 107],
'product_name': ['Laptop', 'Tablet', 'Phone', 'Monitor', 'Keyboard', 'Mouse', 'Headphones'],
'category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Accessories', 'Accessories', 'Accessories'],
'size': ['Medium', 'Small', 'Small', 'Large', 'Small', 'Small', 'Medium'],
'priority': ['High', 'Medium', 'High', 'Low', 'Medium', 'Low', 'High']
}
df = pd.DataFrame(data)
# Convert to categorical
df['category'] = df['category'].astype('category')
df['size'] = df['size'].astype('category')
df['priority'] = df['priority'].astype('category')
1. Core Categorical Properties
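The three properties listed in the table of contents can be inspected as in the sketch below. It rebuilds the 'size' column as a standalone Series (the name `size` is only for illustration) so the snippet runs on its own. It assumes the default behaviour of the 'category' dtype, which infers categories from the data and sorts them lexically.

```python
import pandas as pd

# Stand-in for df['size'] from the sample dataset, rebuilt so the snippet is self-contained
size = pd.Series(
    ['Medium', 'Small', 'Small', 'Large', 'Small', 'Small', 'Medium'],
    dtype='category',
)

# Categories are inferred from the data and sorted lexically: Large, Medium, Small
categories = list(size.cat.categories)

# Each value is stored as the integer position of its category in that index
codes = size.cat.codes.tolist()

# A plain 'category' dtype is unordered unless ordering is requested explicitly
is_ordered = size.cat.ordered

print(categories)  # ['Large', 'Medium', 'Small']
print(codes)       # [1, 2, 2, 0, 2, 2, 1]
print(is_ordered)  # False
```

Note that the codes follow the lexical category order, not the logical Small < Medium < Large order, so they should not be read as sizes.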
2. Category Management
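A minimal sketch of the category management methods, using a standalone copy of the 'priority' column. The new labels ('P1', 'P2', 'P3', 'Critical') are made up for illustration. Each call returns a new Series and leaves the original untouched.

```python
import pandas as pd

# Stand-in for df['priority'] from the sample dataset
priority = pd.Series(
    ['High', 'Medium', 'High', 'Low', 'Medium', 'Low', 'High'],
    dtype='category',
)

# Rename categories with a mapping; the underlying values follow the new names
renamed = priority.cat.rename_categories({'High': 'P1', 'Medium': 'P2', 'Low': 'P3'})

# Add a category that no row uses yet (illustrative label)
extended = priority.cat.add_categories(['Critical'])

# Drop categories that no value refers to, here the unused 'Critical'
trimmed = extended.cat.remove_unused_categories()

# Removing a category that is in use turns its values into NaN
without_low = priority.cat.remove_categories(['Low'])

# set_categories keeps only the listed categories; other values become NaN
high_medium = priority.cat.set_categories(['High', 'Medium'])

print(renamed.tolist())
print(list(extended.cat.categories))
print(list(trimmed.cat.categories))
print(without_low.isna().sum(), high_medium.isna().sum())
```

Because remove_categories and set_categories silently introduce NaN, it may be worth checking isna() after calling them.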
3. Order Control
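The sketch below shows one way to give the 'priority' column a meaningful order. Calling as_ordered() directly would keep the inferred lexical order (High < Low < Medium), so this example first reorders the categories and marks them as ordered in the same call. The Series is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Stand-in for df['priority'] from the sample dataset
priority = pd.Series(
    ['High', 'Medium', 'High', 'Low', 'Medium', 'Low', 'High'],
    dtype='category',
)

# Put the categories in logical order and mark them as ordered
ordered = priority.cat.reorder_categories(['Low', 'Medium', 'High'], ordered=True)

# Ordered categoricals support comparisons against a category label
is_urgent = ordered >= 'Medium'

# min/max and sorting follow the category order, not the alphabet
lowest = ordered.min()
highest = ordered.max()
sorted_values = ordered.sort_values().tolist()

# as_unordered() drops the ordering but keeps the categories as they are
unordered = ordered.cat.as_unordered()

print(is_urgent.tolist())
print(lowest, highest)
print(sorted_values)
print(unordered.cat.ordered)
```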
Practical Applications
Best Practices
Common Pitfalls