Hyperparameter Tuning Guide: Updated Practice and Reference Note
This guide provides a comprehensive list of hyperparameters for tree-based models, organized by their functionality and tailored for effective tuning in machine learning projects.
1. General Workflow for Hyperparameter Tuning
- Understand Your Model Type:
  - Bagging models (e.g., Random Forest) build trees independently and average their predictions.
  - Boosting models (e.g., XGBoost, LightGBM, CatBoost) build trees sequentially, each one correcting the errors of the previous ones.
- Key Objectives:
  - Prevent overfitting (when the model is too complex).
  - Prevent underfitting (when the model is too simple).
  - Balance model complexity, generalization, and training time.
- Steps:
  - Start with default hyperparameters.
  - Identify key parameters to tune based on model type and dataset size.
  - Use GridSearchCV or RandomizedSearchCV for systematic exploration (see the sketch after this list).
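As a concrete starting point, the sketch below runs a RandomizedSearchCV over a small Random Forest grid; the synthetic dataset, scoring metric, and n_iter budget are placeholder choices to replace with your own.

```python
# Minimal sketch of the workflow above: randomized search over a small
# Random Forest grid. Dataset, scoring, and n_iter are placeholder choices.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

param_distributions = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 10, None],
    'min_samples_leaf': [1, 2, 4],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=10,            # number of sampled configurations
    cv=5,                 # 5-fold cross-validation
    scoring='accuracy',
    random_state=42,
)
search.fit(X_train, y_train)
print(search.best_params_, round(search.best_score_, 3))
```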
 
2. Parameters for Each Model Type
A. Random Forest / Extra Trees / Bagging Models
These models do not use learning_rate. Focus on parameters related to tree complexity and sampling.
| Parameter | Purpose | Example Values |
| --- | --- | --- |
| n_estimators | Number of trees in the ensemble. | [10, 50, 100, 200] |
| max_depth | Maximum depth of each tree, used to control overfitting. | [3, 5, 10, None] |
| min_samples_split | Minimum number of samples required to split a node. | [2, 5, 10] |
| min_samples_leaf | Minimum number of samples required at a leaf node. | [1, 2, 4, 10] |
| max_features | Number of features considered at each split ('sqrt', 'log2', a fraction, or None for all features). | ['sqrt', 'log2', None] |
| bootstrap | Whether to draw bootstrap samples when building each tree. | [True, False] |
| criterion | Splitting criterion ('gini' or 'entropy' for classification; 'squared_error' for regression in current scikit-learn, formerly 'mse'). | ['gini', 'entropy'] |
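To make the mapping concrete, here is a short sketch of a RandomForestClassifier configured with values taken from the table; the specific values are arbitrary starting points, not recommendations.

```python
from sklearn.ensemble import RandomForestClassifier

# Arbitrary starting values drawn from the table above; tune for your data.
rf = RandomForestClassifier(
    n_estimators=200,        # number of trees
    max_depth=10,            # cap tree depth to limit overfitting
    min_samples_split=5,     # need at least 5 samples to split a node
    min_samples_leaf=2,      # need at least 2 samples in every leaf
    max_features='sqrt',     # consider sqrt(n_features) at each split
    bootstrap=True,          # sample rows with replacement for each tree
    criterion='gini',        # classification split criterion
    random_state=42,
)
```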
B. Gradient Boosting Models (e.g., XGBoost, LightGBM, CatBoost)
These models use learning_rate and require careful balancing between tree complexity, step size, and regularization.
| Parameter | Purpose | Example Values |
| --- | --- | --- |
| n_estimators | Number of boosting iterations (trees). | [100, 200, 300, 500] |
| learning_rate | Shrinkage applied to each tree's contribution (step size of the boosting updates). | [0.01, 0.05, 0.1, 0.2] |
| max_depth | Maximum depth of individual trees. | [3, 5, 7, 10] |
| subsample | Fraction of samples used to train each tree (helps avoid overfitting). | [0.6, 0.8, 1.0] |
| colsample_bytree | Fraction of features sampled for each tree. | [0.6, 0.8, 1.0] |
| colsample_bylevel | Fraction of features sampled at each depth level of a tree (XGBoost). | [0.6, 0.8, 1.0] |
| colsample_bynode | Fraction of features sampled at each split node (XGBoost). | [0.6, 0.8, 1.0] |
| gamma | Minimum loss reduction required to make a further split (regularization). | [0, 1, 5] |
| reg_alpha | L1 regularization term on weights (lasso). | [0, 0.1, 1] |
| reg_lambda | L2 regularization term on weights (ridge). | [0, 0.1, 1] |
| min_child_weight | Minimum sum of instance weights needed in a child node (regularization). | [1, 5, 10] |
| tree_method | Tree construction algorithm ('auto', 'exact', 'approx', 'hist', 'gpu_hist'). | ['auto', 'hist'] |
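For reference, here is a sketch of an XGBClassifier (the scikit-learn wrapper from the xgboost package) configured with several of the parameters above; the values are illustrative, not tuned.

```python
from xgboost import XGBClassifier

# Illustrative values only; tune with cross-validation for your data.
xgb = XGBClassifier(
    n_estimators=300,
    learning_rate=0.05,      # shrink each tree's contribution
    max_depth=5,
    subsample=0.8,           # row sampling per tree
    colsample_bytree=0.8,    # column sampling per tree
    gamma=1,                 # minimum loss reduction to split
    reg_alpha=0.1,           # L1 regularization
    reg_lambda=1.0,          # L2 regularization
    min_child_weight=5,
    tree_method='hist',      # fast histogram-based construction
    random_state=42,
)
```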
C. LightGBM-Specific Parameters
LightGBM grows trees leaf-wise and has its own parameters, such as num_leaves and the feature/bagging fraction settings.
| Parameter | Purpose | Example Values |
| --- | --- | --- |
| num_leaves | Maximum number of leaves per tree (the main complexity control). | [15, 31, 63, 127] |
| max_depth | Maximum depth of the tree (-1 means no limit). | [3, 5, 10, -1] |
| min_data_in_leaf | Minimum number of samples in a leaf, to prevent overfitting. | [20, 50, 100] |
| feature_fraction | Fraction of features used for each tree. | [0.6, 0.8, 1.0] |
| bagging_fraction | Fraction of data used for bagging. | [0.6, 0.8, 1.0] |
| bagging_freq | Frequency of bagging (0 means no bagging). | [0, 5, 10] |
| lambda_l1 | L1 regularization term on weights. | [0, 0.1, 1] |
| lambda_l2 | L2 regularization term on weights. | [0, 0.1, 1] |
| boosting_type | Boosting algorithm ('gbdt', 'dart', 'goss'). | ['gbdt', 'dart'] |
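Here is a sketch using LightGBM's scikit-learn wrapper (LGBMClassifier). The wrapper exposes several of these parameters under scikit-learn-style aliases, noted in the comments; the values are illustrative only.

```python
from lightgbm import LGBMClassifier

# Illustrative values; comments give the native LightGBM parameter names.
lgbm = LGBMClassifier(
    n_estimators=300,
    learning_rate=0.05,
    num_leaves=63,           # main complexity control for leaf-wise growth
    max_depth=-1,            # -1 = no depth limit
    min_child_samples=50,    # alias of min_data_in_leaf
    colsample_bytree=0.8,    # alias of feature_fraction
    subsample=0.8,           # alias of bagging_fraction
    subsample_freq=5,        # alias of bagging_freq
    reg_alpha=0.1,           # alias of lambda_l1
    reg_lambda=0.1,          # alias of lambda_l2
    boosting_type='gbdt',
    random_state=42,
)
```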
D. CatBoost-Specific Parameters
CatBoost has specialized handling for categorical features and its own hyperparameters.
| Parameter | Purpose | Example Values |
| --- | --- | --- |
| depth | Depth of the tree (similar to max_depth). | [3, 6, 10] |
| iterations | Number of trees (similar to n_estimators). | [100, 200, 500] |
| learning_rate | Step size for gradient boosting. | [0.01, 0.1, 0.2] |
| l2_leaf_reg | L2 regularization term on leaf weights. | [3, 5, 10] |
| border_count | Number of splits (borders) used to discretize numerical features. | [32, 64, 128] |
| bagging_temperature | Controls the intensity of Bayesian bootstrap sampling (0 = no randomness; higher values = more aggressive sampling). | [0, 1, 5] |
| one_hot_max_size | Maximum cardinality at which categorical features are one-hot encoded. | [2, 5, 10] |
| random_strength | Amount of randomness added to split scoring, to improve generalization. | [1, 5, 10] |
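Here is a sketch of a CatBoostClassifier using the parameters above; cat_features (your categorical column names or indices) is an assumption you would supply at fit time.

```python
from catboost import CatBoostClassifier

# Illustrative values; pass your categorical columns via cat_features at fit time.
cat = CatBoostClassifier(
    iterations=500,
    learning_rate=0.1,
    depth=6,
    l2_leaf_reg=3,
    border_count=128,           # splits used to discretize numeric features
    bagging_temperature=1,      # Bayesian bootstrap intensity
    one_hot_max_size=10,        # one-hot encode low-cardinality categoricals
    random_strength=1,
    verbose=0,                  # silence per-iteration logging
)
# cat.fit(X_train, y_train, cat_features=categorical_columns)
```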
3. Example Param Grids
Random Forest
```python
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 15],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['sqrt', 'log2', None]
}
```
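To show how this grid plugs into an exhaustive search, here is a minimal GridSearchCV sketch; X_train and y_train are assumed to already exist.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Exhaustive search over the grid above; assumes X_train, y_train exist.
grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    cv=5,
    scoring='accuracy',
    n_jobs=-1,      # use all available CPU cores
)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)
```

Note that this grid already has 3 × 3 × 3 × 3 × 3 = 243 combinations, or 1,215 fits with 5-fold cross-validation, so RandomizedSearchCV is often the more practical choice as grids grow.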
XGBoost
```python
param_grid = {
    'n_estimators': [100, 200, 300],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [3, 5, 7],
    'subsample': [0.6, 0.8, 1.0],
    'colsample_bytree': [0.6, 0.8, 1.0],
    'gamma': [0, 1, 5]
}
```
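For the XGBoost grid, a randomized search keeps the number of fits manageable; a sketch, again assuming X_train and y_train exist.

```python
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

# Samples 20 configurations from the grid above instead of trying all of them.
random_search = RandomizedSearchCV(
    estimator=XGBClassifier(tree_method='hist', random_state=42),
    param_distributions=param_grid,
    n_iter=20,
    cv=5,
    scoring='roc_auc',
    random_state=42,
    n_jobs=-1,
)
random_search.fit(X_train, y_train)
print(random_search.best_params_)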
LightGBM
```python
param_grid = {
    'n_estimators': [100, 200, 300],
    'learning_rate': [0.01, 0.05, 0.1],
    'num_leaves': [31, 63, 127],
    'feature_fraction': [0.6, 0.8, 1.0],
    'bagging_fraction': [0.6, 0.8, 1.0],
    'lambda_l1': [0, 0.1, 1],
    'lambda_l2': [0, 0.1, 1]
}
```
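One caveat: when searching with LGBMClassifier, feature_fraction, bagging_fraction, lambda_l1, and lambda_l2 are usually written with their wrapper aliases (colsample_bytree, subsample, reg_alpha, reg_lambda). Below is a sketch of the equivalent grid; subsample_freq is included because bagging only takes effect when it is greater than zero.

```python
from lightgbm import LGBMClassifier
from sklearn.model_selection import RandomizedSearchCV

# Same grid as above, rewritten with the scikit-learn wrapper's parameter names.
sklearn_param_grid = {
    'n_estimators': [100, 200, 300],
    'learning_rate': [0.01, 0.05, 0.1],
    'num_leaves': [31, 63, 127],
    'colsample_bytree': [0.6, 0.8, 1.0],   # feature_fraction
    'subsample': [0.6, 0.8, 1.0],          # bagging_fraction
    'subsample_freq': [1],                 # bagging_freq; needed for subsample to apply
    'reg_alpha': [0, 0.1, 1],              # lambda_l1
    'reg_lambda': [0, 0.1, 1],             # lambda_l2
}

search = RandomizedSearchCV(
    LGBMClassifier(random_state=42),
    param_distributions=sklearn_param_grid,
    n_iter=20,
    cv=5,
    random_state=42,
)
# search.fit(X_train, y_train)
```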
CatBoost
```python
param_grid = {
    'iterations': [100, 200, 500],
    'learning_rate': [0.01, 0.1, 0.2],
    'depth': [3, 6, 10],
    'l2_leaf_reg': [3, 5, 10],
    'random_strength': [1, 5, 10]
}
```
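CatBoost also ships a built-in grid_search method on the model object, which can be used in place of scikit-learn's GridSearchCV; a hedged sketch (keyword details may vary across catboost versions), assuming X_train and y_train exist.

```python
from catboost import CatBoostClassifier

# Uses CatBoost's built-in search; scikit-learn's GridSearchCV also works here.
model = CatBoostClassifier(verbose=0)
result = model.grid_search(
    param_grid,
    X=X_train,
    y=y_train,
    cv=3,
)
print(result['params'])   # best parameter combination found
```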
4. Summary
- Start Simple: Begin with the most impactful parameters (n_estimators, max_depth, learning_rate).
- Regularization: Use min_samples_split, min_child_weight, and lambda_l1/lambda_l2 to control overfitting.
- Iterate Gradually: Expand your param grid once you identify promising ranges.
- Specialized Models:
  - Random Forest: Focus on max_depth, n_estimators, and bootstrap.
  - Boosting Models: Balance learning_rate, n_estimators, and regularization.
This guide brings together the major hyperparameters for each model family to support effective, systematic tuning.