Metrics for evaluating models.

Module: sklearn.metrics

TABLE OF CONTENTS

1. Foundations: Understanding Metrics and Scorers

Before diving into individual metrics, get familiar with how scikit-learn handles scoring.

  • Topics to Cover:
    • check_scoring → resolve a scoring argument into a scorer callable for an estimator.
    • get_scorer / get_scorer_names → retrieving predefined scorers and listing the available names.
    • make_scorer → turning custom functions into scorers.
  • Goal: Understand how metrics connect with model selection (e.g., GridSearchCV, cross_val_score); see the sketch below.
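
A minimal sketch, using a synthetic dataset from make_classification, of how named and custom scorers plug into cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, get_scorer, get_scorer_names, make_scorer
from sklearn.model_selection import cross_val_score

X, y = make_classification(random_state=0)

# Predefined scorers are registered by name.
assert "f1" in get_scorer_names()
f1_scorer = get_scorer("f1")

# make_scorer wraps a plain metric function into a scorer object.
custom_f1 = make_scorer(f1_score, average="binary")

# Either form plugs into cross_val_score / GridSearchCV via `scoring=`.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, scoring=custom_f1)
print(scores.mean())
```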

2. Classification Metrics (Most Used in Practice)

Start with binary classification → multiclass → multilabel.

  • Core Metrics (binary/multiclass):
    • accuracy_score
    • confusion_matrix / ConfusionMatrixDisplay
    • precision_score / recall_score / f1_score / fbeta_score
    • classification_report
    • roc_auc_score / roc_curve / RocCurveDisplay
    • precision_recall_curve / PrecisionRecallDisplay
    • average_precision_score
  • Advanced / Less Common:
    • balanced_accuracy_score
    • brier_score_loss
    • log_loss
    • matthews_corrcoef
    • cohen_kappa_score
    • jaccard_score
    • zero_one_loss
    • det_curve / DetCurveDisplay
    • top_k_accuracy_score
  • Specialized:
    • class_likelihood_ratios (diagnostic testing context).
    • precision_recall_fscore_support (per-class breakdown).
    • multilabel_confusion_matrix.
  • Goal: Be able to evaluate churn prediction, fraud detection, and spam detection models, choosing metrics that cope with class imbalance; see the sketch below.
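
A minimal sketch on a synthetic ~95/5 dataset, showing why accuracy alone misleads under class imbalance:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             classification_report, roc_auc_score)
from sklearn.model_selection import train_test_split

# Roughly 95/5 class split, mimicking fraud/churn-style imbalance.
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# Accuracy can look high just by favoring the majority class...
print("accuracy:", accuracy_score(y_te, y_pred))
# ...so check imbalance-aware views as well.
print("balanced accuracy:", balanced_accuracy_score(y_te, y_pred))
print("ROC AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
print(classification_report(y_te, y_pred))
```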

3. Regression Metrics (Continuous Predictions)

  • Core Metrics:
    • mean_absolute_error (MAE)
    • mean_squared_error (MSE)
    • root_mean_squared_error (RMSE)
    • r2_score (coefficient of determination)
  • Advanced Metrics:
    • median_absolute_error
    • explained_variance_score
    • max_error
    • mean_absolute_percentage_error (MAPE)
  • Specialized Loss Functions (for GLMs, quantile regression, deviance):
    • d2_absolute_error_score / d2_pinball_score / d2_tweedie_score
    • mean_pinball_loss
    • mean_poisson_deviance
    • mean_gamma_deviance
    • mean_tweedie_deviance
    • mean_squared_log_error / root_mean_squared_log_error
  • Goal: Be comfortable choosing between MAE, RMSE, R², and others depending on the forecasting/business context; see the sketch below.
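
A minimal sketch on synthetic predictions with one injected outlier, contrasting how MAE, RMSE, and R² react (root_mean_squared_error needs scikit-learn ≥ 1.4; on older versions use mean_squared_error(..., squared=False)):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score, root_mean_squared_error

rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
y_pred = y_true + rng.normal(scale=0.5, size=200)  # noisy but unbiased predictions
y_pred[0] += 10.0                                  # one large outlier

print("MAE :", mean_absolute_error(y_true, y_pred))      # robust to the outlier
print("RMSE:", root_mean_squared_error(y_true, y_pred))  # penalizes the outlier hard
print("R2  :", r2_score(y_true, y_pred))                 # fraction of variance explained
```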

4. Ranking & Information Retrieval Metrics

  • Use case: Recommendation systems, search relevance.
  • Metrics:
    • dcg_score (Discounted Cumulative Gain)
    • ndcg_score (Normalized DCG)
    • coverage_error
    • label_ranking_loss
    • label_ranking_average_precision_score
  • Goal: Learn how to measure the quality of an ordering rather than of raw predictions; see the sketch below.
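
A minimal sketch with hand-made relevance grades, showing that NDCG scores the induced ordering, not the raw scores:

```python
import numpy as np
from sklearn.metrics import dcg_score, ndcg_score

# One query: true graded relevances and the scores a hypothetical ranker produced.
true_relevance = np.asarray([[3, 2, 3, 0, 1]])
ranker_scores = np.asarray([[0.9, 0.8, 0.1, 0.2, 0.7]])

print("DCG :", dcg_score(true_relevance, ranker_scores))
print("NDCG:", ndcg_score(true_relevance, ranker_scores))  # 1.0 would be a perfect ordering
```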

5. Clustering Metrics

  • Supervised (with ground truth labels):
    • adjusted_rand_score / rand_score
    • mutual_info_score / adjusted_mutual_info_score / normalized_mutual_info_score
    • homogeneity_score / completeness_score / v_measure_score / homogeneity_completeness_v_measure
    • fowlkes_mallows_score
    • cluster.contingency_matrix / cluster.pair_confusion_matrix
  • Unsupervised (internal validation):
    • silhouette_score / silhouette_samples
    • calinski_harabasz_score
    • davies_bouldin_score
  • Goal: Understand how to judge the quality of KMeans/DBSCAN clusterings with and without ground truth; see the sketch below.
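
A minimal sketch on toy blobs, judging a KMeans clustering both with and without ground-truth labels:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

X, y_true = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# With labels: chance-adjusted agreement with the true partition.
print("ARI:", adjusted_rand_score(y_true, labels))
# Without labels: internal cohesion vs. separation, from the data alone.
print("silhouette:", silhouette_score(X, labels))
```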

6. Biclustering Metrics

  • Niche, but useful for gene expression data or matrix factorization; see the sketch below.
    • consensus_score
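
A minimal sketch, following the pattern of the scikit-learn biclustering example, comparing recovered biclusters to the generating ones:

```python
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters
from sklearn.metrics import consensus_score

data, rows, cols = make_biclusters(shape=(30, 30), n_clusters=3, random_state=0)
model = SpectralCoclustering(n_clusters=3, random_state=0).fit(data)

# Jaccard-based similarity between found and true biclusters (1.0 = perfect recovery).
print(consensus_score(model.biclusters_, (rows, cols)))
```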

7. Distance & Pairwise Metrics (Very Useful in ML)

  • Distances:
    • pairwise.euclidean_distances
    • pairwise.manhattan_distances
    • pairwise.cosine_distances / cosine_similarity
    • pairwise.nan_euclidean_distances
    • pairwise.haversine_distances
  • Kernels:
    • linear_kernel / rbf_kernel / polynomial_kernel / sigmoid_kernel
    • laplacian_kernel / chi2_kernel / additive_chi2_kernel
  • Utilities:
    • pairwise_distances / pairwise_distances_chunked
    • pairwise_distances_argmin / pairwise_distances_argmin_min
  • Goal: Learn which distance or kernel to use in KNN, SVM, and clustering; see the sketch below.
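
A minimal sketch on two tiny vectors, contrasting a distance, a similarity, and a kernel value:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, euclidean_distances, rbf_kernel

X = np.array([[0.0, 1.0], [1.0, 1.0]])
Y = np.array([[1.0, 0.0]])

print(euclidean_distances(X, Y))    # straight-line distance, shape (2, 1)
print(cosine_similarity(X, Y))      # angle-based similarity in [-1, 1]
print(rbf_kernel(X, Y, gamma=1.0))  # Gaussian kernel, as used by SVMs
```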

8. Visualization Tools for Metrics

  • ConfusionMatrixDisplay
  • RocCurveDisplay
  • PrecisionRecallDisplay
  • DetCurveDisplay
  • PredictionErrorDisplay
  • Goal: Practice making evaluation plots alongside raw scores; see the sketch below.
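
A minimal sketch, assuming matplotlib is installed; each Display class provides from_estimator and from_predictions constructors:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay, RocCurveDisplay
from sklearn.model_selection import train_test_split

X, y = make_classification(random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Plot directly from the fitted estimator and held-out data.
ConfusionMatrixDisplay.from_estimator(clf, X_te, y_te)
RocCurveDisplay.from_estimator(clf, X_te, y_te)
plt.show()
```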