FUNCTION: sklearn.metrics
Status: In progress
TABLE OF CONTENTS
1. Foundations: Understanding Metrics and Scorers
Before diving into individual metrics, get familiar with how scikit-learn handles scoring.
- Topics to Cover:
  - check_scoring → resolving a scoring argument into a scorer callable (falls back to the estimator's own score method when no scoring is given).
  - get_scorer, get_scorer_names → retrieving predefined scorers.
  - make_scorer → turning custom functions into scorers.
- Goal: Understand how metrics connect with model selection (e.g., GridSearchCV, cross_val_score).
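A minimal sketch of how these pieces fit together, assuming a synthetic imbalanced dataset and a LogisticRegression model (both chosen only for illustration). It looks up a predefined scorer by name, builds a custom F2 scorer with make_scorer, and passes both to cross_val_score.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import check_scoring, fbeta_score, get_scorer, make_scorer
from sklearn.model_selection import cross_val_score

# Synthetic, imbalanced binary data (illustrative only, not from the notes).
X, y = make_classification(n_samples=300, weights=[0.8, 0.2], random_state=0)
clf = LogisticRegression(max_iter=1000)

# A predefined scorer, looked up by its string name.
f1_scorer = get_scorer("f1")

# A custom scorer: F-beta with beta=2 weights recall above precision.
f2_scorer = make_scorer(fbeta_score, beta=2)

# check_scoring turns whatever was passed as `scoring` into a scorer callable.
resolved = check_scoring(clf, scoring="f1")

# Scorers plug directly into model-selection helpers.
f1_cv = cross_val_score(clf, X, y, cv=5, scoring=f1_scorer)
f2_cv = cross_val_score(clf, X, y, cv=5, scoring=f2_scorer)
print("mean F1:", f1_cv.mean())
print("mean F2:", f2_cv.mean())
```

The same scorer objects (or the string "f1") can be given to GridSearchCV through its scoring parameter.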
2. Classification Metrics (Most Used in Practice)
Start with binary classification → multiclass → multilabel.
- Core Metrics (binary/multiclass):
  - accuracy_score
  - confusion_matrix, ConfusionMatrixDisplay
  - precision_score, recall_score, f1_score, fbeta_score
  - classification_report
  - roc_auc_score, roc_curve, RocCurveDisplay
  - precision_recall_curve, PrecisionRecallDisplay
  - average_precision_score
- Advanced / Less Common:
  - balanced_accuracy_score
  - brier_score_loss
  - log_loss
  - matthews_corrcoef
  - cohen_kappa_score
  - jaccard_score
  - zero_one_loss
  - det_curve, DetCurveDisplay
  - top_k_accuracy_score
- Specialized:
  - class_likelihood_ratios (diagnostic testing context).
  - precision_recall_fscore_support (per-class breakdown).
  - multilabel_confusion_matrix.
- Goal: Be able to evaluate churn prediction, fraud detection, and spam detection using a metric that suits the class imbalance.
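To see why the choice of metric matters under class imbalance, here is a small sketch on made-up labels (8 negatives, 2 positives) with hypothetical predicted probabilities and an assumed 0.5 decision threshold. Accuracy looks reasonable while the imbalance-aware metrics tell a weaker story.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    balanced_accuracy_score,
    confusion_matrix,
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy imbalanced labels: 8 negatives, 2 positives (made-up data).
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# Hypothetical predicted probabilities for the positive class.
y_score = np.array([0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.7, 0.9, 0.35])
# Hard predictions at an assumed 0.5 threshold.
y_pred = (y_score >= 0.5).astype(int)

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)

results = {
    "accuracy": accuracy_score(y_true, y_pred),                    # 0.8
    "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),  # 0.6875
    "precision": precision_score(y_true, y_pred),                  # 0.5
    "recall": recall_score(y_true, y_pred),                        # 0.5
    "f1": f1_score(y_true, y_pred),                                # 0.5
    "mcc": matthews_corrcoef(y_true, y_pred),                      # 0.375
    # Threshold-free metrics use the scores, not the hard predictions.
    "roc_auc": roc_auc_score(y_true, y_score),                     # 0.875
    "average_precision": average_precision_score(y_true, y_score), # 0.75
}

print(cm)
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Only one of the two positives is caught, which accuracy (0.8) hides and recall, F1, and balanced accuracy expose.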
3. Regression Metrics (Continuous Predictions)
- Core Metrics:
  - mean_absolute_error (MAE)
  - mean_squared_error (MSE)
  - root_mean_squared_error (RMSE)
  - r2_score (coefficient of determination)
- Advanced Metrics:
  - median_absolute_error
  - explained_variance_score
  - max_error
  - mean_absolute_percentage_error (MAPE)
- Specialized Losses and Scores (for GLMs, quantile regression, deviance):
  - d2_absolute_error_score, d2_pinball_score, d2_tweedie_score
  - mean_pinball_loss
  - mean_poisson_deviance
  - mean_gamma_deviance
  - mean_tweedie_deviance
  - mean_squared_log_error, root_mean_squared_log_error
- Goal: Be comfortable choosing between MAE, RMSE, R², and others depending on the forecasting or business context.
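One way to build intuition for the MAE vs. RMSE choice is to compare two made-up sets of predictions with the same total absolute error: one spreads the error evenly, the other concentrates it in a single outlier. The sketch below computes RMSE as the square root of mean_squared_error so it also runs on scikit-learn releases older than 1.4, where root_mean_squared_error was introduced.

```python
import numpy as np
from sklearn.metrics import (
    max_error,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)

# Made-up targets and two hypothetical prediction sets.
y_true = np.array([10.0, 20.0, 30.0, 40.0])
pred_even = np.array([12.0, 22.0, 32.0, 42.0])     # every prediction off by 2
pred_outlier = np.array([10.0, 20.0, 30.0, 48.0])  # one prediction off by 8


def summarize(y_pred):
    """Return a dict of common regression metrics for one prediction set."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "rmse": float(np.sqrt(mse)),
        "r2": r2_score(y_true, y_pred),
        "max_error": max_error(y_true, y_pred),
    }


even = summarize(pred_even)        # mae 2.0, rmse 2.0, r2 0.968, max_error 2.0
outlier = summarize(pred_outlier)  # mae 2.0, rmse 4.0, r2 0.872, max_error 8.0

print("even   :", even)
print("outlier:", outlier)
```

Both prediction sets have the same MAE, but RMSE doubles for the one with the outlier. That makes RMSE the better fit when large misses are disproportionately costly, and MAE the better fit when every unit of error costs the same.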
4. Ranking & Information Retrieval Metrics
- Use case: Recommendation systems, search relevance.
- Metrics:
  - dcg_score (Discounted Cumulative Gain)
  - ndcg_score (Normalized DCG)
  - coverage_error
  - label_ranking_loss
  - label_ranking_average_precision_score
- Goal: Learn how to measure ranking quality instead of raw predictions.
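A small sketch of measuring ranking quality, assuming a single query with four items and made-up graded relevance labels. Only the order induced by the scores matters, not their absolute values.

```python
import numpy as np
from sklearn.metrics import dcg_score, ndcg_score

# One query, four items, graded relevance (made-up data).
# Both arrays have shape (n_queries, n_items).
true_relevance = np.array([[3, 2, 0, 1]])

# Hypothetical model scores that rank the items in the ideal order.
good_scores = np.array([[0.9, 0.8, 0.1, 0.4]])
# Hypothetical model scores that put an irrelevant item in second place.
poor_scores = np.array([[0.9, 0.2, 0.5, 0.1]])

ndcg_good = ndcg_score(true_relevance, good_scores)       # 1.0 for a perfect ordering
ndcg_poor = ndcg_score(true_relevance, poor_scores)       # below 1.0
ndcg_poor_at_2 = ndcg_score(true_relevance, poor_scores, k=2)  # top-2 positions only
dcg_poor = dcg_score(true_relevance, poor_scores)         # unnormalized gain

print("NDCG (good ranking):", ndcg_good)
print("NDCG (poor ranking):", ndcg_poor)
print("NDCG@2 (poor ranking):", ndcg_poor_at_2)
print("DCG (poor ranking):", dcg_poor)
```

Restricting to k=2 penalizes the poor ranking more heavily, because the mistake sits near the top of the list.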
5. Clustering Metrics
- Supervised (with ground truth labels):
  - adjusted_rand_score, rand_score
  - mutual_info_score, adjusted_mutual_info_score, normalized_mutual_info_score
  - homogeneity_score, completeness_score, v_measure_score, homogeneity_completeness_v_measure
  - fowlkes_mallows_score
  - cluster.contingency_matrix, cluster.pair_confusion_matrix
- Unsupervised (internal validation):
  - silhouette_score, silhouette_samples
  - calinski_harabasz_score
  - davies_bouldin_score
- Goal: Understand how to judge quality of KMeans/DBSCAN clustering with and without ground truth.
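The two families can be contrasted on a tiny example. The sketch below assumes six made-up 2D points forming two well-separated groups, clusters them with KMeans, and then scores the result once against ground truth (adjusted_rand_score) and once without it (silhouette_score).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Two tight, well-separated groups of points (made-up data).
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])
labels_true = [0, 0, 0, 1, 1, 1]

labels_pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# With ground truth: ARI ignores how cluster ids are numbered, so a perfect
# partition scores 1.0 even if KMeans calls the first group "1".
ari = adjusted_rand_score(labels_true, labels_pred)

# Without ground truth: silhouette compares within-cluster distances to
# distances to the nearest other cluster (range -1 to 1, higher is better).
sil = silhouette_score(X, labels_pred)

print("adjusted Rand index:", ari)
print("silhouette score:", sil)
```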
6. Biclustering Metrics
- Niche, but useful for gene expression data or matrix factorization.
  - consensus_score
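A minimal sketch of consensus_score, assuming a 2x2 data matrix and hand-written biclusters (illustrative only). Each set of biclusters is a (rows, columns) tuple of boolean indicator arrays, with one row of each array per bicluster.

```python
import numpy as np
from sklearn.metrics import consensus_score

# Reference biclusters: bicluster 0 = row 0 x column 1, bicluster 1 = row 1 x column 0.
a = (
    np.array([[True, False], [False, True]]),   # row indicators
    np.array([[False, True], [True, False]]),   # column indicators
)

# The same two biclusters listed in the opposite order.
b = (
    np.array([[False, True], [True, False]]),
    np.array([[True, False], [False, True]]),
)

# A hypothetical result whose first bicluster wrongly includes row 1 as well.
c = (
    np.array([[True, True], [False, True]]),
    np.array([[False, True], [True, False]]),
)

same = consensus_score(a, b)     # 1.0: ordering of biclusters does not matter
partial = consensus_score(a, c)  # 0.75: one bicluster matches only partly

print("a vs b:", same)
print("a vs c:", partial)
```

The score pairs up biclusters by best match (Jaccard similarity by default) and averages the similarities, so 1.0 means the two sets are identical.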
7. Distance & Pairwise Metrics (Very Useful in ML)
- Distances:
  - pairwise.euclidean_distances
  - pairwise.manhattan_distances
  - pairwise.cosine_distances / pairwise.cosine_similarity
  - pairwise.nan_euclidean_distances
  - pairwise.haversine_distances
- Kernels:
  - pairwise.linear_kernel, pairwise.rbf_kernel, pairwise.polynomial_kernel, pairwise.sigmoid_kernel
  - pairwise.laplacian_kernel, pairwise.chi2_kernel, pairwise.additive_chi2_kernel
- Utilities:
  - pairwise_distances, pairwise_distances_chunked
  - pairwise_distances_argmin, pairwise_distances_argmin_min
- Goal: Learn which distance or kernel to use in KNN, SVM, clustering.
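The following sketch compares a few of these functions on small made-up vectors, chosen so the results are easy to check by hand. The gamma value for the RBF kernel is an arbitrary assumption.

```python
import numpy as np
from sklearn.metrics import pairwise_distances, pairwise_distances_argmin
from sklearn.metrics.pairwise import (
    cosine_distances,
    euclidean_distances,
    manhattan_distances,
    rbf_kernel,
)

# Two points forming a 3-4-5 triangle with the origin.
A = np.array([[0.0, 0.0], [3.0, 4.0]])
eucl = euclidean_distances(A)   # off-diagonal entries are 5.0
manh = manhattan_distances(A)   # off-diagonal entries are 7.0

# The generic entry point gives the same result via a metric name.
manh_generic = pairwise_distances(A, metric="manhattan")

# Cosine distance ignores magnitude: [3, 4] and [6, 8] point the same way,
# while [4, -3] is orthogonal to both.
B = np.array([[3.0, 4.0], [6.0, 8.0], [4.0, -3.0]])
cos = cosine_distances(B)       # cos[0, 1] is 0.0, cos[0, 2] is 1.0

# RBF kernel: similarity exp(-gamma * squared Euclidean distance).
rbf = rbf_kernel(A, gamma=0.1)  # off-diagonal entries are exp(-2.5)

# Nearest-centre lookup, the core step of a KMeans assignment.
queries = np.array([[0.0, 1.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [6.0, 6.0]])
nearest = pairwise_distances_argmin(queries, centers)  # indices into centers

print(eucl, manh, cos, rbf, nearest, sep="\n")
```

Cosine distance suits data where direction matters more than scale (for example TF-IDF vectors), whereas Euclidean and Manhattan distances are sensitive to feature magnitudes and usually call for scaled features.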
8. Visualization Tools for Metrics
- ConfusionMatrixDisplay
- RocCurveDisplay
- PrecisionRecallDisplay
- DetCurveDisplay
- PredictionErrorDisplay
- Goal: Practice making evaluation plots alongside raw scores.