Introduction
Logistic Regression is a supervised learning algorithm used primarily for binary classification tasks—i.e., predicting whether an instance belongs to class 0 or class 1. Unlike linear regression, which predicts continuous values, logistic regression predicts probabilities and then classifies based on a decision threshold (commonly 0.5).
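As a quick illustration of that thresholding step, here is a minimal sketch (using NumPy on made-up scores z, not a fitted model):

import numpy as np

# Hypothetical linear scores z = β₀ + β₁x for four instances
z = np.array([-2.0, -0.3, 0.4, 3.1])

# The sigmoid squashes each score into a probability in (0, 1)
probabilities = 1 / (1 + np.exp(-z))

# Classify as class 1 when the probability meets the 0.5 threshold
labels = (probabilities >= 0.5).astype(int)

print(probabilities)  # approximately [0.119 0.426 0.599 0.957]
print(labels)         # [0 0 1 1]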
Related background topics:
- Types of Logistic Regression
- Assumptions of Logistic Regression
- Logistic Regression vs. Linear Regression
- Why Use Logistic Regression?
- Terminologies Involved in Logistic Regression
LogisticRegression Model
The syntax for the LogisticRegression class is shown below. This version includes all key parameters with their default or example values; use it as a template for advanced tuning.
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    penalty='l2',           # Regularization type
    dual=False,             # Dual formulation (only for 'liblinear' + 'l2')
    tol=1e-4,               # Tolerance for the stopping criterion
    C=1.0,                  # Inverse of regularization strength
    fit_intercept=True,     # Add an intercept term
    intercept_scaling=1,    # Intercept scaling (used when solver='liblinear')
    class_weight=None,      # Class weights (None, 'balanced', or a dict)
    random_state=8,         # Reproducibility
    solver='lbfgs',         # Optimization algorithm
    max_iter=100,           # Maximum iterations for convergence
    multi_class='auto',     # Strategy for multi-class classification
    verbose=0,              # Verbosity level
    warm_start=False,       # Reuse the previous solution on a new fit
    n_jobs=None,            # CPU cores for one-vs-rest parallelism (ignored with 'liblinear')
    l1_ratio=None           # Elastic net mixing parameter (only if penalty='elasticnet')
)
| Parameter | Description | Options | Default | Notes |
|---|---|---|---|---|
| penalty | Type of regularization applied | 'l2', 'l1', 'elasticnet', 'none' | 'l2' | 'l1' requires liblinear or saga; 'elasticnet' requires saga. 'none' = no regularization |
| C | Inverse of regularization strength (1/λ) | Float > 0 (e.g., 0.01, 1.0, 10.0) | 1.0 | Smaller values → stronger regularization |
| solver | Algorithm used to optimize the cost function | 'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga' | 'lbfgs' | Not all solvers support all penalties (see the penalty notes) |
| max_iter | Maximum number of iterations for the solver to converge | Integer | 100 | Increase if a convergence warning is issued |
| random_state | Seed for random number generation (used by some solvers when shuffling data) | None or integer | None | Ensures reproducibility when set to an integer |
| fit_intercept | Whether to calculate the intercept for the model | True, False | True | If False, assumes the data is already centered |
| intercept_scaling | Useful only when fit_intercept=True and solver='liblinear' | Float | 1 | Rarely used |
| class_weight | Adjusts weights inversely proportional to class frequencies | None, 'balanced', or dictionary {class_label: weight} | None | 'balanced' is useful for imbalanced datasets |
| dual | Formulation of the optimization problem (primal vs. dual) | False, True | False | Only applicable when solver='liblinear' and penalty='l2' |
| multi_class | Strategy for multi-class classification | 'auto', 'ovr', 'multinomial' | 'auto' | 'multinomial' works with lbfgs, newton-cg, sag, saga; liblinear supports only 'ovr' |
| n_jobs | Number of CPU cores used when parallelizing over classes with multi_class='ovr' | None (1 core), -1 (all cores), or integer | None | Ignored when solver='liblinear' |
| verbose | Verbosity level of solver output | 0, 1, or higher | 0 | Mainly for debugging |
| warm_start | Whether to reuse the previous solution as initialization | True, False | False | Can be used to continue fitting from a previous state |
| l1_ratio | Elastic net mixing parameter (only if penalty='elasticnet') | Float (0 ≤ l1_ratio ≤ 1) | None | l1_ratio=1 → pure L1, 0 → pure L2, in between → a mixture |
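To see C in action, here is a minimal sketch (synthetic data from make_classification, chosen purely for illustration) showing coefficients shrinking as regularization strengthens; expect the printed magnitudes to grow as C increases:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

for C in [0.01, 1.0, 100.0]:
    # Smaller C -> stronger L2 penalty -> coefficients pulled toward 0
    model = LogisticRegression(penalty='l2', C=C, solver='lbfgs', max_iter=1000)
    model.fit(X, y)
    print(f"C={C}: sum of |coefficients| = {np.abs(model.coef_).sum():.3f}")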
Explanation of Key Parameters
The table above serves as a practical reference guide; the parameters that most often matter in practice are:
- solver — the optimization algorithm (the "brain" of your model)
- penalty — the regularization type (your model's "self-control system")
- max_iter — the iteration limit (your model's "patience setting")
- C — the inverse of regularization strength
- class_weight — class weighting (your model's "fairness referee")
- dual — primal vs. dual formulation (your model's "problem-solving strategy")
- tol — the tolerance of the stopping criterion
- multi_class — the strategy for multi-class problems
- warm_start — whether to reuse the previous solution as initialization
- fit_intercept — whether to fit an intercept term
- intercept_scaling — intercept scaling for the liblinear solver
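As a quick compatibility check in code (a sketch, not exhaustive): elasticnet works only with the saga solver and requires l1_ratio, while liblinear supports l1 and l2 but not elasticnet:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Elastic net: supported only by 'saga', and l1_ratio must be set
enet = LogisticRegression(penalty='elasticnet', solver='saga',
                          l1_ratio=0.5, max_iter=5000).fit(X, y)

# Pure L1 (sparse) solution: supported by 'liblinear' and 'saga'
lasso = LogisticRegression(penalty='l1', solver='liblinear').fit(X, y)

print("elasticnet coefficients:", enet.coef_.round(3))
print("l1 coefficients:", lasso.coef_.round(3))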
Scikit-learn Implementation: Logistic Regression
Model Coefficients Interpretation
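The attributes below assume a fitted model; here is a minimal setup on synthetic two-feature data (make_classification and the variable names are assumptions for illustration) so that model.intercept_ and model.coef_ exist:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Two informative features so the printed coefficients map to (β₁, β₂)
X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression().fit(X_train, y_train)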
print("Intercept (β₀):", model.intercept_)
print("Coefficients (β₁, β₂):", model.coef_)
| Term | Interpretation |
|---|---|
| Intercept | Log-odds of class 1 when all features are 0 |
| Coefficients | Impact of each predictor on the log-odds of class 1 |
To interpret them as odds ratios, exponentiate the coefficients:
import numpy as np

np.exp(model.coef_)  # odds multiplier for a one-unit increase in each feature
For example, an odds ratio of 1.5 means that a one-unit increase in that feature multiplies the odds of class 1 by 1.5, holding the other features constant.
Logistic Regression Evaluation Metrics
| Metric | Meaning | Usage |
|---|---|---|
| Accuracy | Proportion of correct predictions | Simple, but may be misleading on imbalanced data |
| Confusion Matrix | Counts of TP, TN, FP, FN | Breaks down model performance |
| Precision | TP / (TP + FP) | Important when false positives are costly (e.g., loan approvals) |
| Recall | TP / (TP + FN) | Important when false negatives are costly (e.g., fraud detection) |
| F1-Score | Harmonic mean of precision and recall | Balanced metric for performance |
| ROC-AUC | Area under the ROC curve | Probability that the model ranks a positive example higher than a negative one |
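As a sketch, all of these can be computed with sklearn.metrics, assuming the fitted model and the X_test/y_test split from the setup above:

from sklearn.metrics import (accuracy_score, confusion_matrix, precision_score,
                             recall_score, f1_score, roc_auc_score)

y_pred = model.predict(X_test)               # hard labels at the 0.5 threshold
y_prob = model.predict_proba(X_test)[:, 1]   # probability of class 1 (for ROC-AUC)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
print("F1-score:", f1_score(y_test, y_pred))
print("ROC-AUC:", roc_auc_score(y_test, y_prob))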
Visualizing the Logistic Curve
import numpy as np
import matplotlib.pyplot as plt
z = np.linspace(-10, 10, 200)
sigmoid = 1 / (1 + np.exp(-z))
plt.plot(z, sigmoid)
plt.title("Sigmoid Function")
plt.xlabel("z")
plt.ylabel("σ(z)")
plt.grid(True)
plt.show()
Feature Engineering for Logistic Regression
| Task | Tools/Methods |
|---|---|
| Handle categorical data | One-hot encoding, label encoding |
| Standardize features | StandardScaler |
| Interaction terms | PolynomialFeatures or domain-driven logic |
| Outlier removal | Z-score or IQR method |
| Feature selection | Recursive Feature Elimination (RFE), L1 penalty (Lasso) |
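One way these steps fit together (a sketch; the column names 'age', 'income', and 'region' are hypothetical) is a ColumnTransformer feeding a logistic regression Pipeline:

from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Standardize numeric columns; one-hot encode the categorical column
preprocess = ColumnTransformer([
    ('num', StandardScaler(), ['age', 'income']),
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['region']),
])

pipe = Pipeline([
    ('prep', preprocess),
    ('clf', LogisticRegression(max_iter=1000)),
])
# pipe.fit(X_train, y_train)  # expects a DataFrame containing the columns above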
Prediction vs. Inference in Logistic Regression
| Aspect | Prediction | Inference |
|---|---|---|
| Goal | Predict churn | Understand factors leading to churn |
| Method | model.predict() | Analyze model.coef_ and p-values |
| Application | Deployment | Business strategy and reporting |
You can use statsmodels to do inference:
import statsmodels.api as sm

X_const = sm.add_constant(X_train)        # statsmodels does not add an intercept automatically
logit_model = sm.Logit(y_train, X_const)  # argument order is (endog=y, exog=X)
result = logit_model.fit()
print(result.summary())                   # coefficients, standard errors, p-values, CIs
This gives access to p-values, confidence intervals, and log-likelihoods for hypothesis testing.
Conclusion
Logistic Regression is a foundational classification model with rich interpretability and practical relevance in finance and business:
- Works well for binary outcomes (churn, default, fraud)
- Produces probabilities, not just labels
- Highly interpretable and efficient
- Flexible with regularization (L1, L2)
Multinomial Logistic Regression
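When the target has more than two classes, logistic regression generalizes to the multinomial (softmax) form, fitting one coefficient vector per class. A minimal sketch on scikit-learn's built-in three-class wine dataset:

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize features so the lbfgs solver converges quickly
scaler = StandardScaler().fit(X_train)
clf = LogisticRegression(multi_class='multinomial', solver='lbfgs', max_iter=1000)
clf.fit(scaler.transform(X_train), y_train)

print("Coefficient matrix shape (classes x features):", clf.coef_.shape)
print("Test accuracy:", clf.score(scaler.transform(X_test), y_test))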
Comprehensive Study Notes: Logistic Regression Hyperparameter Tuning
Table of Contents
- Overview and Learning Objectives
- Step 1: Dataset Generation and Exploration
- Step 2: Data Preprocessing
- Step 3: Baseline Model Implementation
- Step 4: Parameter Understanding
- Step 5: Grid Search Hyperparameter Tuning
- Step 6: Randomized Search Strategy
- Step 7: Model Comparison and Evaluation
- Step 8: Best Model Analysis
- Step 9: Cross-Validation Analysis
- Step 10: Practical Recommendations
- Key Takeaways and Summary
Overview and Learning Objectives
What This Tutorial Teaches:
- Complete workflow for logistic regression implementation
- Systematic approach to hyperparameter optimization
- Understanding of each parameter's impact on model performance
- Comparison between different tuning strategies
- Real-world best practices for machine learning projects
Prerequisites:
- Basic Python programming knowledge
- Understanding of classification problems
- Familiarity with scikit-learn library
- Basic statistics and machine learning concepts
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification, load_breast_cancer, load_wine
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, roc_auc_score, roc_curve
from sklearn.pipeline import Pipeline
import warnings
warnings.filterwarnings('ignore')