histplot

sns.histplot()

Displays a histogram of a numerical variable.

sns.histplot() is a powerful function in Seaborn for visualizing the distribution of a numerical variable. It supports histograms, KDE (Kernel Density Estimation), cumulative distributions, and more.

General Syntax

python
sns.histplot(
    data=None,
    x=None,
    y=None,
    hue=None,
    weights=None,
    stat="count",
    bins="auto",
    binwidth=None,
    binrange=None,
    discrete=None,
    cumulative=False,
    common_bins=True,
    common_norm=True,
    kde=False,
    kde_kws=None,
    log_scale=None,
    element="bars",
    fill=True,
    multiple="layer",
    shrink=1,
    alpha=None,      # not a named parameter; forwarded to Matplotlib via **kwargs
    palette=None,
    linewidth=None,  # not a named parameter; forwarded to Matplotlib via **kwargs
    line_kws=None,
    legend=True
)

Key Parameters

Parameter     Description
data          DataFrame or array-like dataset.
x, y          Numerical variable(s) to plot.
hue           Color-coding by a categorical variable.
bins          Number of bins (default: "auto").
binwidth      Width of each bin.
binrange      Tuple specifying range of bins (min, max).
cumulative    If True, plots a cumulative histogram.
kde           If True, overlays a KDE curve.
element       "bars", "step", or "poly" for different histogram styles.
multiple      "layer", "dodge", "stack", or "fill" for handling multiple histograms.
log_scale     If True, applies log scaling to axes.
fill          If True, fills bars with color.
alpha         Adjust transparency (0 = transparent, 1 = opaque).

Dataset Setup

We will use the penguins dataset from Seaborn for all examples.

python
import seaborn as sns
import matplotlib.pyplot as plt

# Load dataset
data = sns.load_dataset('penguins')
data.head()

The penguins dataset includes:

  • species: Penguin species.
  • island: Location of observation.
  • bill_length_mm: Length of the bill.
  • bill_depth_mm: Depth of the bill.
  • flipper_length_mm: Flipper length.
  • body_mass_g: Body mass.
  • sex: Sex of the penguin.

species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex
Adelie   Torgersen  39.1            18.7           181                3750         Male
Adelie   Torgersen  39.5            17.4           186                3800         Female
Adelie   Torgersen  40.3            18.0           195                3250         Female

1. Basic Histogram

Plot distribution of bill_length_mm

python
sns.histplot(data=data, x="bill_length_mm")
plt.title("Histogram of Bill Length")
plt.show()

Explanation

  • This creates a histogram whose bin count is chosen automatically (bins="auto").
  • The x-axis represents bill length, and the y-axis represents the number of observations in each bin (stat="count").

2. Adjusting Number of Bins

Specify bins=30 for finer granularity

python
sns.histplot(data=data, x="bill_length_mm", bins=30)
plt.title("Histogram of Bill Length with 30 Bins")
plt.show()

Explanation

  • Increasing the number of bins provides a more detailed view of the distribution.

3. Overlay KDE Curve

Use kde=True to add a KDE plot

python
sns.histplot(data=data, x="bill_length_mm", kde=True)
plt.title("Histogram of Bill Length with KDE")
plt.show()

Explanation

  • KDE (Kernel Density Estimation) provides a smooth approximation of the data distribution.

4. Histogram with hue (Grouping by Category)

Differentiate by species

python
sns.histplot(data=data, x="bill_length_mm", hue="species", kde=True)
plt.title("Histogram of Bill Length by Species")
plt.show()

Explanation

  • Uses different colors to separate species.
  • Overlays a separate KDE curve for each species.

5. multiple Parameter

Using multiple="stack"

python
sns.histplot(data=data, x="bill_length_mm", hue="species", multiple="stack")
plt.title("Stacked Histogram of Bill Length by Species")
plt.show()

Explanation

  • Stacks bars to show group contributions to the total.

Using multiple="dodge"

python
sns.histplot(data=data, x="bill_length_mm", hue="species", multiple="dodge")
plt.title("Dodged Histogram of Bill Length by Species")
plt.show()

Explanation

  • Places histograms side by side for easier comparison.

6. cumulative=True

Plot Cumulative Distribution

python
sns.histplot(data=data, x="bill_length_mm", cumulative=True)
plt.title("Cumulative Histogram of Bill Length")
plt.show()

Explanation

  • Each bar shows the running count of observations up to and including that bin. Add stat="probability" to show the cumulative proportion instead.

7. element="step" for Step Histograms

python
sns.histplot(data=data, x="bill_length_mm", element="step", kde=True)
plt.title("Step Histogram of Bill Length")
plt.show()

Explanation

  • Uses step lines instead of bars for a cleaner look.

8. log_scale=True for Logarithmic Scale

python
sns.histplot(data=data, x="body_mass_g", log_scale=True)
plt.title("Log-Scaled Histogram of Body Mass")
plt.show()

Explanation

  • Bins are computed on the log-transformed values, so they are equally wide in log space.
  • Useful for right-skewed data with large values.

9. Customizing Color Palettes

Using palette="coolwarm"

python
sns.histplot(data=data, x="bill_length_mm", hue="species", palette="coolwarm")
plt.title("Histogram with Coolwarm Palette")
plt.show()

Explanation

  • Assigns each species a color drawn from the coolwarm colormap.

10. Custom Transparency (alpha)

Set alpha=0.5 to adjust transparency

python
sns.histplot(data=data, x="bill_length_mm", hue="species", multiple="stack", alpha=0.5)
plt.title("Histogram with Transparency")
plt.show()

Explanation

  • Makes bars semi-transparent for better visibility.

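11. Adjusting Plot Size (figsize)

Increase Figure Size

python
# histplot() is axes-level and has no height or aspect parameters,
# so set the figure size with Matplotlib before plotting
plt.figure(figsize=(9, 6))
sns.histplot(data=data, x="bill_length_mm", hue="species")
plt.title("Larger Histogram of Bill Length")
plt.show()

Explanation

  • figsize=(9, 6): Sets the figure to 9 inches wide by 6 inches tall.
  • height and aspect are parameters of figure-level functions such as sns.displot(); passing them to sns.histplot() raises an error.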

12. Removing the Legend (legend=False)

python
sns.histplot(data=data, x="bill_length_mm", hue="species", legend=False)
plt.title("Histogram without Legend")
plt.show()

Explanation

  • Hides the legend for a cleaner look.

Final Example: Combining Parameters

Customizing Multiple Elements

python
sns.histplot(
    data=data,
    x="bill_length_mm",
    hue="species",
    kde=True,
    bins=25,
    multiple="stack",
    alpha=0.6,
    palette="viridis"
)
plt.title("Customized Histogram of Bill Length")
plt.show()

Explanation

  • bins=25: Controls bin count.
  • kde=True: Adds KDE.
  • multiple="stack": Stacks bars.
  • alpha=0.6: Adjusts transparency.
  • palette="viridis": Uses the Viridis color scheme.

Conclusion

sns.histplot() is a versatile function for visualizing distributions, supporting bin control, KDE, grouping, cumulative histograms, log scales, and aesthetic customizations.

By mastering sns.histplot(), you can explore numerical data distributions effectively! 🚀