sns.histplot()
Displays a histogram of a numerical variable.
sns.histplot() is a powerful function in Seaborn for visualizing the distribution of a numerical variable. It supports histograms, KDE (Kernel Density Estimation) overlays, cumulative distributions, and more.
General Syntax
python
# Abridged signature. alpha and linewidth are not named parameters;
# they are forwarded to matplotlib through **kwargs.
sns.histplot(
    data=None,
    x=None,
    y=None,
    hue=None,
    weights=None,
    stat="count",
    bins="auto",
    binwidth=None,
    binrange=None,
    discrete=None,
    cumulative=False,
    common_bins=True,
    common_norm=True,
    kde=False,
    kde_kws=None,
    log_scale=None,
    element="bars",
    fill=True,
    multiple="layer",
    shrink=1,
    palette=None,
    line_kws=None,
    legend=True,
    **kwargs
)
Key Parameters
Parameter | Description |
data | DataFrame, array, mapping, or sequence holding the input data. |
x, y | Variables that specify positions on the x and y axes. |
hue | Variable mapped to color, usually categorical. |
bins | Number of bins, name of a binning rule, or explicit bin edges (default: "auto"). |
binwidth | Width of each bin; overrides bins but can be combined with binrange. |
binrange | Tuple (min, max) giving the lowest and highest bin edges. |
cumulative | If True, plots cumulative counts as the bins increase. |
kde | If True, overlays a KDE curve. |
element | "bars", "step", or "poly" for different histogram styles. |
multiple | "layer", "dodge", "stack", or "fill" for handling multiple histograms. |
log_scale | If True, sets the data axis to a log scale and bins the data in log space. |
fill | If True, fills the space under the histogram with color. |
alpha | Transparency (0 = transparent, 1 = opaque); passed through to matplotlib. |
Dataset Setup
We will use the penguins dataset from Seaborn for all examples.
python
import seaborn as sns
import matplotlib.pyplot as plt
# Load dataset
data = sns.load_dataset('penguins')
data.head()
The penguins dataset includes:
- species: Penguin species.
- island: Location of observation.
- bill_length_mm: Length of the bill.
- bill_depth_mm: Depth of the bill.
- flipper_length_mm: Flipper length.
- body_mass_g: Body mass.
- sex: Sex of the penguin.
species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | Male |
Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | Female |
Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | Female |
1. Basic Histogram
Plot distribution of bill_length_mm
python
sns.histplot(data=data, x="bill_length_mm")
plt.title("Histogram of Bill Length")
plt.show()
Explanation
- This creates a histogram whose bin count is chosen automatically (bins="auto").
- The x-axis represents bill length in millimetres, and the y-axis represents the number of observations in each bin.
2. Adjusting Number of Bins
Specify bins=30 for finer granularity
python
sns.histplot(data=data, x="bill_length_mm", bins=30)
plt.title("Histogram of Bill Length with 30 Bins")
plt.show()
Explanation
- Increasing the number of bins shows finer detail in the distribution, at the cost of noisier bar heights when each bin holds only a few observations.
3. Overlay KDE Curve
Use kde=True to add a KDE plot
python
sns.histplot(data=data, x="bill_length_mm", kde=True)
plt.title("Histogram of Bill Length with KDE")
plt.show()
Explanation
- KDE (Kernel Density Estimation) provides a smooth approximation of the data distribution.
4. Histogram with hue (Grouping by Category)
Differentiate by species
python
sns.histplot(data=data, x="bill_length_mm", hue="species", kde=True)
plt.title("Histogram of Bill Length by Species")
plt.show()
Explanation
- Uses a different color for each species.
- Overlays a separate KDE curve for each group.
5. multiple Parameter
Using multiple="stack"
python
sns.histplot(data=data, x="bill_length_mm", hue="species", multiple="stack")
plt.title("Stacked Histogram of Bill Length by Species")
plt.show()
Explanation
- Stacks bars to show group contributions to the total.
Using multiple="dodge"
python
sns.histplot(data=data, x="bill_length_mm", hue="species", multiple="dodge")
plt.title("Dodged Histogram of Bill Length by Species")
plt.show()
Explanation
- Places the bars for each species side by side within each bin for easier comparison.
6. cumulative=True
Plot Cumulative Distribution
python
sns.histplot(data=data, x="bill_length_mm", cumulative=True)
plt.title("Cumulative Histogram of Bill Length")
plt.show()
Explanation
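- Each bar shows the running count of observations in that bin and all bins below it, so the final bar equals the total number of observations. Showing proportions instead requires a normalizing stat such as stat="probability".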
7. element="step" for Step Histograms
python
sns.histplot(data=data, x="bill_length_mm", element="step", kde=True)
plt.title("Step Histogram of Bill Length")
plt.show()
Explanation
- Uses step lines instead of bars for a cleaner look.
8. log_scale=True for Logarithmic Scale
python
sns.histplot(data=data, x="body_mass_g", log_scale=True)
plt.title("Log-Scaled Histogram of Body Mass")
plt.show()
Explanation
- Useful for right-skewed data that spans a wide range of values; the bins are equally wide in log space.
9. Customizing Color Palettes
Using palette="coolwarm"
python
sns.histplot(data=data, x="bill_length_mm", hue="species", palette="coolwarm")
plt.title("Histogram with Coolwarm Palette")
plt.show()
Explanation
- Draws one color per species from the coolwarm colormap.
10. Custom Transparency (alpha)
Set alpha=0.5 to adjust transparency
python
sns.histplot(data=data, x="bill_length_mm", hue="species", multiple="stack", alpha=0.5)
plt.title("Histogram with Transparency")
plt.show()
Explanation
- Makes bars semi-transparent for better visibility.
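11. Adjusting Plot Size (figsize)
Increase Figure Size
python
plt.figure(figsize=(9, 6))  # width and height in inches
sns.histplot(data=data, x="bill_length_mm", hue="species")
plt.title("Larger Histogram of Bill Length")
plt.show()
Explanation
- histplot() is an axes-level function and does not accept height or aspect; those parameters belong to figure-level functions such as sns.displot(), and passing them to histplot() raises an error.
- figsize=(9, 6): Sets the figure to 9 inches wide and 6 inches tall, matching a height of 6 and a width-to-height ratio of 1.5.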
12. Removing the Legend (legend=False)
python
sns.histplot(data=data, x="bill_length_mm", hue="species", legend=False)
plt.title("Histogram without Legend")
plt.show()
Explanation
- Hides the legend for a cleaner look.
Final Example: Combining Parameters
Customizing Multiple Elements
python
sns.histplot(
    data=data,
    x="bill_length_mm",
    hue="species",
    kde=True,
    bins=25,
    multiple="stack",
    alpha=0.6,
    palette="viridis"
)
plt.title("Customized Histogram of Bill Length")
plt.show()
Explanation
- bins=25: Controls bin count.
- kde=True: Adds KDE.
- multiple="stack": Stacks bars.
- alpha=0.6: Adjusts transparency.
- palette="viridis": Uses the Viridis color scheme.
Conclusion
sns.histplot() is a versatile function for visualizing distributions, supporting bin control, KDE, grouping, cumulative histograms, log scales, and aesthetic customizations.
By mastering sns.histplot(), you can explore numerical data distributions effectively! 🚀