sns.scatterplot()
Scatter plot to show relationships between two numerical variables.
The sns.scatterplot() function in Seaborn draws a scatter plot in which each point represents one observation. It is a basic tool for visualizing the relationship between two numerical variables.
sns.scatterplot(
    data=None,
    *,
    x=None,
    y=None,
    hue=None,
    style=None,
    size=None,
    palette=None,
    hue_order=None,
    size_order=None,
    sizes=None,
    legend="auto",
    ax=None,
    **kwargs,
)
Key Parameters
Primary Parameters
- data: The dataset (pandas DataFrame or array-like) to visualize.
- x: Variable plotted on the x-axis.
- y: Variable plotted on the y-axis.
Aesthetic Mappings
- hue: Differentiates points by color based on a categorical or numerical variable.
- style: Differentiates points by marker style based on a categorical variable.
- size: Differentiates points by size based on a numerical or categorical variable.
- palette: Specifies the color palette for the hue variable.
Order and Sizes
- hue_order: Specifies the order of categories for the hue variable.
- size_order: Specifies the order of categories for the size variable.
- sizes: Specifies a range for marker sizes (e.g., (min_size, max_size)).
Legend
- legend: Controls the display of the legend ("auto", "brief", "full", or False).
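As a rough illustration of how the legend options differ, the sketch below plots the same data three times. The DataFrame and its column names (x, y, score) are made up for this example. With a numeric hue, "brief" shows an evenly spaced sample of values, "full" gives every unique value its own entry, and False suppresses the legend.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical data: 30 points, each with a distinct numeric "score"
df = pd.DataFrame({
    "x": np.arange(30),
    "y": np.sqrt(np.arange(30)),
    "score": np.arange(30),
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, option in zip(axes, ["brief", "full", False]):
    sns.scatterplot(data=df, x="x", y="y", hue="score", legend=option, ax=ax)
    ax.set_title(f"legend={option!r}")

# Count legend entries to compare the two verbosity levels
brief_entries = len(axes[0].get_legend().get_texts())
full_entries = len(axes[1].get_legend().get_texts())
print(brief_entries, full_entries)
# In a script, call plt.show() here to display the figure.
```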
Axes
ax: Matplotlib Axes object to draw the plot on.
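A minimal sketch of the ax parameter, assuming a small made-up DataFrame (the height, weight, and age columns are hypothetical). Passing a different Axes to each call places two scatter plots side by side in one figure. sns.scatterplot() returns the Axes it drew on.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical measurements, for illustration only
df = pd.DataFrame({
    "height": [150, 160, 170, 180, 190],
    "weight": [50, 58, 69, 80, 92],
    "age": [23, 35, 41, 29, 52],
})

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Each call draws on the Axes it is given and returns that same Axes
returned_ax = sns.scatterplot(data=df, x="height", y="weight", ax=ax_left)
sns.scatterplot(data=df, x="age", y="weight", ax=ax_right)

ax_left.set_title("Weight vs. height")
ax_right.set_title("Weight vs. age")
fig.tight_layout()
# In a script, call plt.show() here to display the figure.
```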
Additional Styling (**kwargs)
- Extra keyword arguments are passed through to Matplotlib's Axes.scatter(), for example alpha and linewidth.
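The pass-through of extra keyword arguments might look like the following sketch. The data is made up, and the specific values for alpha, s, edgecolor, and linewidth are arbitrary choices for illustration.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical data with deliberately overlapping points
df = pd.DataFrame({"x": [1, 2, 2, 3, 3, 3], "y": [1, 2, 2, 3, 3, 3]})

fig, ax = plt.subplots()
sns.scatterplot(
    data=df, x="x", y="y",
    alpha=0.4,          # transparency, so stacked points look darker
    s=120,              # marker area in points^2 (Matplotlib scatter argument)
    edgecolor="black",  # marker outline color
    linewidth=1.5,      # marker outline width
    ax=ax,
)

# The drawn markers are a Matplotlib PathCollection on the Axes
points = ax.collections[0]
print(points.get_alpha(), points.get_sizes()[0])
# In a script, call plt.show() here to display the figure.
```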
Dataset Preparation
First, let's load and inspect the Titanic dataset.
import seaborn as sns
import matplotlib.pyplot as plt
# Load Titanic dataset
titanic = sns.load_dataset('titanic')
# Display first few rows of the dataset
titanic.head()
|   | survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
| 0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.2500 | S | Third | man | True | NaN | Southampton | no | False |
| 1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.2833 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.9250 | S | Third | woman | False | NaN | Southampton | yes | True |
| 3 | 1 | 1 | female | 35.0 | 1 | 0 | 53.1000 | S | First | woman | False | C | Southampton | yes | False |
| 4 | 0 | 3 | male | 35.0 | 0 | 0 | 8.0500 | S | Third | man | True | NaN | Southampton | no | True |
1. Basic Scatter Plot
sns.scatterplot(data=df, x='Variable1', y='Variable2')
plt.title("Basic Scatter Plot")
plt.show()
Use Case: Visualizing the relationship between two continuous variables.
sns.scatterplot(data=titanic, x='age', y='fare')
plt.title("Basic Scatter Plot")
plt.show()
2. Scatter Plot with hue
sns.scatterplot(data=df, x='Variable1', y='Variable2', hue='Category')
plt.title("Scatter Plot with Hue")
plt.show()
Key Parameter: hue differentiates points by color.
Use the hue parameter to color points based on survival status (survived).
sns.scatterplot(data=titanic, x='age', y='fare', hue='survived')
plt.title("Scatter Plot with Hue")
plt.legend(title="Survived")
plt.show()
3. Scatter Plot with style
sns.scatterplot(data=df, x='Variable1', y='Variable2', hue='Category', style='SubCategory')
plt.title("Scatter Plot with Hue and Style")
plt.show()
Key Parameter: style adds marker shapes based on a categorical variable.
Differentiate points by the class of the ticket (class).
sns.scatterplot(data=titanic, x='age', y='fare', hue='survived', style='class')
plt.title("Scatter Plot with Hue and Style")
plt.legend(title="Survived/Class")
plt.show()
4. Scatter Plot with size
sns.scatterplot(data=df, x='Variable1', y='Variable2', size='NumericCategory', sizes=(20, 200))
plt.title("Scatter Plot with Variable Sizes")
plt.show()
Key Parameter: size maps a variable to the size of points.
Map the size of points to the number of siblings/spouses aboard (sibsp).
sns.scatterplot(data=titanic, x='age', y='fare', size='sibsp', sizes=(20, 200))
plt.title("Scatter Plot with Variable Sizes")
plt.legend(title="Number of Siblings/Spouses")
plt.show()
5. Using palette for Custom Colors
sns.scatterplot(data=df, x='Variable1', y='Variable2', hue='Category', palette='coolwarm')
plt.title("Scatter Plot with Custom Palette")
plt.show()
- Key Parameter: palette customizes the colors for the hue variable.
- palette has no effect unless hue is also set.
- Apply a custom palette to the survival status (hue).
sns.scatterplot(data=titanic, x='age', y='fare', hue='survived', palette='dark')
plt.title("Scatter Plot with Custom Palette")
plt.legend(title="Survived")
plt.show()
6. Ordering Categories with hue_order
sns.scatterplot(data=df, x='Variable1', y='Variable2', hue='Category', hue_order=['C', 'B', 'A'])
plt.title("Scatter Plot with Ordered Hue")
plt.show()
Key Parameter: hue_order sets the order of categories for color mapping.
Specify the order of the survival categories.
sns.scatterplot(data=titanic, x='age', y='fare', hue='survived', hue_order=[1, 0], palette='dark')
plt.title("Scatter Plot with Ordered Hue")
plt.legend(title="Survived")
plt.show()
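palette also accepts a dictionary that maps each hue level to a color, which pairs naturally with hue_order. The sketch below is one way to do this. To stay self-contained it rebuilds the five rows shown by titanic.head() instead of loading the dataset, and the two colors are arbitrary choices.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# The five rows displayed by titanic.head(), restricted to three columns
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "survived": [0, 1, 1, 1, 0],
})

# Explicit level -> color mapping (colors chosen for illustration)
colors = {0: "tab:red", 1: "tab:green"}

fig, ax = plt.subplots()
sns.scatterplot(
    data=df, x="age", y="fare",
    hue="survived",
    hue_order=[1, 0],  # list survivors first in the legend
    palette=colors,
    ax=ax,
)

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
# In a script, call plt.show() here to display the figure.
```

A dictionary keeps each category's color fixed even when hue_order changes or a category is missing from a subset of the data.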
7. Combining hue, style, and size
sns.scatterplot(
    data=df,
    x='Variable1',
    y='Variable2',
    hue='Category',
    style='SubCategory',
    size='NumericCategory',
    sizes=(50, 300),
    palette='viridis'
)
plt.title("Comprehensive Scatter Plot")
plt.show()
Use Case: Highly customized visualization combining multiple aesthetics.
Use multiple parameters to encode data:
- hue: Survival status.
- style: Ticket class.
- size: Number of siblings/spouses aboard.
sns.scatterplot(
    data=titanic, 
    x='age', 
    y='fare', 
    hue='survived', 
    style='class', 
    size='sibsp', 
    sizes=(30, 300), 
    palette='dark'
)
plt.title("Comprehensive Scatter Plot")
plt.legend(title="Survived/Class/SibSp")
plt.show()
8. Scatter Plot with Transparency
sns.scatterplot(data=df, x='Variable1', y='Variable2', alpha=0.6)
plt.title("Scatter Plot with Transparency")
plt.show()
Key Styling: alpha adjusts point transparency for overlapping points.
Add transparency to reduce overlap when points cluster.
sns.scatterplot(data=titanic, x='age', y='fare', alpha=0.3, palette='dark', hue='sex', size='parch', sizes=(20, 200))
plt.title("Scatter Plot with Transparency")
plt.show()
9. Scatter Plot with Custom Marker Sizes (sizes)
Control marker size range manually for better readability.
sns.scatterplot(data=titanic, x='age', y='fare', size='sibsp', sizes=(50, 800))
plt.title("Scatter Plot with Custom Marker Sizes")
plt.legend(title="SibSp")
plt.show()
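When size is mapped to a categorical variable, sizes can also be a dictionary of explicit marker sizes, with size_order controlling the order of the legend entries. A minimal sketch, again rebuilding the rows shown by titanic.head(); the marker sizes 200 and 50 are arbitrary.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# The five rows displayed by titanic.head(), restricted to three columns
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "class": ["Third", "First", "Third", "First", "Third"],
})

fig, ax = plt.subplots()
sns.scatterplot(
    data=df, x="age", y="fare",
    size="class",
    size_order=["First", "Third"],       # legend order
    sizes={"First": 200, "Third": 50},   # explicit size per category
    ax=ax,
)

marker_sizes = list(ax.collections[0].get_sizes())
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(marker_sizes, legend_labels)
# In a script, call plt.show() here to display the figure.
```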
10. Scatter Plot with a Specific Axis
Use ax to draw the scatter plot on a predefined Matplotlib axes object.
fig, ax = plt.subplots(figsize=(20, 6))
sns.scatterplot(
    data=titanic, 
    x='age', 
    y='fare', 
    hue='survived', 
    style='class', 
    size='sibsp', 
    sizes=(30, 300), 
    palette='dark',
    ax=ax  # draw on the Axes created above
)
ax.set_title("Comprehensive Scatter Plot")
ax.legend(title="Survived/Class/SibSp")
plt.show()
Use Cases for Data Scientists
- Exploratory Data Analysis (EDA):
  - Understand relationships between variables.
  - Spot trends, clusters, or outliers.
- Multivariate Analysis:
  - Combine hue, size, and style to analyze more dimensions in the data.
- Feature Engineering:
  - Identify variable relationships for feature interactions.
- Model Validation:
  - Plot actual vs. predicted values for regression models.
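The model validation use case could be sketched as follows. No real model is fitted here: the "predicted" values are simulated by adding random noise to the actual values, and all names (results, actual, predicted) are assumptions made for this example. The dashed diagonal marks where a perfect prediction would fall.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Simulated regression output: actual values plus noise stands in for a model
rng = np.random.default_rng(0)
actual = rng.uniform(0, 100, size=50)
predicted = actual + rng.normal(0, 8, size=50)
results = pd.DataFrame({"actual": actual, "predicted": predicted})

fig, ax = plt.subplots(figsize=(5, 5))
sns.scatterplot(data=results, x="actual", y="predicted", alpha=0.7, ax=ax)

# Reference line y = x: points on it are predicted exactly
low = results.min().min()
high = results.max().max()
ax.plot([low, high], [low, high], linestyle="--", color="gray",
        label="perfect prediction")
ax.legend()
ax.set_title("Actual vs. Predicted")
# In a script, call plt.show() here to display the figure.
```

Points far from the diagonal are the observations the model predicts worst, which makes systematic over- or under-prediction easy to spot.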
 
Practical Notes
- Customizing Legends:
  - Use legend="brief" for simplified legends or legend=False to remove them.
- Dealing with Overlapping Points:
  - Adjust alpha for transparency.
  - Use style to add marker variety.
- Scaling Sizes:
  - Use the sizes parameter to control the range of marker sizes for better readability.
- Avoid Clutter:
  - For large datasets, subset the data or use sns.relplot(), the figure-level counterpart of sns.scatterplot(), for faceting.
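Faceting with sns.relplot() might look like the sketch below, which splits the points into one panel per value of sex. It rebuilds the rows shown by titanic.head() to stay self-contained. Because relplot() is a figure-level function, it creates its own figure and returns a FacetGrid instead of accepting an ax argument.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# The five rows displayed by titanic.head(), restricted to four columns
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "sex": ["male", "female", "female", "female", "male"],
    "survived": [0, 1, 1, 1, 0],
})

# One scatter panel per value of "sex", colored by survival status
grid = sns.relplot(
    data=df, x="age", y="fare",
    hue="survived",
    col="sex",
    kind="scatter",
    height=4,
)
grid.set_titles("{col_name}")

panel_titles = [ax.get_title() for ax in grid.axes.flat]
print(panel_titles)
# In a script, call plt.show() here to display the figure.
```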
References in Your Machine Learning Guide
Use this function during:
- EDA: To explore relationships and correlations between features.
- Visualization: To present findings with clear, concise scatter plots.
- Model Validation: To analyze model predictions versus actual outcomes.