Seaborn Categorical Plots, Regression Plots, FacetGrid, PairGrid, and JointGrid
The first Seaborn guide introduced the main plotting families.
This guide goes deeper into the plots you use when your analysis has categories, statistical summaries, trend lines, and multi-panel grids.
You will work with original synthetic bootcamp datasets. The examples do not use Seaborn built-in datasets, copied course notebooks, proprietary spreadsheets, or public exercise data.
Files Used In This Guide
Use these CSV files:
- seaborn_bootcamp_sessions.csv
- seaborn_assessment_diagnostics.csv
Place them in the same folder as your notebook or script.
If you keep them in a data/ folder, change the paths:
bootcamp = pd.read_csv("data/seaborn_bootcamp_sessions.csv")
diagnostics = pd.read_csv("data/seaborn_assessment_diagnostics.csv")
What You Will Learn
By the end, you should be able to:
- choose the right categorical plot for raw points, spread, or summary
- use stripplot and swarmplot for individual observations
- use boxplot and violinplot for distribution comparison
- use barplot, pointplot, and countplot for estimates and counts
- convert axes-level categorical plots into catplot facets
- build regression charts with regplot, lmplot, and residplot
- create custom small multiples with FacetGrid
- compare pairplot and PairGrid
- compare jointplot and JointGrid
- decide when grids help and when they create visual noise
1. Setup
Install the required packages:
pip install pandas matplotlib seaborn
Import the libraries:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Set a readable default theme:
sns.set_theme(style="whitegrid", context="notebook")
Load the CSV files:
bootcamp = pd.read_csv("seaborn_bootcamp_sessions.csv")
diagnostics = pd.read_csv("seaborn_assessment_diagnostics.csv")
print(bootcamp.head())
print(diagnostics.head())
The bootcamp table has one row per learning session. The diagnostics table has one row per learner assessment.
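Every later example refers to columns by name, so a quick check after loading can catch a missing or renamed column early. The sketch below is one possible way to do that. The helper name missing_columns and the three-row stand-in frame are hypothetical, and the required list only names columns that this guide's examples reference.

```python
import pandas as pd


def missing_columns(df: pd.DataFrame, required: list[str]) -> list[str]:
    """Return the required column names that are absent from df."""
    return [name for name in required if name not in df.columns]


# Hypothetical stand-in rows; in practice pass the DataFrame loaded from the CSV.
sample = pd.DataFrame(
    {
        "track": ["Python", "ML", "Analytics"],
        "study_hours": [6.5, 9.0, 4.0],
        "project_score": [78, 85, 69],
    }
)

required = ["track", "study_hours", "project_score", "quiz_score"]
absent = missing_columns(sample, required)
print(absent)  # quiz_score is deliberately left out of the stand-in frame
```

An empty list means every referenced column is present.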
2. Categorical Plot Families
Seaborn categorical plots answer different questions.
| Plot Type | Function | Best Question |
|---|---|---|
| Categorical scatter | stripplot, swarmplot | What do individual points look like by group? |
| Distribution by category | boxplot, violinplot | How does spread differ by group? |
| Estimate by category | barplot, pointplot | What is the average or summary value by group? |
| Count by category | countplot | How many rows are in each group? |
| Figure-level categorical | catplot | How can I facet categorical plots? |
Start with the question, not the chart name.
3. Strip Plot
A strip plot shows individual observations grouped by category.
plt.figure(figsize=(9, 5))
sns.stripplot(
data=bootcamp,
x="track",
y="project_score",
jitter=True,
alpha=0.8,
)
plt.title("Project Scores By Track")
plt.xlabel("Track")
plt.ylabel("Project Score")
plt.show()
jitter=True spreads points horizontally so overlapping observations are easier to see.
Use a strip plot when the dataset is small enough that individual points matter.
4. Strip Plot With Hue
Add hue to compare another category.
plt.figure(figsize=(10, 5))
sns.stripplot(
data=bootcamp,
x="track",
y="project_score",
hue="completed",
jitter=True,
dodge=True,
alpha=0.75,
)
plt.title("Project Scores By Track And Completion Status")
plt.xlabel("Track")
plt.ylabel("Project Score")
plt.legend(title="Completed")
plt.show()
dodge=True separates the hue groups inside each category.
5. Swarm Plot
A swarm plot also shows individual points, but it arranges them to reduce overlap.
plt.figure(figsize=(10, 5))
sns.swarmplot(
data=bootcamp,
x="level",
y="quiz_score",
hue="delivery_mode",
)
plt.title("Quiz Scores By Level And Delivery Mode")
plt.xlabel("Level")
plt.ylabel("Quiz Score")
plt.show()
Swarm plots are useful for small datasets. They can become crowded or slow with many rows.
6. Catplot For Categorical Scatter
catplot is the figure-level categorical function.
g = sns.catplot(
data=bootcamp,
x="track",
y="project_score",
kind="strip",
hue="completed",
col="cohort",
col_wrap=2,
jitter=True,
height=3.5,
)
g.set_axis_labels("Track", "Project Score")
g.set_titles("Cohort: {col_name}")
g.fig.suptitle("Project Scores Across Cohorts", y=1.04)
plt.show()
Use catplot when the same categorical chart should be repeated across cohorts, levels, regions, or other grouping columns.
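The practical difference between an axes-level function and a figure-level one shows up in what each call returns. The sketch below compares stripplot and catplot on a small frame. The rows and cohort labels are hypothetical stand-ins, not the guide's CSV, and the Agg backend line is only there so the sketch also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in rows, not the guide's CSV.
demo = pd.DataFrame(
    {
        "track": ["Python", "Python", "ML", "ML", "Analytics", "Analytics"],
        "cohort": ["A", "B", "A", "B", "A", "B"],
        "project_score": [72, 81, 88, 79, 65, 70],
    }
)

# Axes-level: draws onto the Axes you pass in and returns that same Axes.
fig, ax = plt.subplots()
returned_ax = sns.stripplot(data=demo, x="track", y="project_score", ax=ax)

# Figure-level: creates its own figure and returns a FacetGrid.
grid = sns.catplot(
    data=demo, x="track", y="project_score", kind="strip", col="cohort"
)

print(type(returned_ax).__name__)
print(type(grid).__name__)
print(grid.axes.shape)  # one row, one column per cohort value

plt.close("all")
```

Because catplot owns its figure, sizing goes through height and aspect rather than plt.figure(figsize=...).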
7. Box Plot
A box plot summarizes the distribution of a numeric value in each category.
plt.figure(figsize=(9, 5))
sns.boxplot(
data=bootcamp,
x="track",
y="study_hours",
)
plt.title("Study Hours By Track")
plt.xlabel("Track")
plt.ylabel("Study Hours")
plt.show()
A box plot is good when you care about:
- median
- spread
- skew
- possible outliers
- group-to-group comparison
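The quantities in this list can also be computed directly, which helps when you want to check what a box is showing. The sketch below uses hypothetical study-hour values and assumes the default whisker rule of 1.5 times the interquartile range.

```python
import pandas as pd

# Hypothetical study hours for one track; the last value is deliberately extreme.
hours = pd.Series([4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 8.0, 20.0])

q1 = hours.quantile(0.25)
median = hours.quantile(0.50)
q3 = hours.quantile(0.75)
iqr = q3 - q1  # height of the box

# Points outside these fences are the ones a box plot marks as possible outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = hours[(hours < lower_fence) | (hours > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(outliers.tolist())
```

A median that sits closer to one edge of the box than the other is a hint of skew.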
A box plot hides individual points, so consider overlaying a strip plot if the dataset is small.
plt.figure(figsize=(9, 5))
sns.boxplot(
data=bootcamp,
x="track",
y="study_hours",
color="lightgray",
)
sns.stripplot(
data=bootcamp,
x="track",
y="study_hours",
color="black",
alpha=0.55,
jitter=True,
)
plt.title("Study Hours By Track With Individual Points")
plt.xlabel("Track")
plt.ylabel("Study Hours")
plt.show()
8. Box Plot With Hue
Use hue when the distribution needs a second grouping.
plt.figure(figsize=(10, 5))
sns.boxplot(
data=bootcamp,
x="track",
y="quiz_score",
hue="delivery_mode",
)
plt.title("Quiz Score Spread By Track And Delivery Mode")
plt.xlabel("Track")
plt.ylabel("Quiz Score")
plt.show()
Keep hue categories limited. Too many groups make box plots hard to read.
9. Violin Plot
A violin plot shows distribution shape using a density estimate.
plt.figure(figsize=(10, 5))
sns.violinplot(
data=bootcamp,
x="track",
y="project_score",
hue="completed",
split=True,
inner="quart",
)
plt.title("Project Score Distribution By Track")
plt.xlabel("Track")
plt.ylabel("Project Score")
plt.show()
split=True works when hue has two groups. It places both distributions inside one violin for each x-category.
Use violin plots when distribution shape matters. Use box plots when the audience needs a simpler summary.
10. Bar Plot
A bar plot estimates a summary value for each category.
plt.figure(figsize=(9, 5))
sns.barplot(
data=bootcamp,
x="track",
y="satisfaction",
hue="level",
errorbar=None,
)
plt.title("Average Satisfaction By Track And Level")
plt.xlabel("Track")
plt.ylabel("Average Satisfaction")
plt.show()
By default, Seaborn uses the mean.
Use errorbar=None only when you intentionally want a clean teaching chart. In real reporting, error bars can communicate uncertainty.
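To see what a bar and its error bar would summarize, it can help to compute the numbers with pandas first. This is a minimal sketch on hypothetical rows. It reports the per-track mean, standard deviation, and row count, and it is not a reproduction of Seaborn's default bootstrapped confidence interval.

```python
import pandas as pd

# Hypothetical stand-in rows, not the guide's CSV.
demo = pd.DataFrame(
    {
        "track": ["Python", "Python", "Python", "ML", "ML", "ML"],
        "satisfaction": [4.0, 5.0, 3.0, 2.0, 4.0, 3.0],
    }
)

# One row per track: the bar height (mean) plus a simple spread measure.
summary = (
    demo.groupby("track")["satisfaction"]
    .agg(mean="mean", sd="std", n="count")
    .reset_index()
)
print(summary)
```

If a standard-deviation interval is what you want on the chart, errorbar="sd" asks barplot for that instead of the default confidence interval.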
11. Custom Estimator In A Bar Plot
You can summarize with a function other than the mean.
import numpy as np
plt.figure(figsize=(9, 5))
sns.barplot(
data=bootcamp,
x="track",
y="support_tickets",
hue="delivery_mode",
estimator=np.median,
errorbar=None,
)
plt.title("Median Support Tickets By Track")
plt.xlabel("Track")
plt.ylabel("Median Support Tickets")
plt.show()
Median is often better than mean when the metric has outliers.
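A tiny numeric example makes the point concrete. The ticket counts below are hypothetical, with one deliberately extreme session.

```python
import pandas as pd

# Hypothetical support ticket counts; 40 is a single extreme session.
tickets = pd.Series([1, 2, 2, 3, 40])

mean_tickets = tickets.mean()      # pulled upward by the extreme value
median_tickets = tickets.median()  # stays at the typical session

print(mean_tickets, median_tickets)
```

Here the mean suggests almost ten tickets per session even though four of the five sessions had three or fewer.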
12. Point Plot
A point plot compares estimates while emphasizing change across categories.
sns.pointplot(
data=bootcamp,
x="cohort",
y="attendance_pct",
hue="track",
errorbar=None,
markers="o",
linestyles="-",
)
plt.title("Average Attendance Across Cohorts")
plt.xlabel("Cohort")
plt.ylabel("Average Attendance (%)")
plt.show()
Point plots are useful when the x-axis has a meaningful order, such as cohorts, months, or levels.
Do not use connecting lines when the x-axis categories are unordered.
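One way to make the order explicit is to store the column as an ordered pandas categorical, since Seaborn's categorical functions follow the category order (passing order= to the plot call is the other option). The cohort labels below are hypothetical placeholders, not values from the guide's CSV.

```python
import pandas as pd

# Hypothetical stand-in rows with cohorts listed out of order.
demo = pd.DataFrame(
    {
        "cohort": ["2024-Q3", "2024-Q1", "2024-Q2", "2024-Q1"],
        "attendance_pct": [88.0, 92.0, 79.0, 85.0],
    }
)

# Declare the intended sequence once, on the column itself.
cohort_order = ["2024-Q1", "2024-Q2", "2024-Q3"]
demo["cohort"] = pd.Categorical(
    demo["cohort"], categories=cohort_order, ordered=True
)

# Group summaries now come back in the declared order.
means = demo.groupby("cohort", observed=True)["attendance_pct"].mean()
print(means)
```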
13. Count Plot
A count plot counts rows in each category.
sns.countplot(
data=bootcamp,
x="track",
hue="completed",
)
plt.title("Session Count By Track And Completion")
plt.xlabel("Track")
plt.ylabel("Number Of Sessions")
plt.show()
Use countplot when you need frequency, not a numeric summary.
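The same frequencies can be tabulated with pandas when you need the numbers behind the bars. The rows and the yes/no labels in this sketch are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in rows, not the guide's CSV.
demo = pd.DataFrame(
    {
        "track": ["Python", "Python", "ML", "ML", "ML", "Analytics"],
        "completed": ["yes", "no", "yes", "yes", "no", "yes"],
    }
)

# Rows are tracks, columns are completion values, cells are row counts.
counts = pd.crosstab(demo["track"], demo["completed"])
print(counts)
```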
14. Catplot With Box Facets
catplot can create faceted box plots.
g = sns.catplot(
data=bootcamp,
x="level",
y="project_score",
hue="completed",
col="track",
col_wrap=3,
kind="box",
height=3.5,
)
g.set_axis_labels("Level", "Project Score")
g.set_titles("{col_name}")
g.fig.suptitle("Project Score Spread By Track And Level", y=1.04)
plt.show()
This is easier to read than putting every track, level, and completion group into one overloaded chart.
15. Regression With Regplot
regplot draws a scatter plot plus a fitted trend line.
sns.regplot(
data=bootcamp,
x="study_hours",
y="project_score",
scatter_kws={"alpha": 0.75},
line_kws={"color": "red"},
)
plt.title("Study Hours vs Project Score")
plt.xlabel("Study Hours")
plt.ylabel("Project Score")
plt.show()
Use regplot for a quick single-panel trend check.
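regplot returns the matplotlib Axes rather than the fitted coefficients, so if you need to report a slope you have to compute it separately. A minimal sketch with NumPy, assuming an ordinary least-squares straight line and hypothetical values:

```python
import numpy as np

# Hypothetical stand-in values, not the guide's CSV.
study_hours = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
project_score = np.array([55.0, 61.0, 70.0, 74.0, 85.0])

# Degree-1 least-squares fit: score is approximately slope * hours + intercept.
slope, intercept = np.polyfit(study_hours, project_score, deg=1)

# Pearson correlation as a rough measure of how linear the relationship is.
r = np.corrcoef(study_hours, project_score)[0, 1]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```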
16. Regression With Lmplot And Hue
lmplot is the figure-level regression function.
g = sns.lmplot(
data=bootcamp,
x="study_hours",
y="project_score",
hue="completed",
height=5,
aspect=1.2,
scatter_kws={"alpha": 0.75},
)
g.set_axis_labels("Study Hours", "Project Score")
g.fig.suptitle("Study Hours And Project Score By Completion", y=1.03)
plt.show()
Use lmplot when you need groups or facets.
17. Regression Facets
Split the regression by track:
g = sns.lmplot(
data=bootcamp,
x="study_hours",
y="quiz_score",
col="track",
hue="delivery_mode",
col_wrap=3,
height=3.5,
scatter_kws={"alpha": 0.75},
)
g.set_axis_labels("Study Hours", "Quiz Score")
g.set_titles("{col_name}")
g.fig.suptitle("Study Hours And Quiz Score By Track", y=1.04)
plt.show()
Faceted regression is helpful when a single line would hide group-specific behavior.
18. Residual Plot
A residual plot shows what remains after fitting the model.
sns.residplot(
data=bootcamp,
x="study_hours",
y="project_score",
lowess=True,
)
plt.title("Residual Pattern For Study Hours And Project Score")
plt.xlabel("Study Hours")
plt.ylabel("Residual")
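plt.show()
lowess=True overlays a smoothed curve on the residuals. It relies on the optional statsmodels package, which the setup step does not install (pip install statsmodels).
If the residuals have a strong curve or pattern, a simple straight-line trend may not describe the relationship well.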
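To make the idea of a residual pattern concrete, the sketch below fits a straight line to deliberately curved, hypothetical data and inspects what is left over. It uses NumPy only and is an illustration of the concept, not a reproduction of residplot's internals.

```python
import numpy as np

# Hypothetical data with a built-in curve: score grows with the square of hours.
x = np.arange(1.0, 9.0)  # 1, 2, ..., 8
y = 50.0 + 0.5 * x**2

# Fit a straight line, then subtract its predictions from the observed values.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# Positive at both ends and negative in the middle: a U-shaped pattern.
print(np.round(residuals, 2))
```

Residuals that switch sign in a systematic way like this are the signal that a straight line is missing curvature.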
19. FacetGrid
FacetGrid gives you manual control over small multiples.
g = sns.FacetGrid(
data=bootcamp,
row="delivery_mode",
col="track",
hue="completed",
margin_titles=True,
height=3,
)
g.map_dataframe(
sns.scatterplot,
x="study_hours",
y="project_score",
alpha=0.8,
)
g.add_legend(title="Completed")
g.set_axis_labels("Study Hours", "Project Score")
g.fig.suptitle("Study vs Project Score By Track And Delivery Mode", y=1.03)
plt.show()
Use FacetGrid when built-in figure-level functions do not give you enough control.
20. FacetGrid With A Categorical Plot
You can map categorical functions too.
g = sns.FacetGrid(
data=bootcamp,
col="cohort",
col_wrap=2,
height=3.5,
sharey=True,
)
g.map_dataframe(
sns.boxplot,
x="track",
y="project_score",
order=["Analytics", "Python", "ML"],
)
g.set_axis_labels("Track", "Project Score")
g.set_titles("Cohort: {col_name}")
for ax in g.axes.flat:
    ax.tick_params(axis="x", rotation=25)
g.fig.suptitle("Project Score Spread By Cohort", y=1.04)
plt.show()
When mapping categorical plots manually, pass a consistent order so categories appear in the same sequence in every panel.
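The reason is that each panel only sees its own subset of rows, so the categories present can differ from panel to panel. The sketch below shows this with hypothetical rows and derives one shared order from the full table instead of hard-coding it.

```python
import pandas as pd

# Hypothetical stand-in rows: neither cohort contains every track.
demo = pd.DataFrame(
    {
        "cohort": ["A", "A", "B", "B"],
        "track": ["Python", "ML", "Analytics", "Python"],
        "project_score": [80, 75, 68, 90],
    }
)

# One order computed from all rows, suitable for passing as order=track_order.
track_order = sorted(demo["track"].unique())

# What each panel would see on its own.
per_cohort = {
    name: sorted(group["track"].unique())
    for name, group in demo.groupby("cohort")
}

print(track_order)
print(per_cohort)
```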
21. Pairplot
pairplot creates a quick grid of pairwise relationships.
cols = [
"profile",
"study_hours",
"practice_count",
"project_score",
"quiz_score",
"final_score",
]
sns.pairplot(
data=diagnostics[cols],
hue="profile",
corner=True,
plot_kws={"alpha": 0.75},
)
plt.show()
Use pairplot for exploration. It is not always the best final chart because it can show too much at once.
22. PairGrid
PairGrid is the customizable version of pairplot.
grid_cols = [
"study_hours",
"practice_count",
"project_score",
"quiz_score",
"final_score",
]
g = sns.PairGrid(
data=diagnostics,
vars=grid_cols,
hue="profile",
corner=False,
)
g.map_diag(sns.histplot)
g.map_upper(sns.scatterplot, alpha=0.75)
g.map_lower(sns.kdeplot, fill=True, alpha=0.35)
g.add_legend(title="Profile")
plt.show()
Use PairGrid when you want different plots on the diagonal, upper triangle, and lower triangle.
23. PairGrid Design Tips
Pair grids can become unreadable quickly.
Use these rules:
- Keep the numeric column list short.
- Use corner=True when the mirrored half is redundant.
- Use transparency on scatter points.
- Avoid heavy KDE plots for very large datasets.
- Use a clear hue column with only a few categories.
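A quick count shows how fast the grid grows with the column list. The helper name pair_panel_count is hypothetical. It assumes a square grid of n by n axes, with corner=True keeping the diagonal and the lower triangle.

```python
def pair_panel_count(n_vars: int, corner: bool = False) -> int:
    """Number of drawn panels in a pair grid over n_vars numeric columns."""
    if corner:
        # Diagonal plus lower triangle.
        return n_vars * (n_vars + 1) // 2
    return n_vars * n_vars


for n in (3, 5, 8):
    print(n, pair_panel_count(n), pair_panel_count(n, corner=True))
```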
24. Jointplot
jointplot shows the relationship between two variables plus their marginal distributions.
sns.jointplot(
data=diagnostics,
x="study_hours",
y="final_score",
hue="profile",
kind="scatter",
height=6,
)
plt.show()
This is useful when you want the bivariate relationship and each variable's distribution in one figure.
25. Jointplot Kinds
Try different kind values depending on the question.
sns.jointplot(
data=diagnostics,
x="practice_count",
y="final_score",
kind="reg",
height=6,
)
plt.show()
sns.jointplot(
data=diagnostics,
x="study_hours",
y="response_time_min",
kind="hex",
height=6,
)
plt.show()
Common choices:
- scatter for raw points
- reg for a trend line
- hist for binned density
- kde for smooth density
- hex for dense point clouds
- resid for residual pattern
26. JointGrid
JointGrid gives you more manual control than jointplot.
g = sns.JointGrid(
data=diagnostics,
x="study_hours",
y="final_score",
height=6,
)
g.plot_joint(sns.scatterplot, alpha=0.8)
g.plot_marginals(sns.histplot, bins=8, kde=True)
g.fig.suptitle("Study Hours And Final Score", y=1.03)
plt.show()
Use JointGrid when you want to choose separate functions for the joint area and the margins.
27. JointGrid With KDE And Rug Marks
You can combine different plot types.
g = sns.JointGrid(
data=diagnostics,
x="practice_count",
y="quiz_score",
height=6,
)
g.plot_joint(sns.kdeplot, fill=True, cmap="Blues")
g.plot_marginals(sns.rugplot, height=0.12)
g.fig.suptitle("Practice Count And Quiz Score Density", y=1.03)
plt.show()
This is more specialized than a normal scatter plot. Use it when distribution shape is the main story.
28. Utility Functions
Seaborn includes helper functions for datasets and styling.
print(sns.get_dataset_names())
This call requires an internet connection. For this guide, you should not load those datasets. They are useful for quick experiments, but original CSVs are better for a copyright-safe blog post.
You can inspect the colors in a palette:
print(sns.color_palette("Set2"))
And preview a palette:
sns.palplot(sns.color_palette("Set2"))
plt.show()
29. When To Use Each Grid
| Tool | Use It When | Avoid It When |
|---|---|---|
| catplot | one categorical chart needs facets | one simple axes-level chart is enough |
| FacetGrid | you need custom mapping across rows and columns | a figure-level function already solves it |
| pairplot | you want fast multivariate exploration | you need a polished final chart |
| PairGrid | each grid region needs different plot types | the dataset has too many columns |
| jointplot | one bivariate relationship plus margins is enough | you need custom marginal functions |
| JointGrid | you need custom joint and marginal plots | a simple scatter plot answers the question |
30. Common Mistakes
Using bar plots for everything
Bar plots hide spread. If distribution matters, use box, violin, strip, or swarm plots.
Using swarm plots on large datasets
Swarm plots are readable for small datasets. For large data, use strip plots with transparency, box plots, violin plots, or sampled data.
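For the sampling option, drawing the same fraction from every category keeps the groups in proportion. A minimal sketch on hypothetical rows follows. It assumes pandas 1.1 or newer, where grouped sampling is available, and the 10 percent fraction is arbitrary.

```python
import pandas as pd

# Hypothetical stand-in rows: 200 Python sessions and 100 ML sessions.
demo = pd.DataFrame(
    {
        "track": ["Python"] * 200 + ["ML"] * 100,
        "quiz_score": list(range(200)) + list(range(100)),
    }
)

# Sample 10% of the rows within each track; random_state makes it repeatable.
sampled = demo.groupby("track", group_keys=False).sample(
    frac=0.1, random_state=0
)

print(sampled["track"].value_counts())
```

The sampled frame can then be passed to swarmplot in place of the full table.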
Overloading hue
Too many hue categories make legends and colors difficult to interpret. Use facets or filtering instead.
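As one way to apply the filtering option, the sketch below keeps only the most frequent categories before plotting. The group labels and the cutoff of three are hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical stand-in rows with five mentor groups of uneven size.
demo = pd.DataFrame(
    {"mentor_group": ["A"] * 5 + ["B"] * 4 + ["C"] * 3 + ["D"] + ["E"]}
)

# Find the three largest groups, then keep only their rows.
top_groups = demo["mentor_group"].value_counts().nlargest(3).index
filtered = demo[demo["mentor_group"].isin(top_groups)]

print(sorted(top_groups))
print(len(filtered))
```

State the filter in the chart title or caption so readers know some groups were left out.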
Trusting a regression line too quickly
Always inspect the scatter points and residuals. A clean line can hide nonlinear patterns.
Building pair grids with too many variables
Five numeric variables already produce a 5 x 5 grid of 25 panels. Start small.
31. Practice Tasks
Try these tasks with the included CSV files:
- Draw a strip plot of attendance_pct by cohort, colored by completed.
- Draw a swarm plot of satisfaction by level, colored by delivery_mode.
- Create a box plot of project_score by mentor_group.
- Create a violin plot of quiz_score by track, split by completed.
- Create a bar plot showing median support_tickets by track.
- Create a point plot of attendance_pct across cohorts for each track.
- Use catplot to facet project score box plots by delivery mode.
- Use lmplot to compare study_hours and project_score by track.
- Use FacetGrid to create one scatter panel per cohort.
- Use PairGrid with histograms on the diagonal and scatter plots off the diagonal.
- Use jointplot(kind="reg") for practice_count vs final_score.
- Use JointGrid to combine a KDE joint plot with histogram margins.
32. Interview-Style Questions
What is the difference between stripplot and swarmplot?
Both show individual points by category. stripplot adds jitter to reduce overlap. swarmplot arranges points to avoid overlap more carefully, but it can become slow or crowded.
When should you use a box plot?
Use a box plot when you want to compare median, spread, and possible outliers across categories.
What does a violin plot add?
A violin plot adds an estimated density shape, so you can see whether values are concentrated in one area or spread across several areas.
What is the difference between barplot and countplot?
barplot summarizes a numeric column for each category. countplot counts how many rows are in each category.
Why use catplot?
Use catplot when you want a categorical chart split into multiple facets with row, col, or col_wrap.
What is the difference between regplot and lmplot?
regplot is axes-level and good for one panel. lmplot is figure-level and supports grouping and faceting.
Why use FacetGrid?
Use FacetGrid when you need more custom control than relplot, displot, catplot, or lmplot provide.
What is the difference between pairplot and PairGrid?
pairplot is quick and convenient. PairGrid gives you manual control over diagonal, upper, and lower panels.
What is the difference between jointplot and JointGrid?
jointplot is a convenient shortcut. JointGrid lets you choose separate plot functions for the joint area and marginal axes.
33. Final Checklist
Before moving on, make sure you can:
- choose between raw-point, distribution, estimate, and count plots
- use catplot for categorical facets
- use regplot, lmplot, and residplot responsibly
- build a custom FacetGrid
- use pairplot for quick exploration
- customize a PairGrid
- use jointplot for a bivariate relationship with margins
- customize a JointGrid
- keep grid charts small enough to read
- explain why a chart type fits the question
Advanced Seaborn is less about memorizing function names and more about matching chart structure to analysis structure.
If the question has categories, start with categorical plots.
If the question has a relationship, start with scatter or regression.
If the question has many repeated comparisons, use a grid only when it improves clarity.
