# Mastering Seaborn: Categorical and Regression Plots Explained
URL: https://madhudadi.in/blog/posts/mastering-seaborn-categorical-and-regression-plots-explained
Published: 2026-06-21
Tags: python, Seaborn
Read time: 45 min
Difficulty: intermediate
> Go deeper with Seaborn using original datasets: strip plots, swarm plots, box plots, violin plots, bar plots, point plots, count plots, catplot facets, regplot, lmplot, residplot, FacetGrid, PairGrid, pairplot, jointplot, and JointGrid.

# Seaborn Categorical Plots, Regression Plots, FacetGrid, PairGrid, and JointGrid

The first Seaborn guide introduced the main plotting families.

This guide goes deeper into the plots you use when your analysis has categories, statistical summaries, trend lines, and multi-panel grids.

You will work with original synthetic bootcamp datasets. The examples do not use Seaborn built-in datasets, copied course notebooks, proprietary spreadsheets, or public exercise data.

## Files Used In This Guide

Use these CSV files:

- `seaborn_bootcamp_sessions.csv`
- `seaborn_assessment_diagnostics.csv`

Place them in the same folder as your notebook or script.

If you keep them in a `data/` folder, change the paths when you load them (this assumes `pandas` is already imported as `pd`, as in the Setup section):

```python
bootcamp = pd.read_csv("data/seaborn_bootcamp_sessions.csv")
diagnostics = pd.read_csv("data/seaborn_assessment_diagnostics.csv")
```

## What You Will Learn

By the end, you should be able to:

- choose the right categorical plot for raw points, spread, or summary
- use `stripplot` and `swarmplot` for individual observations
- use `boxplot` and `violinplot` for distribution comparison
- use `barplot`, `pointplot`, and `countplot` for estimates and counts
- convert axes-level categorical plots into `catplot` facets
- build regression charts with `regplot`, `lmplot`, and `residplot`
- create custom small multiples with `FacetGrid`
- compare `pairplot` and `PairGrid`
- compare `jointplot` and `JointGrid`
- decide when grids help and when they create visual noise

## 1. Setup

Install the required packages:

```bash
pip install pandas matplotlib seaborn
```

Import the libraries:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```

Set a readable default theme:

```python
sns.set_theme(style="whitegrid", context="notebook")
```

Load the CSV files:

```python
bootcamp = pd.read_csv("seaborn_bootcamp_sessions.csv")
diagnostics = pd.read_csv("seaborn_assessment_diagnostics.csv")

print(bootcamp.head())
print(diagnostics.head())
```

The `bootcamp` table has one row per learning session. The `diagnostics` table has one row per learner assessment.
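If the CSV files are not at hand, a small stand-in table lets you run the categorical and regression examples anyway. The sketch below is only a placeholder: the column names come from this guide, but the category labels (apart from the track names), the value ranges, the row count, and the name `standin_bootcamp` are invented here and will not match the real files.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=7)  # fixed seed so the stand-in is reproducible
n_rows = 120                          # hypothetical size, not the real row count

# Category labels below are placeholders, not values read from the CSV.
standin_bootcamp = pd.DataFrame({
    "track": rng.choice(["Analytics", "Python", "ML"], size=n_rows),
    "level": rng.choice(["Beginner", "Intermediate", "Advanced"], size=n_rows),
    "delivery_mode": rng.choice(["Online", "In-person"], size=n_rows),
    "cohort": rng.choice(["C1", "C2", "C3", "C4"], size=n_rows),
    "completed": rng.choice(["Yes", "No"], size=n_rows, p=[0.7, 0.3]),
    "study_hours": rng.uniform(2, 20, size=n_rows).round(1),
    "attendance_pct": rng.uniform(50, 100, size=n_rows).round(1),
    "satisfaction": rng.integers(1, 6, size=n_rows),   # 1 to 5 inclusive
    "support_tickets": rng.poisson(2, size=n_rows),
})

# Scores loosely follow study hours so regression examples show a trend.
hours = standin_bootcamp["study_hours"]
standin_bootcamp["project_score"] = (
    45 + 2.2 * hours + rng.normal(0, 8, size=n_rows)
).clip(0, 100).round(1)
standin_bootcamp["quiz_score"] = (
    50 + 1.8 * hours + rng.normal(0, 10, size=n_rows)
).clip(0, 100).round(1)

print(standin_bootcamp.dtypes)
print(standin_bootcamp.head())
```

To try the guide's examples with it, assign `bootcamp = standin_bootcamp` instead of calling `pd.read_csv`.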

## 2. Categorical Plot Families

Seaborn categorical plots answer different questions.

| Plot Type | Function | Best Question |
|---|---|---|
| Categorical scatter | `stripplot`, `swarmplot` | What do individual points look like by group? |
| Distribution by category | `boxplot`, `violinplot` | How does spread differ by group? |
| Estimate by category | `barplot`, `pointplot` | What is the average or summary value by group? |
| Count by category | `countplot` | How many rows are in each group? |
| Figure-level categorical | `catplot` | How can I facet categorical plots? |

Start with the question, not the chart name.

## 3. Strip Plot

A strip plot shows individual observations grouped by category.

```python
plt.figure(figsize=(9, 5))

sns.stripplot(
    data=bootcamp,
    x="track",
    y="project_score",
    jitter=True,
    alpha=0.8,
)

plt.title("Project Scores By Track")
plt.xlabel("Track")
plt.ylabel("Project Score")
plt.show()
```

`jitter=True` adds a small random offset along the categorical axis (horizontal in this chart) so overlapping observations are easier to see.

Use a strip plot when the dataset is small enough that individual points matter.
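To see what jitter actually changes, it can help to draw the same points with and without it. The sketch below uses a tiny made-up table with repeated scores; the helper `x_positions` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up sample with many repeated scores so overlap is obvious.
demo = pd.DataFrame({
    "track": ["Python"] * 6 + ["ML"] * 6,
    "project_score": [70, 70, 70, 82, 82, 90, 65, 65, 75, 75, 75, 88],
})

fig, (ax_plain, ax_jitter) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)

# jitter=False stacks identical values on top of each other.
sns.stripplot(data=demo, x="track", y="project_score", jitter=False, ax=ax_plain)
ax_plain.set_title("jitter=False")

# A float sets how far points may move left or right of the category position.
sns.stripplot(data=demo, x="track", y="project_score", jitter=0.25, ax=ax_jitter)
ax_jitter.set_title("jitter=0.25")


def x_positions(ax):
    """Collect the x coordinate of every plotted point on an axes."""
    return np.concatenate([c.get_offsets()[:, 0] for c in ax.collections])


print("without jitter:", np.unique(x_positions(ax_plain)))
print("with jitter   :", np.round(x_positions(ax_jitter), 2))
plt.show()
```

Without jitter every point sits exactly on its category position. With jitter the x values scatter around that position, while the y values stay untouched.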

## 4. Strip Plot With Hue

Add `hue` to compare another category.

```python
plt.figure(figsize=(10, 5))

sns.stripplot(
    data=bootcamp,
    x="track",
    y="project_score",
    hue="completed",
    jitter=True,
    dodge=True,
    alpha=0.75,
)

plt.title("Project Scores By Track And Completion Status")
plt.xlabel("Track")
plt.ylabel("Project Score")
plt.legend(title="Completed")
plt.show()
```

`dodge=True` separates the hue groups inside each category.

## 5. Swarm Plot

A swarm plot also shows individual points, but it arranges them to reduce overlap.

```python
plt.figure(figsize=(10, 5))

sns.swarmplot(
    data=bootcamp,
    x="level",
    y="quiz_score",
    hue="delivery_mode",
)

plt.title("Quiz Scores By Level And Delivery Mode")
plt.xlabel("Level")
plt.ylabel("Quiz Score")
plt.show()
```

Swarm plots are useful for small datasets. With many rows the layout becomes crowded and slow to compute, and Seaborn may warn that some points cannot be placed.
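One way to keep a swarm plot readable on a larger table is to sample a fixed number of rows per category first. This is a sketch under assumptions: the 3,000-row table is generated on the spot, and the cap of 60 rows per level is an arbitrary illustration value, not a Seaborn limit.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

rng = np.random.default_rng(seed=3)

# Made-up table with far more rows than a swarm plot handles comfortably.
big = pd.DataFrame({
    "level": rng.choice(["Beginner", "Intermediate", "Advanced"], size=3000),
    "quiz_score": rng.normal(70, 12, size=3000).clip(0, 100),
})

# Draw the same number of rows from every level.
# This assumes each level has at least max_per_group rows.
max_per_group = 60
sampled = big.groupby("level").sample(n=max_per_group, random_state=0)

plt.figure(figsize=(9, 5))
sns.swarmplot(data=sampled, x="level", y="quiz_score", size=4)
plt.title("Quiz Scores By Level (60 Sampled Rows Per Level)")
plt.xlabel("Level")
plt.ylabel("Quiz Score")
plt.show()

print(sampled["level"].value_counts())
```

Sampling changes what the reader sees, so say in the title or caption that the chart shows a sample.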

## 6. Catplot For Categorical Scatter

`catplot` is the figure-level categorical function.

```python
g = sns.catplot(
    data=bootcamp,
    x="track",
    y="project_score",
    kind="strip",
    hue="completed",
    col="cohort",
    col_wrap=2,
    jitter=True,
    height=3.5,
)

g.set_axis_labels("Track", "Project Score")
g.set_titles("Cohort: {col_name}")
g.figure.suptitle("Project Scores Across Cohorts", y=1.04)
plt.show()
```

Use `catplot` when the same categorical chart should be repeated across cohorts, levels, regions, or other grouping columns.

## 7. Box Plot

A box plot summarizes the distribution of a numeric value in each category.

```python
plt.figure(figsize=(9, 5))

sns.boxplot(
    data=bootcamp,
    x="track",
    y="study_hours",
)

plt.title("Study Hours By Track")
plt.xlabel("Track")
plt.ylabel("Study Hours")
plt.show()
```

A box plot is good when you care about:

- median
- spread
- skew
- possible outliers
- group-to-group comparison

It hides individual points, so consider overlaying a strip plot if the dataset is small.

```python
plt.figure(figsize=(9, 5))

sns.boxplot(
    data=bootcamp,
    x="track",
    y="study_hours",
    color="lightgray",
)

sns.stripplot(
    data=bootcamp,
    x="track",
    y="study_hours",
    color="black",
    alpha=0.55,
    jitter=True,
)

plt.title("Study Hours By Track With Individual Points")
plt.xlabel("Track")
plt.ylabel("Study Hours")
plt.show()
```
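The numbers behind a box plot can be reproduced with plain pandas, which makes the median, spread, and outlier ideas above concrete. The sketch below uses made-up study hours and the common 1.5 × IQR whisker rule, which is also Seaborn's default `whis` value.

```python
import pandas as pd

# Made-up study hours for one track, including one unusually large value.
study_hours = pd.Series([2, 4, 5, 6, 7, 8, 30])

q1 = study_hours.quantile(0.25)
median = study_hours.quantile(0.50)
q3 = study_hours.quantile(0.75)
iqr = q3 - q1  # height of the box

# Points beyond 1.5 * IQR from the box edges are drawn as individual markers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = study_hours[(study_hours < lower_fence) | (study_hours > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"fences=({lower_fence}, {upper_fence})")
print("possible outliers:", outliers.tolist())
```

Here the value 30 falls above the upper fence of 12, so a box plot would draw it as a separate point rather than stretching the whisker.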

## 8. Box Plot With Hue

Use `hue` when the distribution needs a second grouping.

```python
plt.figure(figsize=(10, 5))

sns.boxplot(
    data=bootcamp,
    x="track",
    y="quiz_score",
    hue="delivery_mode",
)

plt.title("Quiz Score Spread By Track And Delivery Mode")
plt.xlabel("Track")
plt.ylabel("Quiz Score")
plt.show()
```

Keep hue categories limited. Too many groups make box plots hard to read.

## 9. Violin Plot

A violin plot shows distribution shape using a density estimate.

```python
plt.figure(figsize=(10, 5))

sns.violinplot(
    data=bootcamp,
    x="track",
    y="project_score",
    hue="completed",
    split=True,
    inner="quart",
)

plt.title("Project Score Distribution By Track")
plt.xlabel("Track")
plt.ylabel("Project Score")
plt.show()
```

`split=True` works when `hue` has two groups. It places both distributions inside one violin for each x-category.

Use violin plots when distribution shape matters. Use box plots when the audience needs a simpler summary.

## 10. Bar Plot

A bar plot estimates a summary value for each category.

```python
plt.figure(figsize=(9, 5))

sns.barplot(
    data=bootcamp,
    x="track",
    y="satisfaction",
    hue="level",
    errorbar=None,
)

plt.title("Average Satisfaction By Track And Level")
plt.xlabel("Track")
plt.ylabel("Average Satisfaction")
plt.show()
```

By default, Seaborn uses the mean as the estimate and draws a 95% bootstrap confidence interval as the error bar.

Use `errorbar=None` only when you intentionally want a clean teaching chart. In real reporting, error bars can communicate uncertainty.

## 11. Custom Estimator In A Bar Plot

You can summarize with a function other than the mean.

```python
import numpy as np

plt.figure(figsize=(9, 5))

sns.barplot(
    data=bootcamp,
    x="track",
    y="support_tickets",
    hue="delivery_mode",
    estimator=np.median,
    errorbar=None,
)

plt.title("Median Support Tickets By Track")
plt.xlabel("Track")
plt.ylabel("Median Support Tickets")
plt.show()
```

Median is often better than mean when the metric has outliers.
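A few made-up ticket counts show why. One extreme session is enough to drag the mean far away from what a typical session looks like, while the median barely moves.

```python
import pandas as pd

# Made-up ticket counts: four typical sessions and one extreme one.
tickets = pd.Series([1, 2, 2, 3, 40], name="support_tickets")

print("mean  :", tickets.mean())    # pulled upward by the single value of 40
print("median:", tickets.median())  # stays at the typical value
```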

## 12. Point Plot

A point plot compares estimates while emphasizing change across categories.

```python
sns.pointplot(
    data=bootcamp,
    x="cohort",
    y="attendance_pct",
    hue="track",
    errorbar=None,
    markers="o",
    linestyles="-",
)

plt.title("Average Attendance Across Cohorts")
plt.xlabel("Cohort")
plt.ylabel("Average Attendance (%)")
plt.show()
```

Point plots are useful when the x-axis has a meaningful order, such as cohorts, months, or levels.

Do not use connecting lines when the x-axis categories are unordered.
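Seaborn follows the category order it finds in the data, so an ordered x-axis usually has to be declared. One option, sketched below with hypothetical cohort labels, is to convert the column to an ordered pandas categorical before plotting.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up rows, deliberately listed out of calendar order.
demo = pd.DataFrame({
    "cohort": ["Mar", "Jan", "Feb", "Jan", "Mar", "Feb"],
    "attendance_pct": [88, 76, 81, 80, 92, 85],
})

# Hypothetical cohort labels; the ordered categorical fixes their sequence.
cohort_order = ["Jan", "Feb", "Mar"]
demo["cohort"] = pd.Categorical(
    demo["cohort"], categories=cohort_order, ordered=True
)

fig, ax = plt.subplots(figsize=(7, 4))
sns.pointplot(data=demo, x="cohort", y="attendance_pct", errorbar=None, ax=ax)
ax.set_title("Average Attendance Across Cohorts")
ax.set_xlabel("Cohort")
ax.set_ylabel("Average Attendance (%)")
plt.show()
```

Passing `order=cohort_order` to `pointplot` is an alternative when you prefer not to change the column's dtype.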

## 13. Count Plot

A count plot counts rows in each category.

```python
sns.countplot(
    data=bootcamp,
    x="track",
    hue="completed",
)

plt.title("Session Count By Track And Completion")
plt.xlabel("Track")
plt.ylabel("Number Of Sessions")
plt.show()
```

Use `countplot` when you need frequency, not a numeric summary.
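The same frequencies can be checked as a table with `pd.crosstab`, which is handy when you need the exact numbers next to the chart. The rows and the "Yes"/"No" labels below are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up sessions; the "Yes"/"No" labels are placeholders.
demo = pd.DataFrame({
    "track": ["Python", "Python", "ML", "ML", "ML", "Analytics"],
    "completed": ["Yes", "No", "Yes", "Yes", "No", "Yes"],
})

# Rows are tracks, columns are completion status, cells are row counts.
counts = pd.crosstab(demo["track"], demo["completed"])
print(counts)

fig, ax = plt.subplots(figsize=(7, 4))
sns.countplot(data=demo, x="track", hue="completed", ax=ax)
ax.set_xlabel("Track")
ax.set_ylabel("Number Of Sessions")
plt.show()
```

Combinations that never occur, such as Analytics with "No" here, appear as 0 in the table and as a missing bar in the chart.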

## 14. Catplot With Box Facets

`catplot` can create faceted box plots.

```python
g = sns.catplot(
    data=bootcamp,
    x="level",
    y="project_score",
    hue="completed",
    col="track",
    col_wrap=3,
    kind="box",
    height=3.5,
)

g.set_axis_labels("Level", "Project Score")
g.set_titles("{col_name}")
g.figure.suptitle("Project Score Spread By Track And Level", y=1.04)
plt.show()
```

This is easier to read than putting every track, level, and completion group into one overloaded chart.

## 15. Regression With Regplot

`regplot` draws a scatter plot plus a fitted trend line.

```python
sns.regplot(
    data=bootcamp,
    x="study_hours",
    y="project_score",
    scatter_kws={"alpha": 0.75},
    line_kws={"color": "red"},
)

plt.title("Study Hours vs Project Score")
plt.xlabel("Study Hours")
plt.ylabel("Project Score")
plt.show()
```

Use `regplot` for a quick single-panel trend check.
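`regplot` returns the Matplotlib axes rather than the fitted coefficients, so if you need the slope as a number you have to compute it separately. A minimal sketch with NumPy, using made-up points that lie exactly on a known line:

```python
import numpy as np

# Made-up points that lie exactly on score = 2 * hours + 50.
study_hours = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
project_score = 2.0 * study_hours + 50.0

# A degree-1 polynomial fit is an ordinary least-squares straight line.
slope, intercept = np.polyfit(study_hours, project_score, deg=1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

For real reporting you would likely want a statistics library that also gives standard errors. `np.polyfit` is used here only because it needs no extra install.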

## 16. Regression With Lmplot And Hue

`lmplot` is the figure-level regression function.

```python
g = sns.lmplot(
    data=bootcamp,
    x="study_hours",
    y="project_score",
    hue="completed",
    height=5,
    aspect=1.2,
    scatter_kws={"alpha": 0.75},
)

g.set_axis_labels("Study Hours", "Project Score")
g.figure.suptitle("Study Hours And Project Score By Completion", y=1.03)
plt.show()
```

Use `lmplot` when you need groups or facets.

## 17. Regression Facets

Split the regression by track:

```python
g = sns.lmplot(
    data=bootcamp,
    x="study_hours",
    y="quiz_score",
    col="track",
    hue="delivery_mode",
    col_wrap=3,
    height=3.5,
    scatter_kws={"alpha": 0.75},
)

g.set_axis_labels("Study Hours", "Quiz Score")
g.set_titles("{col_name}")
g.figure.suptitle("Study Hours And Quiz Score By Track", y=1.04)
plt.show()
```

Faceted regression is helpful when a single line would hide group-specific behavior.

## 18. Residual Plot

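A residual plot shows what remains after fitting a straight-line model of `y` on `x`. The `lowess=True` option below adds a smoothed curve through the residuals and requires the `statsmodels` package (`pip install statsmodels`), which is not part of the install command in the Setup section.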

```python
sns.residplot(
    data=bootcamp,
    x="study_hours",
    y="project_score",
    lowess=True,
)

plt.title("Residual Pattern For Study Hours And Project Score")
plt.xlabel("Study Hours")
plt.ylabel("Residual")
plt.show()
```

If the residuals have a strong curve or pattern, a simple straight-line trend may not describe the relationship well.
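Computing residuals by hand makes that pattern easy to recognize. The sketch below fits a straight line to deliberately curved made-up data, so the residuals come out positive at both ends and negative in the middle.

```python
import numpy as np

x = np.arange(11, dtype=float)  # 0, 1, ..., 10
y = x ** 2                      # made-up curved relationship

# Fit a straight line, then subtract its predictions from the data.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

print(np.round(residuals, 2))
```

A U-shaped sequence like this is the numeric version of the curve you would see in `residplot`. Residuals from a well-fitting line would instead bounce around zero with no visible shape.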

## 19. FacetGrid

`FacetGrid` gives you manual control over small multiples.

```python
g = sns.FacetGrid(
    data=bootcamp,
    row="delivery_mode",
    col="track",
    hue="completed",
    margin_titles=True,
    height=3,
)

g.map_dataframe(
    sns.scatterplot,
    x="study_hours",
    y="project_score",
    alpha=0.8,
)

g.add_legend(title="Completed")
g.set_axis_labels("Study Hours", "Project Score")
g.figure.suptitle("Study vs Project Score By Track And Delivery Mode", y=1.03)
plt.show()
```

Use `FacetGrid` when built-in figure-level functions do not give you enough control.

## 20. FacetGrid With A Categorical Plot

You can map categorical functions too.

```python
g = sns.FacetGrid(
    data=bootcamp,
    col="cohort",
    col_wrap=2,
    height=3.5,
    sharey=True,
)

g.map_dataframe(
    sns.boxplot,
    x="track",
    y="project_score",
    order=["Analytics", "Python", "ML"],
)

g.set_axis_labels("Track", "Project Score")
g.set_titles("Cohort: {col_name}")

for ax in g.axes.flat:
    ax.tick_params(axis="x", rotation=25)

g.figure.suptitle("Project Score Spread By Cohort", y=1.04)
plt.show()
```

When mapping categorical plots manually, pass a consistent `order` so categories appear in the same sequence in every panel.

## 21. Pairplot

`pairplot` creates a quick grid of pairwise relationships.

```python
cols = [
    "profile",
    "study_hours",
    "practice_count",
    "project_score",
    "quiz_score",
    "final_score",
]

sns.pairplot(
    data=diagnostics[cols],
    hue="profile",
    corner=True,
    plot_kws={"alpha": 0.75},
)

plt.show()
```

Use `pairplot` for exploration. It is not always the best final chart because it can show too much at once.

## 22. PairGrid

`PairGrid` is the customizable class that `pairplot` is built on.

```python
grid_cols = [
    "study_hours",
    "practice_count",
    "project_score",
    "quiz_score",
    "final_score",
]

g = sns.PairGrid(
    data=diagnostics,
    vars=grid_cols,
    hue="profile",
    corner=False,
)

g.map_diag(sns.histplot)
g.map_upper(sns.scatterplot, alpha=0.75)
g.map_lower(sns.kdeplot, fill=True, alpha=0.35)
g.add_legend(title="Profile")

plt.show()
```

Use `PairGrid` when you want different plots on the diagonal, upper triangle, and lower triangle.

## 23. PairGrid Design Tips

Pair grids can become unreadable quickly.

Use these rules:

- Keep the numeric column list short.
- Use `corner=True` when the mirrored half is redundant.
- Use transparency on scatter points.
- Avoid heavy KDE plots for very large datasets.
- Use a clear `hue` column with only a few categories.
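The first two rules are easier to appreciate with the arithmetic written out. The helper below is a hypothetical function, not part of Seaborn; it only counts how many panels a pair grid draws for a given number of columns.

```python
def pair_grid_panels(n_vars: int, corner: bool = False) -> int:
    """Number of axes a pair grid draws for n_vars numeric columns."""
    if corner:
        # Lower triangle plus the diagonal.
        return n_vars * (n_vars + 1) // 2
    # Full square grid.
    return n_vars * n_vars


for n in (3, 5, 8):
    print(n, pair_grid_panels(n), pair_grid_panels(n, corner=True))
```

Going from five columns to eight more than doubles the panel count, which is why trimming the column list usually helps more than any styling change.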

## 24. Jointplot

`jointplot` shows the relationship between two variables plus their marginal distributions.

```python
sns.jointplot(
    data=diagnostics,
    x="study_hours",
    y="final_score",
    hue="profile",
    kind="scatter",
    height=6,
)

plt.show()
```

This is useful when you want the bivariate relationship and each variable's distribution in one figure.

## 25. Jointplot Kinds

Try different `kind` values depending on the question.

```python
sns.jointplot(
    data=diagnostics,
    x="practice_count",
    y="final_score",
    kind="reg",
    height=6,
)

plt.show()
```

```python
sns.jointplot(
    data=diagnostics,
    x="study_hours",
    y="response_time_min",
    kind="hex",
    height=6,
)

plt.show()
```

Common choices:

- `scatter` for raw points
- `reg` for a trend line
- `hist` for binned density
- `kde` for smooth density
- `hex` for dense point clouds
- `resid` for residual pattern

## 26. JointGrid

`JointGrid` gives you more manual control than `jointplot`.

```python
g = sns.JointGrid(
    data=diagnostics,
    x="study_hours",
    y="final_score",
    height=6,
)

g.plot_joint(sns.scatterplot, alpha=0.8)
g.plot_marginals(sns.histplot, bins=8, kde=True)

g.figure.suptitle("Study Hours And Final Score", y=1.03)
plt.show()
```

Use `JointGrid` when you want to choose separate functions for the joint area and the margins.

## 27. JointGrid With KDE And Rug Marks

You can combine different plot types.

```python
g = sns.JointGrid(
    data=diagnostics,
    x="practice_count",
    y="quiz_score",
    height=6,
)

g.plot_joint(sns.kdeplot, fill=True, cmap="Blues")
g.plot_marginals(sns.rugplot, height=0.12)

g.figure.suptitle("Practice Count And Quiz Score Density", y=1.03)
plt.show()
```

This is more specialized than a normal scatter plot. Use it when distribution shape is the main story.

## 28. Utility Functions

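Seaborn includes helper functions for datasets and styling. `get_dataset_names` lists the built-in example datasets and needs an internet connection, because the list is fetched from Seaborn's online data repository.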

```python
print(sns.get_dataset_names())
```

For this guide, you should not load those datasets. They are useful for quick experiments, but original CSVs are better for a copyright-safe blog post.

You can inspect the colors in a named palette. Printing it shows a list of RGB tuples:

```python
print(sns.color_palette("Set2"))
```

And preview a palette:

```python
sns.palplot(sns.color_palette("Set2"))
plt.show()
```

## 29. When To Use Each Grid

| Tool | Use It When | Avoid It When |
|---|---|---|
| `catplot` | one categorical chart needs facets | one simple axes-level chart is enough |
| `FacetGrid` | you need custom mapping across rows and columns | a figure-level function already solves it |
| `pairplot` | you want fast multivariate exploration | you need a polished final chart |
| `PairGrid` | each grid region needs different plot types | the dataset has too many columns |
| `jointplot` | one bivariate relationship plus margins is enough | you need custom marginal functions |
| `JointGrid` | you need custom joint and marginal plots | a simple scatter plot answers the question |

## 30. Common Mistakes

### Using bar plots for everything

Bar plots hide spread. If distribution matters, use box, violin, strip, or swarm plots.
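A small made-up example shows how much a bar can hide. Both groups below have the same mean, so their bars would be identical, yet one group has no variation at all and the other ranges from 40 to 100. The track labels "A" and "B" are placeholders.

```python
import pandas as pd

# Made-up scores: same average, very different spread.
demo = pd.DataFrame({
    "track": ["A"] * 4 + ["B"] * 4,
    "project_score": [70, 70, 70, 70, 40, 60, 80, 100],
})

summary = demo.groupby("track")["project_score"].agg(["mean", "std", "min", "max"])
print(summary)
```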

### Using swarm plots on large datasets

Swarm plots are readable for small datasets. For large data, use strip plots with transparency, box plots, violin plots, or sampled data.

### Overloading hue

Too many hue categories make legends and colors difficult to interpret. Use facets or filtering instead.

### Trusting a regression line too quickly

Always inspect the scatter points and residuals. A clean line can hide nonlinear patterns.

### Building pair grids with too many variables

Five numeric variables already create a 5 × 5 grid of 25 panels (15 with `corner=True`). Start small.

## 31. Practice Tasks

Try these tasks with the included CSV files:

1. Draw a strip plot of `attendance_pct` by `cohort`, colored by `completed`.
2. Draw a swarm plot of `satisfaction` by `level`, colored by `delivery_mode`.
3. Create a box plot of `project_score` by `mentor_group`.
4. Create a violin plot of `quiz_score` by `track`, split by `completed`.
5. Create a bar plot showing median `support_tickets` by `track`.
6. Create a point plot of `attendance_pct` across cohorts for each track.
7. Use `catplot` to facet project score box plots by delivery mode.
8. Use `lmplot` to compare `study_hours` and `project_score` by track.
9. Use `FacetGrid` to create one scatter panel per cohort.
10. Use `PairGrid` with histograms on the diagonal and scatter plots off the diagonal.
11. Use `jointplot(kind="reg")` for `practice_count` vs `final_score`.
12. Use `JointGrid` to combine a KDE joint plot with histogram margins.

## 32. Interview-Style Questions

### What is the difference between `stripplot` and `swarmplot`?

Both show individual points by category. `stripplot` adds jitter to reduce overlap. `swarmplot` arranges points to avoid overlap more carefully, but it can become slow or crowded.

### When should you use a box plot?

Use a box plot when you want to compare median, spread, and possible outliers across categories.

### What does a violin plot add?

A violin plot adds an estimated density shape, so you can see whether values are concentrated in one area or spread across several areas.

### What is the difference between `barplot` and `countplot`?

`barplot` summarizes a numeric column for each category. `countplot` counts how many rows are in each category.

### Why use `catplot`?

Use `catplot` when you want a categorical chart split into multiple facets with `row`, `col`, or `col_wrap`.

### What is the difference between `regplot` and `lmplot`?

`regplot` is axes-level and good for one panel. `lmplot` is figure-level and supports grouping and faceting.

### Why use `FacetGrid`?

Use `FacetGrid` when you need more custom control than `relplot`, `displot`, `catplot`, or `lmplot` provide.

### What is the difference between `pairplot` and `PairGrid`?

`pairplot` is quick and convenient. `PairGrid` gives you manual control over diagonal, upper, and lower panels.

### What is the difference between `jointplot` and `JointGrid`?

`jointplot` is a convenient shortcut. `JointGrid` lets you choose separate plot functions for the joint area and marginal axes.

## 33. Final Checklist

Before moving on, make sure you can:

- choose between raw-point, distribution, estimate, and count plots
- use `catplot` for categorical facets
- use `regplot`, `lmplot`, and `residplot` responsibly
- build a custom `FacetGrid`
- use `pairplot` for quick exploration
- customize a `PairGrid`
- use `jointplot` for a bivariate relationship with margins
- customize a `JointGrid`
- keep grid charts small enough to read
- explain why a chart type fits the question

Advanced Seaborn is less about memorizing function names and more about matching chart structure to analysis structure.

If the question has categories, start with categorical plots.

If the question has a relationship, start with scatter or regression.

If the question has many repeated comparisons, use a grid only when it improves clarity.
