# Create an Interactive District Dashboard for India with Plotly
URL: https://madhudadi.in/blog/posts/interactive-india-district-dashboard-with-plotly-streamlit
Published: 2026-06-14
Tags: Plotly, python, Streamlit
Read time: 35 min
Difficulty: intermediate
> Build an original Plotly project that maps district-level indicators, lets users choose metrics, filters by state, and turns the analysis into a small Streamlit dashboard.

Plotly is useful when a static chart is not enough.

In this project, you will build an interactive district-level dashboard for India-style geographic data. The dashboard lets a user:

- choose a state or view all districts
- choose one metric for marker size
- choose another metric for marker color
- inspect each district by hovering over the map
- run the result as a Streamlit web app

This is an original teaching project. The sample CSV included with this guide uses approximate locations and synthetic metrics for practice. It is not copied from a course notebook or proprietary dataset.

## Files Used In This Guide

Use this CSV file:

- `plotly_india_district_sample.csv`

Place it in the same folder as your notebook or script.

If you keep it in a `data/` folder, load it like this:

```python
import pandas as pd
df = pd.read_csv("data/plotly_india_district_sample.csv")
```

## What You Will Build

By the end, you will have:

- a cleaned district metrics table
- a reusable `plot_district_map()` function
- an all-India interactive bubble map
- a state-filtered map
- a Streamlit dashboard with sidebar controls

The final app will use Plotly Express and Streamlit:

```bash
pip install pandas plotly streamlit
```

## 1. Load The Dataset

Start with Pandas and Plotly Express:

```python
import pandas as pd
import plotly.express as px
```

Load the CSV:

```python
df = pd.read_csv("plotly_india_district_sample.csv")

print(df.head())
print(df.shape)
```

Expected columns include:

- `State`
- `District`
- `Latitude`
- `Longitude`
- `Population`
- `Households`
- `Households_with_Internet`
- `Households_with_Computer`
- `Housholds_with_Electric_Lighting`
- `Workers`
- `sex_ratio`
- `literacy_rate`
- `internet_household_pct`
- `urban_household_pct`

The column `Housholds_with_Electric_Lighting` keeps the same misspelling that often appears in raw public data extracts. In a real project, you may rename it. In this tutorial, we keep it visible so you learn how to handle imperfect source schemas.

## 2. Validate The Data

Before plotting, check that the required columns exist, that every row has coordinates, and that no district appears twice within a state.

```python
required_columns = [
    "State",
    "District",
    "Latitude",
    "Longitude",
    "Population",
    "Households",
    "Households_with_Internet",
    "Households_with_Computer",
    "Housholds_with_Electric_Lighting",
    "Workers",
    "sex_ratio",
    "literacy_rate",
    "internet_household_pct",
    "urban_household_pct",
]

missing_columns = [col for col in required_columns if col not in df.columns]
print("Missing columns:", missing_columns)

print(df[["Latitude", "Longitude"]].isna().sum())
print(df.duplicated(subset=["State", "District"]).sum())
```

For this sample dataset, you should see:

- no missing required columns
- no missing coordinates
- no duplicate state-district pairs
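
The checks above confirm that columns and coordinates exist, but not that the values are plausible. As a minimal sketch, the hypothetical helper below also verifies that the coordinate columns are numeric and flags rows that fall outside a rough bounding box for India. The function name and the bounds (latitude 6 to 38, longitude 68 to 98) are assumptions for illustration, not part of the sample dataset.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Rough bounding box for India; assumed values for illustration only.
LAT_RANGE = (6.0, 38.0)
LON_RANGE = (68.0, 98.0)


def find_coordinate_problems(data: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose coordinates are missing or outside the rough box."""
    for col in ("Latitude", "Longitude"):
        if not is_numeric_dtype(data[col]):
            raise TypeError(f"{col} must be numeric, got {data[col].dtype}")

    lat_ok = data["Latitude"].between(*LAT_RANGE)
    lon_ok = data["Longitude"].between(*LON_RANGE)
    # between() returns False for NaN, so missing values are flagged too.
    return data[~(lat_ok & lon_ok)]


# Tiny made-up sample: one valid row, one with swapped columns, one missing.
sample = pd.DataFrame(
    {
        "District": ["Pune", "Swapped", "Missing"],
        "Latitude": [18.52, 77.59, None],
        "Longitude": [73.86, 12.97, 80.27],
    }
)

problems = find_coordinate_problems(sample)
print(problems["District"].tolist())
```

Swapped latitude and longitude columns are a common cause of markers landing in the wrong place, and a bounding-box check catches them early.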

## 3. Create A Basic District Map

Plotly can draw points on map tiles with `px.scatter_mapbox`.

```python
fig = px.scatter_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    hover_name="District",
    hover_data=["State", "Population", "literacy_rate"],
    zoom=3.8,
    height=650,
    mapbox_style="carto-positron",
    title="District Sample Map",
)

fig.show()
```

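The `carto-positron` style works without a Mapbox token, which makes it convenient for notebooks and small teaching apps. Note that Plotly 5.24 and later deprecate `px.scatter_mapbox` in favor of `px.scatter_map`, which takes `map_style` instead of `mapbox_style`. The code in this guide uses the older name and may emit a deprecation warning on newer releases.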

## 4. Encode Population With Marker Size

A dashboard becomes more useful when visual properties carry meaning.

Here, marker size represents population:

```python
fig = px.scatter_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    size="Population",
    size_max=35,
    hover_name="District",
    hover_data=["State", "Population"],
    zoom=3.8,
    height=650,
    mapbox_style="carto-positron",
    title="Population By District",
)

fig.show()
```

Use `size_max` to prevent the largest districts from covering the whole map.
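
To build intuition for how `size` and `size_max` interact: Plotly Express scales bubble area, not diameter, with the data value, so the largest value is drawn at roughly `size_max` pixels across. The sketch below reproduces that relationship in plain Python. `approx_diameter` is a hypothetical helper, and the formula is a simplified model of area scaling rather than Plotly's exact rendering code.

```python
import math


def approx_diameter(value: float, max_value: float, size_max: float = 35) -> float:
    """Approximate marker diameter in pixels under area-based scaling."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    # Area is proportional to value, so diameter grows with the square root.
    return size_max * math.sqrt(max(value, 0) / max_value)


largest = 4_000_000  # hypothetical population of the largest district
for population in (4_000_000, 1_000_000, 250_000):
    print(population, approx_diameter(population, largest))
```

Under this model, a district with a quarter of the largest population is drawn at about half the diameter, which is why smaller districts remain visible next to very large ones.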

## 5. Add Color For A Second Metric

Now encode `literacy_rate` with color:

```python
fig = px.scatter_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    size="Population",
    color="literacy_rate",
    size_max=35,
    color_continuous_scale="Viridis",
    hover_name="District",
    hover_data={
        "State": True,
        "Population": ":,",
        "literacy_rate": ":.1f",
        "Latitude": False,
        "Longitude": False,
    },
    zoom=3.8,
    height=650,
    mapbox_style="carto-positron",
    title="Population Size And Literacy Rate Color",
)

fig.show()
```

This creates a two-metric visualization:

- larger bubbles mean larger population
- color shows literacy: on the `Viridis` scale, dark purple marks lower values and yellow marks higher values

## 6. Filter To One State

Dashboards usually need filters.

Filter the data to one state:

```python
state_name = "Maharashtra"
state_df = df[df["State"] == state_name]

fig = px.scatter_mapbox(
    state_df,
    lat="Latitude",
    lon="Longitude",
    size="Population",
    color="internet_household_pct",
    size_max=35,
    color_continuous_scale="Plasma",
    hover_name="District",
    hover_data=["State", "Population", "Households_with_Internet"],
    zoom=5.5,
    height=650,
    mapbox_style="carto-positron",
    title=f"Internet Access Sample Metrics In {state_name}",
)

fig.show()
```

The same charting logic works for a national view and a state-level view.

## 7. Build A Reusable Plot Function

Instead of rewriting the same Plotly call, create a function.

```python
def plot_district_map(data, primary_metric, secondary_metric, title, zoom):
    fig = px.scatter_mapbox(
        data,
        lat="Latitude",
        lon="Longitude",
        size=primary_metric,
        color=secondary_metric,
        size_max=35,
        color_continuous_scale="Viridis",
        hover_name="District",
        hover_data={
            "State": True,
            primary_metric: ":,",
            secondary_metric: ":.2f" if "pct" in secondary_metric or "rate" in secondary_metric else ":,",
            "Latitude": False,
            "Longitude": False,
        },
        zoom=zoom,
        height=700,
        mapbox_style="carto-positron",
        title=title,
    )

    fig.update_layout(
        margin={"r": 0, "t": 50, "l": 0, "b": 0},
        coloraxis_colorbar_title=secondary_metric.replace("_", " ").title(),
    )

    return fig
```

Test it:

```python
fig = plot_district_map(
    df,
    primary_metric="Population",
    secondary_metric="literacy_rate",
    title="District Population And Literacy Rate",
    zoom=3.8,
)

fig.show()
```

This function is the heart of the dashboard.
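
Because the function passes `primary_metric` straight to `size`, rows with missing or negative values in that column can make Plotly raise a validation error. One possible guard, sketched here with a hypothetical `prepare_for_map` helper, is to drop those rows before plotting. Whether to drop or impute such rows is a design decision this sketch does not settle.

```python
import pandas as pd


def prepare_for_map(data: pd.DataFrame, size_col: str) -> pd.DataFrame:
    """Keep only rows that have coordinates and a non-negative size value."""
    cleaned = data.dropna(subset=["Latitude", "Longitude", size_col])
    return cleaned[cleaned[size_col] >= 0]


# Made-up rows: one valid, one with a missing size, one with a negative size.
sample = pd.DataFrame(
    {
        "District": ["A", "B", "C"],
        "Latitude": [19.0, 21.1, 18.5],
        "Longitude": [72.8, 79.1, 73.9],
        "Population": [1200000, None, -5],
    }
)

plot_ready = prepare_for_map(sample, "Population")
print(plot_ready["District"].tolist())
```

The cleaned frame can then be passed to `plot_district_map()` in place of the raw data.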

## 8. Choose Metrics Programmatically

Create a list of numeric columns users can select.

```python
protected_columns = {"Latitude", "Longitude"}

numeric_metrics = [
    col
    for col in df.select_dtypes(include="number").columns
    if col not in protected_columns
]

print(numeric_metrics)
```

Example output:

```text
['Population', 'Households', 'Households_with_Internet', 'Households_with_Computer', ...]
```

This lets the dashboard remain flexible even if you add more metrics later.
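
Raw column names such as `internet_household_pct` are not very readable in a dropdown. A small, hypothetical `metric_label` helper can turn them into display labels. In Streamlit, a function like this could be passed to `st.selectbox` through its `format_func` parameter, so the widget shows the label while still returning the underlying column name. The override text and the `(%)` suffix are assumptions for illustration.

```python
# Manual overrides for names that simple cleanup rules cannot fix (assumed labels).
LABEL_OVERRIDES = {
    "Housholds_with_Electric_Lighting": "Households With Electric Lighting",
}


def metric_label(column: str) -> str:
    """Convert a column name into a human-readable label."""
    if column in LABEL_OVERRIDES:
        return LABEL_OVERRIDES[column]
    label = column.replace("_", " ").title()
    # Spell out the "pct" suffix used in the sample dataset.
    if label.endswith(" Pct"):
        label = label[: -len(" Pct")] + " (%)"
    return label


for name in ("literacy_rate", "internet_household_pct", "Housholds_with_Electric_Lighting"):
    print(metric_label(name))
```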

## 9. Build The Streamlit App

Create a file named `app.py`:

```python
import pandas as pd
import plotly.express as px
import streamlit as st


@st.cache_data
def load_data():
    return pd.read_csv("plotly_india_district_sample.csv")


def plot_district_map(data, primary_metric, secondary_metric, title, zoom):
    fig = px.scatter_mapbox(
        data,
        lat="Latitude",
        lon="Longitude",
        size=primary_metric,
        color=secondary_metric,
        size_max=35,
        color_continuous_scale="Viridis",
        hover_name="District",
        hover_data={
            "State": True,
            primary_metric: ":,",
            secondary_metric: ":.2f" if "pct" in secondary_metric or "rate" in secondary_metric else ":,",
            "Latitude": False,
            "Longitude": False,
        },
        zoom=zoom,
        height=700,
        mapbox_style="carto-positron",
        title=title,
    )

    fig.update_layout(margin={"r": 0, "t": 50, "l": 0, "b": 0})
    return fig


# Configure the page before any other Streamlit call, including cached loaders.
st.set_page_config(page_title="India District Metrics Dashboard", layout="wide")

df = load_data()

st.title("India District Metrics Dashboard")
st.caption("Original practice dataset with approximate locations and synthetic indicators.")

states = ["Overall India"] + sorted(df["State"].unique())

numeric_metrics = [
    col
    for col in df.select_dtypes(include="number").columns
    if col not in {"Latitude", "Longitude"}
]

selected_state = st.sidebar.selectbox("Select a state", states)
primary_metric = st.sidebar.selectbox("Marker size metric", numeric_metrics, index=numeric_metrics.index("Population"))
secondary_metric = st.sidebar.selectbox("Marker color metric", numeric_metrics, index=numeric_metrics.index("literacy_rate"))

if selected_state == "Overall India":
    filtered = df
    zoom = 3.8
    title = f"All Districts: Size = {primary_metric}, Color = {secondary_metric}"
else:
    filtered = df[df["State"] == selected_state]
    zoom = 5.2
    title = f"{selected_state}: Size = {primary_metric}, Color = {secondary_metric}"

left, right = st.columns([3, 1])

with left:
    fig = plot_district_map(filtered, primary_metric, secondary_metric, title, zoom)
    st.plotly_chart(fig, use_container_width=True)

with right:
    st.subheader("Selected Data")
    st.metric("Districts", len(filtered))
    st.metric("Total Population", f"{int(filtered['Population'].sum()):,}")
    st.metric("Avg Literacy Rate", f"{filtered['literacy_rate'].mean():.1f}%")
    st.metric("Avg Internet Household %", f"{filtered['internet_household_pct'].mean():.1f}%")

st.dataframe(
    filtered[["State", "District", primary_metric, secondary_metric]].sort_values(primary_metric, ascending=False),
    use_container_width=True,
)
```

Run the app:

```bash
streamlit run app.py
```

## 10. How The Dashboard Works

The dashboard has three layers:

1. Data loading
2. User controls
3. Plot rendering

The data loading layer uses `@st.cache_data` so Streamlit does not reread the CSV on every interaction.

The sidebar controls change:

- which rows are displayed
- which column controls marker size
- which column controls marker color

The Plotly function receives those choices and returns a new figure.
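
Keeping the filter logic in a plain function makes it testable without launching Streamlit. The sketch below mirrors the `if`/`else` branch from `app.py`. `resolve_view` is a hypothetical name, the zoom values are the ones used earlier in this guide, and the error for an unknown state is an added assumption.

```python
import pandas as pd

ALL_STATES = "Overall India"


def resolve_view(data: pd.DataFrame, selected_state: str):
    """Return the rows to plot and a zoom level for the chosen state."""
    if selected_state == ALL_STATES:
        return data, 3.8
    subset = data[data["State"] == selected_state]
    if subset.empty:
        raise ValueError(f"No districts found for state: {selected_state!r}")
    return subset, 5.2


# Made-up rows for illustration.
sample = pd.DataFrame(
    {
        "State": ["Maharashtra", "Maharashtra", "Kerala"],
        "District": ["Pune", "Nagpur", "Ernakulam"],
    }
)

rows, zoom = resolve_view(sample, "Maharashtra")
print(len(rows), zoom)
```

In the app, the returned frame and zoom would feed directly into `plot_district_map()`.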

## 11. Improve The Hover Tooltip

Hover labels are where interactive charts become useful.

Try adding more context:

```python
fig = px.scatter_mapbox(
    filtered,
    lat="Latitude",
    lon="Longitude",
    size=primary_metric,
    color=secondary_metric,
    hover_name="District",
    hover_data={
        "State": True,
        "Population": ":,",
        "Households": ":,",
        "literacy_rate": ":.1f",
        "internet_household_pct": ":.1f",
        "urban_household_pct": ":.1f",
        "Latitude": False,
        "Longitude": False,
    },
    mapbox_style="carto-positron",
)
```

Hide coordinates unless they are analytically useful. Most users care more about the district name and metrics than raw latitude and longitude.
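
Writing the `hover_data` dictionary by hand gets repetitive as metrics are added. One option, sketched below, is to generate it from a list of columns. `build_hover_data` is a hypothetical helper, and its formatting rule (one decimal for names ending in `_pct` or `_rate`, thousands separators otherwise) is an assumption based on the sample column names.

```python
def build_hover_data(metrics, hidden=("Latitude", "Longitude")):
    """Build a Plotly Express hover_data dict from column names."""
    hover = {"State": True}
    for col in metrics:
        is_ratio = col.endswith(("_pct", "_rate"))
        hover[col] = ":.1f" if is_ratio else ":,"
    # False tells Plotly Express to leave the column out of the tooltip.
    for col in hidden:
        hover[col] = False
    return hover


hover = build_hover_data(["Population", "Households", "literacy_rate"])
print(hover)
```

The resulting dictionary can be passed as `hover_data=hover` in the map call.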

## 12. Add A Ranking Chart

A map answers "where".

A bar chart answers "who is highest or lowest".

Add a ranking chart below the map:

```python
top_districts = filtered.nlargest(10, secondary_metric)

bar_fig = px.bar(
    top_districts,
    x=secondary_metric,
    y="District",
    color=secondary_metric,
    orientation="h",
    title=f"Top Districts By {secondary_metric}",
)

bar_fig.update_layout(yaxis={"categoryorder": "total ascending"})
st.plotly_chart(bar_fig, use_container_width=True)
```

This turns the app from a map-only demo into a small analytical dashboard.
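
The snippet above covers the highest values only. To also answer the "lowest" half of the question, the selection can be wrapped in a small helper. `rank_districts` is a hypothetical function built on the pandas `nlargest` and `nsmallest` methods.

```python
import pandas as pd


def rank_districts(data: pd.DataFrame, metric: str, n: int = 10, lowest: bool = False):
    """Return the n districts with the highest (or lowest) metric values."""
    picker = data.nsmallest if lowest else data.nlargest
    return picker(n, metric)[["District", metric]]


# Made-up values for illustration.
sample = pd.DataFrame(
    {
        "District": ["A", "B", "C", "D"],
        "literacy_rate": [71.2, 88.5, 64.9, 79.3],
    }
)

top_two = rank_districts(sample, "literacy_rate", n=2)
bottom_two = rank_districts(sample, "literacy_rate", n=2, lowest=True)
print(top_two)
print(bottom_two)
```

A sidebar toggle could switch the `lowest` flag, reusing the same bar chart code for both views.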

## 13. Common Problems

### The map does not display

Check these items:

- `Latitude` and `Longitude` are numeric
- the values are not missing
- the map style is token-free, such as `carto-positron`

### Markers are too large

Lower `size_max`:

```python
size_max=20
```
**Explanation**

- `size_max` is a keyword argument of `px.scatter_mapbox`, not a standalone variable, so pass it inside the plotting call
- It sets the approximate marker size, in pixels, used for the largest value in the `size` column
- All other markers scale down relative to that maximum, so lowering it shrinks every bubble
- A smaller value helps on dense maps where neighboring districts would otherwise overlap


### The app reruns too often

Streamlit reruns the script when a widget changes. Use `@st.cache_data` for loading data and keep expensive transformations inside cached functions.

### The chart has too many hover fields

Use `hover_data` to hide columns:

```python
hover_data={"Latitude": False, "Longitude": False}
```
**Explanation**

- `hover_data` accepts a dictionary that controls which columns appear in the tooltip
- Setting `"Latitude"` and `"Longitude"` to `False` hides them, even though they are still used to position the markers
- Setting a column to `True`, or to a format string such as `":,"`, shows it
- Hiding low-value fields keeps the tooltip short and focused on the district name and metrics


## 14. Practice Tasks

Try these improvements:

1. Add a dropdown that switches the color scale between `Viridis`, `Plasma`, and `Turbo`.
2. Add a checkbox to show only districts above a selected literacy rate.
3. Add a slider for minimum population.
4. Create a second tab for ranking charts.
5. Add a download button for the filtered data.

Example filter:

```python
minimum_population = st.sidebar.slider(
    "Minimum population",
    min_value=int(df["Population"].min()),
    max_value=int(df["Population"].max()),
    value=int(df["Population"].min()),
)

filtered = filtered[filtered["Population"] >= minimum_population]
```
**Explanation**

- Creates an interactive slider widget in Streamlit sidebar to select minimum population threshold
- Sets slider range from the minimum to maximum population values in the dataset
- Initializes slider to the minimum population value by default
- Filters the dataset to include only rows where population meets or exceeds the selected minimum threshold
- Updates the filtered dataframe based on user selection for downstream analysis or visualization
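
Task 5 needs the filtered table as bytes. As a minimal sketch, the hypothetical `to_csv_bytes` helper below handles the conversion. The Streamlit call is shown only as a comment because it has to run inside the app, and the file name is an assumption.

```python
import pandas as pd


def to_csv_bytes(data: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 encoded CSV without the index."""
    return data.to_csv(index=False).encode("utf-8")


# Made-up rows for illustration.
sample = pd.DataFrame({"District": ["A", "B"], "Population": [1200, 900]})
payload = to_csv_bytes(sample)
print(payload.decode("utf-8"))

# Inside app.py, the bytes could be offered for download like this:
# st.download_button(
#     "Download filtered data",
#     data=to_csv_bytes(filtered),
#     file_name="filtered_districts.csv",  # hypothetical file name
#     mime="text/csv",
# )
```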


## Final Takeaway

This project teaches a practical visualization workflow:

- clean and validate tabular data
- map numeric columns to visual encodings
- use hover labels for context
- wrap Plotly logic in reusable functions
- expose filters through Streamlit controls

That pattern is reusable for many real dashboards: public datasets, operations monitoring, sales territories, service locations, logistics, and education analytics.