# Create an Interactive District Dashboard for India with Plotly

- URL: https://madhudadi.in/blog/posts/interactive-india-district-dashboard-with-plotly-streamlit
- Published: 2026-06-14
- Tags: Plotly, python, Streamlit
- Read time: 35 min
- Difficulty: intermediate

> Build an original Plotly project that maps district-level indicators, lets users choose metrics, filters by state, and turns the analysis into a small Streamlit dashboard.

# Build an Interactive India District Dashboard with Plotly and Streamlit

Plotly is useful when a static chart is not enough. In this project, you will build an interactive district-level dashboard for India-style geographic data.

The dashboard lets a user:

- choose a state or view all districts
- choose one metric for marker size
- choose another metric for marker color
- inspect each district by hovering over the map
- run the result as a Streamlit web app

This is an original teaching project. The sample CSV included with this guide uses approximate locations and synthetic metrics for practice. It is not copied from a course notebook or proprietary dataset.

## Files Used In This Guide

Use this CSV file:

- `plotly_india_district_sample.csv`

Place it in the same folder as your notebook or script. If you keep it in a `data/` folder, load it like this:

```python
df = pd.read_csv("data/plotly_india_district_sample.csv")
```

## What You Will Build

By the end, you will have:

- a cleaned district metrics table
- a reusable `plot_district_map()` function
- an all-India interactive bubble map
- a state-filtered map
- a Streamlit dashboard with sidebar controls

The final app will use Plotly Express and Streamlit:

```bash
pip install pandas plotly streamlit
```

## 1. Load The Dataset

Start with Pandas and Plotly Express:

```python
import pandas as pd
import plotly.express as px
```

Load the CSV:

```python
df = pd.read_csv("plotly_india_district_sample.csv")

print(df.head())
print(df.shape)
```

Expected columns include:

- `State`
- `District`
- `Latitude`
- `Longitude`
- `Population`
- `Households`
- `Households_with_Internet`
- `Households_with_Computer`
- `Housholds_with_Electric_Lighting`
- `Workers`
- `sex_ratio`
- `literacy_rate`
- `internet_household_pct`
- `urban_household_pct`

The column `Housholds_with_Electric_Lighting` keeps the same misspelling that often appears in raw public data extracts. In a real project, you may rename it. In this tutorial, we keep it visible so you learn how to handle imperfect source schemas.

## 2. Validate The Data

Before plotting, check whether the map has valid coordinates and numeric metrics.

```python
required_columns = [
    "State",
    "District",
    "Latitude",
    "Longitude",
    "Population",
    "Households",
    "Households_with_Internet",
    "Households_with_Computer",
    "Housholds_with_Electric_Lighting",
    "Workers",
    "sex_ratio",
    "literacy_rate",
    "internet_household_pct",
    "urban_household_pct",
]

# Columns the tutorial expects but the CSV does not contain
missing_columns = [col for col in required_columns if col not in df.columns]
print("Missing columns:", missing_columns)

# Count of missing coordinates per column
print(df[["Latitude", "Longitude"]].isna().sum())

# Number of repeated state-district pairs
print(df.duplicated(subset=["State", "District"]).sum())
```

For this sample dataset, you should see:

- no missing required columns
- no missing coordinates
- no duplicate state-district pairs

## 3. Create A Basic District Map

Plotly can draw points on map tiles with `px.scatter_mapbox`.

```python
fig = px.scatter_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    hover_name="District",
    hover_data=["State", "Population", "literacy_rate"],
    zoom=3.8,
    height=650,
    mapbox_style="carto-positron",
    title="District Sample Map",
)
fig.show()
```

The `carto-positron` style works without a Mapbox token, which makes it convenient for notebooks and small teaching apps.
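Before adding more encodings, the coordinate checks from section 2 can be bundled into one helper so that bad rows are easy to inspect. The sketch below is only illustrative: the function name `find_invalid_coordinates`, the latitude and longitude bounds, and the three made-up rows are assumptions, not part of the tutorial's CSV.

```python
import pandas as pd


def find_invalid_coordinates(data: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose coordinates are missing, non-numeric, or out of range."""
    # Coerce to numbers so text values become NaN instead of raising
    lat = pd.to_numeric(data["Latitude"], errors="coerce")
    lon = pd.to_numeric(data["Longitude"], errors="coerce")
    # NaN never falls "between" two bounds, so missing values count as invalid
    valid = lat.between(-90, 90) & lon.between(-180, 180)
    return data[~valid]


# Hypothetical rows: one valid, one missing latitude, one out-of-range latitude
sample = pd.DataFrame(
    {
        "District": ["A", "B", "C"],
        "Latitude": [19.0, None, 120.0],
        "Longitude": [72.8, 77.2, 80.0],
    }
)

bad_rows = find_invalid_coordinates(sample)
print(bad_rows["District"].tolist())
```

On the real dataset you would call `find_invalid_coordinates(df)` and expect an empty result before plotting.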
## 4. Encode Population With Marker Size

A dashboard becomes more useful when visual properties carry meaning. Here, marker size represents population:

```python
fig = px.scatter_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    size="Population",
    size_max=35,
    hover_name="District",
    hover_data=["State", "Population"],
    zoom=3.8,
    height=650,
    mapbox_style="carto-positron",
    title="Population By District",
)
fig.show()
```

Use `size_max` to prevent the largest districts from covering the whole map.

## 5. Add Color For A Second Metric

Now encode `literacy_rate` with color:

```python
fig = px.scatter_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    size="Population",
    color="literacy_rate",
    size_max=35,
    color_continuous_scale="Viridis",
    hover_name="District",
    hover_data={
        "State": True,
        "Population": ":,",       # thousands separator
        "literacy_rate": ":.1f",  # one decimal place
        "Latitude": False,        # hide raw coordinates
        "Longitude": False,
    },
    zoom=3.8,
    height=650,
    mapbox_style="carto-positron",
    title="Population Size And Literacy Rate Color",
)
fig.show()
```

This creates a two-metric visualization:

- larger bubbles mean larger population
- brighter or darker colors show literacy differences

## 6. Filter To One State

Dashboards usually need filters. Filter the data to one state:

```python
state_name = "Maharashtra"
state_df = df[df["State"] == state_name]

fig = px.scatter_mapbox(
    state_df,
    lat="Latitude",
    lon="Longitude",
    size="Population",
    color="internet_household_pct",
    size_max=35,
    color_continuous_scale="Plasma",
    hover_name="District",
    hover_data=["State", "Population", "Households_with_Internet"],
    zoom=5.5,
    height=650,
    mapbox_style="carto-positron",
    title=f"Internet Access Sample Metrics In {state_name}",
)
fig.show()
```

The same charting logic works for a national view and a state-level view.

## 7. Build A Reusable Plot Function

Instead of rewriting the same Plotly call, create a function.
```python
def plot_district_map(data, primary_metric, secondary_metric, title, zoom):
    fig = px.scatter_mapbox(
        data,
        lat="Latitude",
        lon="Longitude",
        size=primary_metric,
        color=secondary_metric,
        size_max=35,
        color_continuous_scale="Viridis",
        hover_name="District",
        hover_data={
            "State": True,
            primary_metric: ":,",
            # Percentages and rates get two decimals; counts get separators
            secondary_metric: (
                ":.2f"
                if "pct" in secondary_metric or "rate" in secondary_metric
                else ":,"
            ),
            "Latitude": False,
            "Longitude": False,
        },
        zoom=zoom,
        height=700,
        mapbox_style="carto-positron",
        title=title,
    )
    fig.update_layout(
        margin={"r": 0, "t": 50, "l": 0, "b": 0},
        coloraxis_colorbar_title=secondary_metric.replace("_", " ").title(),
    )
    return fig
```

Test it:

```python
fig = plot_district_map(
    df,
    primary_metric="Population",
    secondary_metric="literacy_rate",
    title="District Population And Literacy Rate",
    zoom=3.8,
)
fig.show()
```

This function is the heart of the dashboard.

## 8. Choose Metrics Programmatically

Create a list of numeric columns users can select.

```python
protected_columns = {"Latitude", "Longitude"}

numeric_metrics = [
    col
    for col in df.select_dtypes(include="number").columns
    if col not in protected_columns
]
print(numeric_metrics)
```

Example output, assuming the CSV columns appear in the order listed in section 1:

```text
['Population', 'Households', 'Households_with_Internet', 'Households_with_Computer', ...]
```

This lets the dashboard remain flexible even if you add more metrics later.

## 9. Build The Streamlit App

Create a file named `app.py`:

```python
import pandas as pd
import plotly.express as px
import streamlit as st

# Page config should be the first Streamlit call in the script
st.set_page_config(page_title="India District Metrics Dashboard", layout="wide")


@st.cache_data
def load_data():
    return pd.read_csv("plotly_india_district_sample.csv")


def plot_district_map(data, primary_metric, secondary_metric, title, zoom):
    fig = px.scatter_mapbox(
        data,
        lat="Latitude",
        lon="Longitude",
        size=primary_metric,
        color=secondary_metric,
        size_max=35,
        color_continuous_scale="Viridis",
        hover_name="District",
        hover_data={
            "State": True,
            primary_metric: ":,",
            secondary_metric: (
                ":.2f"
                if "pct" in secondary_metric or "rate" in secondary_metric
                else ":,"
            ),
            "Latitude": False,
            "Longitude": False,
        },
        zoom=zoom,
        height=700,
        mapbox_style="carto-positron",
        title=title,
    )
    fig.update_layout(margin={"r": 0, "t": 50, "l": 0, "b": 0})
    return fig


df = load_data()

st.title("India District Metrics Dashboard")
st.caption("Original practice dataset with approximate locations and synthetic indicators.")

states = ["Overall India"] + sorted(df["State"].unique())
numeric_metrics = [
    col
    for col in df.select_dtypes(include="number").columns
    if col not in {"Latitude", "Longitude"}
]

selected_state = st.sidebar.selectbox("Select a state", states)
primary_metric = st.sidebar.selectbox(
    "Marker size metric",
    numeric_metrics,
    index=numeric_metrics.index("Population"),
)
secondary_metric = st.sidebar.selectbox(
    "Marker color metric",
    numeric_metrics,
    index=numeric_metrics.index("literacy_rate"),
)

if selected_state == "Overall India":
    filtered = df
    zoom = 3.8
    title = f"All Districts: Size = {primary_metric}, Color = {secondary_metric}"
else:
    filtered = df[df["State"] == selected_state]
    zoom = 5.2
    title = f"{selected_state}: Size = {primary_metric}, Color = {secondary_metric}"

left, right = st.columns([3, 1])

with left:
    fig = plot_district_map(filtered, primary_metric, secondary_metric, title, zoom)
    st.plotly_chart(fig, use_container_width=True)

with right:
    st.subheader("Selected Data")
    st.metric("Districts", len(filtered))
    st.metric("Total Population", f"{int(filtered['Population'].sum()):,}")
    st.metric("Avg Literacy Rate", f"{filtered['literacy_rate'].mean():.1f}%")
    st.metric("Avg Internet Household %", f"{filtered['internet_household_pct'].mean():.1f}%")

    # Drop repeats so sorting still works when both metrics are the same column
    table_columns = list(dict.fromkeys(["State", "District", primary_metric, secondary_metric]))
    st.dataframe(
        filtered[table_columns].sort_values(primary_metric, ascending=False),
        use_container_width=True,
    )
```

Run the app:

```bash
streamlit run app.py
```

## 10. How The Dashboard Works

The dashboard has three layers:

1. Data loading
2. User controls
3. Plot rendering

The data loading layer uses `@st.cache_data` so Streamlit does not reread the CSV on every interaction.

The sidebar controls change:

- which rows are displayed
- which column controls marker size
- which column controls marker color

The Plotly function receives those choices and returns a new figure.

## 11. Improve The Hover Tooltip

Hover labels are where interactive charts become useful. Try adding more context:

```python
fig = px.scatter_mapbox(
    filtered,
    lat="Latitude",
    lon="Longitude",
    size=primary_metric,
    color=secondary_metric,
    hover_name="District",
    hover_data={
        "State": True,
        "Population": ":,",
        "Households": ":,",
        "literacy_rate": ":.1f",
        "internet_household_pct": ":.1f",
        "urban_household_pct": ":.1f",
        "Latitude": False,
        "Longitude": False,
    },
    mapbox_style="carto-positron",
)
```

Hide coordinates unless they are analytically useful. Most users care more about the district name and metrics than raw latitude and longitude.

## 12. Add A Ranking Chart

A map answers "where". A bar chart answers "who is highest or lowest".
Add a ranking chart below the map:

```python
# Ten districts with the highest value of the color metric
top_districts = filtered.nlargest(10, secondary_metric)

bar_fig = px.bar(
    top_districts,
    x=secondary_metric,
    y="District",
    color=secondary_metric,
    orientation="h",
    title=f"Top Districts By {secondary_metric}",
)
# Order the bars by value so the largest sits at the top
bar_fig.update_layout(yaxis={"categoryorder": "total ascending"})

st.plotly_chart(bar_fig, use_container_width=True)
```

This turns the app from a map-only demo into a small analytical dashboard.

## 13. Common Problems

### The map does not display

Check these items:

- `Latitude` and `Longitude` are numeric
- the values are not missing
- the map style is token-free, such as `carto-positron`

### Markers are too large

Lower `size_max`:

```python
size_max=20
```

**Explanation**

- `size_max` is a keyword argument of `px.scatter_mapbox`, not a standalone variable. It sets the marker size used for the largest value in the `size` column.
- All other markers are scaled relative to that maximum, so lowering it shrinks every bubble on the map.
- The examples in this guide pass `size_max=35`; `20` is the Plotly Express default.
- Change the value inside the plotting call, for example in `plot_district_map()`, rather than assigning it on its own line.

### The app reruns too often

Streamlit reruns the script when a widget changes. Use `@st.cache_data` for loading data and keep expensive transformations inside cached functions.
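As one way to apply that advice, an aggregation can live in its own function and be cached the same way as the data load. This is a minimal sketch under stated assumptions: `summarize_by_state` is a hypothetical helper that does not appear in `app.py`, and the three made-up rows stand in for the tutorial CSV. The Streamlit decorator is left as a comment so the snippet runs with pandas alone.

```python
import pandas as pd


# In app.py you would put @st.cache_data on the line above this function
def summarize_by_state(data: pd.DataFrame) -> pd.DataFrame:
    """Aggregate district rows into one summary row per state."""
    return (
        data.groupby("State", as_index=False)
        .agg(
            districts=("District", "count"),
            total_population=("Population", "sum"),
            avg_literacy_rate=("literacy_rate", "mean"),
        )
        .sort_values("total_population", ascending=False)
        .reset_index(drop=True)
    )


# Hypothetical rows using the tutorial's column names
sample = pd.DataFrame(
    {
        "State": ["X", "X", "Y"],
        "District": ["a", "b", "c"],
        "Population": [100, 200, 50],
        "literacy_rate": [70.0, 80.0, 90.0],
    }
)

state_summary = summarize_by_state(sample)
print(state_summary)
```

With the decorator in place, Streamlit would reuse the stored result on reruns as long as the input DataFrame is unchanged, instead of recomputing the aggregation on every widget interaction.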
### The chart has too many hover fields

Use `hover_data` to hide columns:

```python
hover_data={"Latitude": False, "Longitude": False}
```

**Explanation**

- `hover_data` accepts a dictionary that controls which columns appear in the tooltip of a Plotly Express chart.
- Setting `"Latitude"` and `"Longitude"` to `False` removes the coordinates from the hover label. Plotly Express would otherwise show them because they are used as `lat` and `lon`.
- Columns set to `True`, or to a format string such as `":,"`, stay visible.
- Pass the dictionary as the `hover_data` argument of the plotting call, as in `plot_district_map()`.

## 14. Practice Tasks

Try these improvements:

1. Add a dropdown that switches the color scale between `Viridis`, `Plasma`, and `Turbo`.
2. Add a checkbox to show only districts above a selected literacy rate.
3. Add a slider for minimum population.
4. Create a second tab for ranking charts.
5. Add a download button for the filtered data.
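As a possible starting point for task 5, the filtered table first has to be turned into bytes that a download widget can serve. The sketch below assumes a hypothetical helper named `to_csv_bytes` and uses made-up rows in place of the app's `filtered` DataFrame. The Streamlit call is shown only as a comment so the snippet runs with pandas alone.

```python
import pandas as pd


def to_csv_bytes(data: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 encoded CSV without the index column."""
    return data.to_csv(index=False).encode("utf-8")


# Stand-in for the `filtered` DataFrame from app.py (hypothetical rows)
filtered_sample = pd.DataFrame(
    {
        "State": ["X", "Y"],
        "District": ["a", "b"],
        "Population": [100, 200],
    }
)

csv_bytes = to_csv_bytes(filtered_sample)
print(csv_bytes.decode("utf-8"))

# In app.py, the bytes could then be handed to Streamlit, for example:
# st.download_button(
#     "Download filtered data",
#     data=csv_bytes,
#     file_name="filtered_districts.csv",  # assumed file name
#     mime="text/csv",
# )
```

Dropping the index keeps the downloaded file limited to the columns the user sees in the dashboard table.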
Example filter:

```python
minimum_population = st.sidebar.slider(
    "Minimum population",
    min_value=int(df["Population"].min()),
    max_value=int(df["Population"].max()),
    value=int(df["Population"].min()),
)

filtered = filtered[filtered["Population"] >= minimum_population]
```

**Explanation**

- Creates a slider in the Streamlit sidebar for choosing a minimum population threshold
- Sets the slider range from the smallest to the largest population value in the dataset
- Starts the slider at the minimum population, so no rows are filtered out by default
- Keeps only the rows whose population meets or exceeds the selected threshold
- Reassigns `filtered`, so the map and table that follow use the reduced data

## Final Takeaway

This project teaches a practical visualization workflow:

- clean and validate tabular data
- map numeric columns to visual encodings
- use hover labels for context
- wrap Plotly logic in reusable functions
- expose filters through Streamlit controls

That pattern is reusable for many real dashboards: public datasets, operations monitoring, sales territories, service locations, logistics, and education analytics.