# Creating a Dynamic Knowledge Graph from Reading History

- URL: https://madhudadi.in/blog/posts/creating-a-dynamic-knowledge-graph-from-reading-history
- Published: 2026-06-15
- Tags: Architecture, Next.js, Production, python
- Read time: 14 min
- Difficulty: advanced

> How D3.js renders a force-directed graph of reading history, how topics are extracted from completed posts, how edge weights are computed from co-occurrence, and the interactive exploration UI.

# Building a Knowledge Graph From Reading History

The Brain page includes a knowledge graph: a force-directed visualization of the topics a user has studied, connected by how often they appear together. The backend supplies the data, and the layout and rendering happen entirely on the frontend with D3.js.

---

## Data Model on the Backend

The backend provides the graph data as a simple JSON structure:

```python
@router.get("/gamification/knowledge-graph")
async def get_knowledge_graph(user: User = Depends(get_current_user)):
    # One row per completed post; each row is expected to carry that post's tags.
    completed_posts = await db.execute(
        select(Post.tags).join(
            UserProgress, UserProgress.post_id == Post.id
        ).where(
            UserProgress.user_id == user.id,
            UserProgress.completed_at.isnot(None),
        )
    )

    tag_cooccurrence: dict[tuple[str, str], int] = {}
    tag_counts: dict[str, int] = {}

    for (tags,) in completed_posts:
        tag_names = [t.name for t in tags]

        # Node size: number of completed posts carrying each tag.
        for tag in tag_names:
            tag_counts[tag] = tag_counts.get(tag, 0) + 1

        # Edge weight: one count per unordered tag pair on the same post.
        # Sorting the pair makes (a, b) and (b, a) share a single key.
        for i in range(len(tag_names)):
            for j in range(i + 1, len(tag_names)):
                pair = tuple(sorted([tag_names[i], tag_names[j]]))
                tag_cooccurrence[pair] = tag_cooccurrence.get(pair, 0) + 1

    nodes = [{"id": tag, "size": count} for tag, count in tag_counts.items()]
    edges = [
        {"source": a, "target": b, "weight": weight}
        for (a, b), weight in tag_cooccurrence.items()
    ]

    return {"nodes": nodes, "edges": edges}
```

The response contains:

- **Nodes**: one per tag the user has encountered, sized by the number of completed posts carrying that tag
- **Edges**: one per pair of tags that appear together on at least one completed post, weighted by the number of such posts

---

## D3.js Force-Directed Layout
The frontend renders the graph using D3.js:

```tsx
import { useEffect, useRef } from "react";
import * as d3 from "d3";

interface GraphData {
  nodes: Array<{ id: string; size: number }>;
  edges: Array<{ source: string; target: string; weight: number }>;
}

function KnowledgeGraph({ data }: { data: GraphData }) {
  const svgRef = useRef<SVGSVGElement>(null);

  useEffect(() => {
    if (!svgRef.current || !data.nodes.length) return;

    const width = 800;
    const height = 600;

    // Clear the previous render before drawing again.
    const svg = d3.select(svgRef.current);
    svg.selectAll("*").remove();

    // Note: the simulation writes x/y onto the node objects and replaces
    // each edge's source/target ids with references to those nodes.
    const simulation = d3.forceSimulation(data.nodes)
      .force("link", d3.forceLink(data.edges).id((d: any) => d.id).distance(100))
      .force("charge", d3.forceManyBody().strength(-200))
      .force("center", d3.forceCenter(width / 2, height / 2))
      .force("collision", d3.forceCollide().radius((d: any) => d.size * 3 + 10));

    const link = svg.append("g")
      .selectAll("line")
      .data(data.edges)
      .join("line")
      .attr("stroke", "rgba(255,255,255,0.06)")
      .attr("stroke-width", (d) => Math.sqrt(d.weight));

    const node = svg.append("g")
      .selectAll("circle")
      .data(data.nodes)
      .join("circle")
      .attr("r", (d) => d.size * 3 + 8)
      .attr("fill", "rgba(251, 191, 36, 0.15)")
      .attr("stroke", "rgba(251, 191, 36, 0.4)")
      .attr("stroke-width", 1.5)
      .call(drag(simulation));

    const label = svg.append("g")
      .selectAll("text")
      .data(data.nodes)
      .join("text")
      .text((d) => d.id)
      .attr("font-size", 10)
      .attr("font-weight", "bold")
      .attr("fill", "rgba(255,255,255,0.7)")
      .attr("text-anchor", "middle")
      .attr("dy", 4);

    // Copy the simulated positions onto the SVG elements on every tick.
    simulation.on("tick", () => {
      link.attr("x1", (d: any) => d.source.x)
        .attr("y1", (d: any) => d.source.y)
        .attr("x2", (d: any) => d.target.x)
        .attr("y2", (d: any) => d.target.y);
      node.attr("cx", (d: any) => d.x).attr("cy", (d: any) => d.y);
      label.attr("x", (d: any) => d.x).attr("y", (d: any) => d.y);
    });

    // Stop the timer when the data changes or the component unmounts.
    return () => {
      simulation.stop();
    };
  }, [data]);

  return (
    <svg ref={svgRef} width={800} height={600} />
  );
}
```

The simulation parameters:

- `forceLink`: pulls connected nodes toward a fixed target distance of 100; `.id()` resolves the string `source` and `target` values to node objects
- `forceManyBody`: with a negative strength (-200), makes every node repel every other node, which spreads the graph out
- `forceCenter`: keeps the graph's center of mass at the middle of the SVG
- `forceCollide`: prevents node circles from overlapping, using a radius slightly larger than the drawn circle

---

## Interactive Features

The graph supports drag, zoom, and click:

```typescript
function drag(simulation: d3.Simulation<any, any>) {
  return d3.drag<SVGCircleElement, any>()
    .on("start", (event, d: any) => {
      // Reheat the simulation so neighbours react while a node is dragged.
      if (!event.active) simulation.alphaTarget(0.3).restart();
      d.fx = d.x;
      d.fy = d.y;
    })
    .on("drag", (event, d: any) => {
      // Pin the node to the pointer position.
      d.fx = event.x;
      d.fy = event.y;
    })
    .on("end", (event, d: any) => {
      // Let the simulation cool down and release the pinned node.
      if (!event.active) simulation.alphaTarget(0);
      d.fx = null;
      d.fy = null;
    });
}

// Zoom handler. `container` is a <g> that wraps the link, node, and label
// groups, so a single transform moves the whole graph.
svg.call(d3.zoom<SVGSVGElement, unknown>()
  .scaleExtent([0.5, 3])
  .on("zoom", (event) => {
    container.attr("transform", event.transform);
  })
);

// Click handler: open the tag's page.
node.on("click", (event, d: any) => {
  router.push(`/tags/${d.id.toLowerCase()}`);
});
```

---

## Node Sizing Logic

Node size is determined by the number of completed posts with that tag:

```typescript
const sizeScale = d3.scaleSqrt()
  .domain([1, d3.max(data.nodes, (d) => d.size) ?? 1])
  .range([8, 28]);

// Usage:
// .attr("r", (d) => sizeScale(d.size))
```

The scale is square-root rather than linear so that a single dominant tag does not dwarf every other node. With a zero-based mapping, a user who read 10 Python posts and 2 FastAPI posts would get a Python radius about 2.2x (√5) the FastAPI radius instead of 5x. The `[8, 28]` output range adds a floor on top of that, so even a tag seen once stays visible.

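The compression can be checked by hand. The sketch below writes out the arithmetic of a square-root scale in plain TypeScript, without importing D3. `sqrtScale` is a hypothetical stand-in for `d3.scaleSqrt().domain(...).range(...)`, and the counts (10 as the largest tag count, 2 for the smaller tag) are assumed example data.

```typescript
// Hypothetical helper, not part of D3: mirrors how a square-root scale maps
// a domain [d0, d1] onto a range [r0, r1].
function sqrtScale(domain: [number, number], range: [number, number]) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (x: number): number => {
    // Normalize in square-root space, then interpolate linearly in the range.
    const t = (Math.sqrt(x) - Math.sqrt(d0)) / (Math.sqrt(d1) - Math.sqrt(d0));
    return r0 + t * (r1 - r0);
  };
}

// Assumed data: 10 completed Python posts (the maximum) and 2 FastAPI posts.
const radius = sqrtScale([1, 10], [8, 28]);
const pythonRadius = radius(10);
const fastapiRadius = radius(2);
const ratio = pythonRadius / fastapiRadius;

console.log(pythonRadius.toFixed(1), fastapiRadius.toFixed(1), ratio.toFixed(2));
// prints "28.0 11.8 2.37"
```

Because the range starts at 8 rather than 0, the ratio lands slightly above the √5 ≈ 2.24 that an unshifted square root would give, and well below the 5x ratio of the raw counts.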

---

## Color Coding by Category

Tags are grouped into categories by exact match against hard-coded lists:

```typescript
function getCategory(tagName: string): string {
  if (["python", "fastapi", "nextjs", "react"].includes(tagName)) return "language";
  if (["ai", "rag", "machine-learning", "data-science"].includes(tagName)) return "ai";
  if (["architecture", "devops", "production", "docker"].includes(tagName)) return "infra";
  // Anything not listed above falls back to the neutral category.
  return "general";
}

const categoryColors: Record<string, string> = {
  language: "rgba(52, 211, 153, 0.4)",  // green
  ai: "rgba(251, 191, 36, 0.4)",        // amber
  infra: "rgba(96, 165, 250, 0.4)",     // blue
  general: "rgba(255, 255, 255, 0.15)", // white
};
```

---

## Empty State

When the user hasn't read any posts, the graph shows a prompt:

```tsx
if (!data.nodes.length) {
  return (
    <div>
      <h3>Your Knowledge Graph is Empty</h3>
      <p>
        Start reading posts to build your personalized knowledge graph.
        Topics you study will appear here as connected nodes.
      </p>
    </div>
  );
}
```

---

## Performance

With 50+ nodes and 200+ edges, D3.js runs at 60fps on desktop and 30fps on mobile. The bottleneck is SVG rendering, not the simulation. For larger graphs, canvas-based rendering would be faster, but the current SVG approach is simpler and sufficient for the typical user's data.

---

## What's Next

The next post covers the Docker stack: how docker-compose.yml is structured, Nginx configuration, Redis caching strategy, PostgreSQL tuning, Cloudflare proxying, and the deployment pipeline.

---

*Built with D3.js, FastAPI, PostgreSQL, and no visualization libraries beyond D3.*