Building a Knowledge Graph From Reading History
The Brain page includes a knowledge graph — a force-directed visualization of the topics a user has studied, connected by how often they appear together. It's rendered entirely on the frontend with D3.js.
Data Model on the Backend
The backend provides the graph data as a simple JSON structure:
from fastapi import Depends
from sqlalchemy import select

# router, db, User, Post, UserProgress, and get_current_user come from the
# application's own modules.

@router.get("/gamification/knowledge-graph")
async def get_knowledge_graph(user: User = Depends(get_current_user)):
    # Tags of every post this user has completed.
    completed_posts = await db.execute(
        select(Post.tags).join(
            UserProgress, UserProgress.post_id == Post.id
        ).where(
            UserProgress.user_id == user.id,
            UserProgress.completed_at.isnot(None),
        )
    )

    tag_cooccurrence: dict[tuple[str, str], int] = {}
    tag_counts: dict[str, int] = {}

    for (tags,) in completed_posts:
        tag_names = [t.name for t in tags]
        # Count how many completed posts carry each tag.
        for tag in tag_names:
            tag_counts[tag] = tag_counts.get(tag, 0) + 1
        # Count each unordered tag pair once per post; sorting gives
        # (a, b) and (b, a) the same key.
        for i in range(len(tag_names)):
            for j in range(i + 1, len(tag_names)):
                pair = tuple(sorted([tag_names[i], tag_names[j]]))
                tag_cooccurrence[pair] = tag_cooccurrence.get(pair, 0) + 1

    nodes = [{"id": tag, "size": count} for tag, count in tag_counts.items()]
    edges = [
        {"source": a, "target": b, "weight": weight}
        for (a, b), weight in tag_cooccurrence.items()
    ]
    return {"nodes": nodes, "edges": edges}
The response contains:
- Nodes — each tag the user has encountered, sized by how many posts they've read with that tag
- Edges — connections between tags that appear together, weighted by co-occurrence frequency
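To make the counting concrete, here is a minimal, database-free sketch of the same logic run on a small in-memory reading history. The `build_graph` helper and the `sample` data are hypothetical and exist only for illustration; the real endpoint reads tags from the database as shown above.

```python
from collections import Counter
from itertools import combinations


def build_graph(posts_tags):
    """Build a nodes/edges payload from a list of per-post tag-name lists."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tag_names in posts_tags:
        # Sorting first means every pair comes out in one canonical order,
        # so ("fastapi", "python") and ("python", "fastapi") share a key.
        ordered = sorted(tag_names)
        tag_counts.update(ordered)
        pair_counts.update(combinations(ordered, 2))
    nodes = [{"id": tag, "size": n} for tag, n in tag_counts.items()]
    edges = [
        {"source": a, "target": b, "weight": w}
        for (a, b), w in pair_counts.items()
    ]
    return {"nodes": nodes, "edges": edges}


# Hypothetical history: three completed posts and their tags.
sample = [
    ["python", "fastapi"],
    ["python", "docker"],
    ["fastapi", "python"],
]
graph = build_graph(sample)
```

In this sample, `python` becomes a node of size 3, `fastapi` of size 2, and `docker` of size 1, while the `fastapi`/`python` edge gets weight 2 because the two tags share two posts.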
D3.js Force-Directed Layout
The frontend renders the graph using D3.js:
import { useEffect, useRef } from "react";
import * as d3 from "d3";

interface GraphData {
  nodes: Array<{ id: string; size: number }>;
  edges: Array<{ source: string; target: string; weight: number }>;
}

function KnowledgeGraph({ data }: { data: GraphData }) {
  const svgRef = useRef<SVGSVGElement>(null);

  useEffect(() => {
    if (!svgRef.current || !data.nodes.length) return;

    const width = 800;
    const height = 600;
    const svg = d3.select(svgRef.current);
    svg.selectAll("*").remove(); // clear the previous render

    const simulation = d3.forceSimulation(data.nodes)
      .force("link", d3.forceLink(data.edges).id((d: any) => d.id).distance(100))
      .force("charge", d3.forceManyBody().strength(-200))
      .force("center", d3.forceCenter(width / 2, height / 2))
      .force("collision", d3.forceCollide().radius((d: any) => d.size * 3 + 10));

    const link = svg.append("g")
      .selectAll("line")
      .data(data.edges)
      .join("line")
      .attr("stroke", "rgba(255,255,255,0.06)")
      .attr("stroke-width", (d) => Math.sqrt(d.weight));

    const node = svg.append("g")
      .selectAll("circle")
      .data(data.nodes)
      .join("circle")
      .attr("r", (d) => d.size * 3 + 8)
      .attr("fill", "rgba(251, 191, 36, 0.15)")
      .attr("stroke", "rgba(251, 191, 36, 0.4)")
      .attr("stroke-width", 1.5)
      .call(drag(simulation));

    const label = svg.append("g")
      .selectAll("text")
      .data(data.nodes)
      .join("text")
      .text((d) => d.id)
      .attr("font-size", 10)
      .attr("font-weight", "bold")
      .attr("fill", "rgba(255,255,255,0.7)")
      .attr("text-anchor", "middle")
      .attr("dy", 4);

    // Move lines, circles, and labels to their new positions on every tick.
    simulation.on("tick", () => {
      link.attr("x1", (d: any) => d.source.x)
        .attr("y1", (d: any) => d.source.y)
        .attr("x2", (d: any) => d.target.x)
        .attr("y2", (d: any) => d.target.y);
      node.attr("cx", (d: any) => d.x).attr("cy", (d: any) => d.y);
      label.attr("x", (d: any) => d.x).attr("y", (d: any) => d.y);
    });

    // Stop the simulation timer when the data changes or the component unmounts.
    return () => {
      simulation.stop();
    };
  }, [data]);

  return (
    <div className="card p-4">
      <svg ref={svgRef} viewBox="0 0 800 600" className="w-full h-auto" />
    </div>
  );
}
The simulation parameters:
- forceLink: connects linked nodes and pulls each edge toward a fixed target length of 100 (the code above does not vary the distance by weight)
- forceManyBody: makes all nodes repel each other (strength -200), which spreads the graph out
- forceCenter: keeps the graph's center of mass at the middle of the SVG
- forceCollide: prevents node circles from overlapping
Interactive Features
The graph supports drag, zoom, and click:
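function drag(simulation: d3.Simulation<any, any>) {
  return d3.drag<SVGCircleElement, any>()
    .on("start", (event, d: any) => {
      // Reheat the simulation and pin the node at its current position.
      if (!event.active) simulation.alphaTarget(0.3).restart();
      d.fx = d.x;
      d.fy = d.y;
    })
    .on("drag", (event, d: any) => {
      d.fx = event.x;
      d.fy = event.y;
    })
    .on("end", (event, d: any) => {
      // Let the simulation cool down and release the node.
      if (!event.active) simulation.alphaTarget(0);
      d.fx = null;
      d.fy = null;
    });
}

// Zoom handler
// `container` is a <g> that wraps the link, node, and label groups. The
// component above appends those groups to `svg` directly, so they must be
// moved into such a wrapper for the zoom transform to apply.
svg.call(d3.zoom()
  .scaleExtent([0.5, 3])
  .on("zoom", (event) => {
    container.attr("transform", event.transform);
  })
);

// Click handler
// `router` is the app's client-side router instance.
node.on("click", (event, d) => {
  router.push(`/tags/${d.id.toLowerCase()}`);
});
Node Sizing Logic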
Node size is determined by the number of completed posts with that tag:
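const sizeScale = d3.scaleSqrt()
  // d3.max returns undefined for an empty array, so fall back to 1.
  .domain([1, d3.max(data.nodes, (d) => d.size) ?? 1])
  .range([8, 28]);
// Usage:
.attr("r", (d) => sizeScale(d.size))
The scale is square-root (not linear) so that a single dominant tag does not dwarf the others. For a user who completed 10 Python posts and 2 FastAPI posts, the raw counts differ by 5x, but their square roots differ by only about 2.2x. Mapped onto the [8, 28] radius range with Python as the largest tag, the radii come out to 28 and roughly 11.8.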
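The arithmetic behind this scale can be checked by hand. The sketch below is a plain-Python reimplementation of the square-root mapping from the domain [1, max] to the range [8, 28]; `sqrt_scale` is a hypothetical helper written for this check, not part of D3, and its handling of a degenerate domain is a choice made for the sketch.

```python
import math


def sqrt_scale(value, domain_max, domain_min=1.0, range_min=8.0, range_max=28.0):
    """Map value from [domain_min, domain_max] onto [range_min, range_max]
    along a square-root curve."""
    if domain_max <= domain_min:
        # Degenerate domain (every tag has the same count): this sketch
        # simply returns the smallest radius.
        return range_min
    span = math.sqrt(domain_max) - math.sqrt(domain_min)
    t = (math.sqrt(value) - math.sqrt(domain_min)) / span
    return range_min + t * (range_max - range_min)


# 10 Python posts and 2 FastAPI posts, with Python as the largest tag.
python_r = sqrt_scale(10, domain_max=10)
fastapi_r = sqrt_scale(2, domain_max=10)
raw_sqrt_ratio = math.sqrt(10 / 2)
radius_ratio = python_r / fastapi_r
```

Here `raw_sqrt_ratio` is about 2.24, the ratio of the square roots alone, while `radius_ratio` is about 2.37 because the range starts at 8 rather than 0. Either way the gap is far smaller than the 5x ratio of the raw counts.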
Color Coding by Category
Tags are grouped into categories by exact-name lookup against a fixed list per category; anything not listed falls back to "general":
function getCategory(tagName: string): string {
  if (["python", "fastapi", "nextjs", "react"].includes(tagName)) return "language";
  if (["ai", "rag", "machine-learning", "data-science"].includes(tagName)) return "ai";
  if (["architecture", "devops", "production", "docker"].includes(tagName)) return "infra";
  return "general";
}

const categoryColors: Record<string, string> = {
  language: "rgba(52, 211, 153, 0.4)",   // green
  ai: "rgba(251, 191, 36, 0.4)",         // amber
  infra: "rgba(96, 165, 250, 0.4)",      // blue
  general: "rgba(255, 255, 255, 0.15)",  // white
};
Empty State
When the user hasn't read any posts, the graph shows a prompt:
if (!data.nodes.length) {
  return (
    <div className="card p-12 text-center">
      <Brain size={48} className="mx-auto mb-4 text-white/20" />
      <h3 className="font-bold text-lg text-white/60 mb-2">
        Your Knowledge Graph is Empty
      </h3>
      <p className="text-sm text-white/40 max-w-md mx-auto">
        Start reading posts to build your personalized knowledge graph.
        Topics you study will appear here as connected nodes.
      </p>
      <Button className="mt-6" onClick={() => router.push("/posts")}>
        Browse Posts
      </Button>
    </div>
  );
}
Performance
With 50+ nodes and 200+ edges, D3.js runs at 60fps on desktop and 30fps on mobile. The bottleneck is SVG rendering, not the simulation. For larger graphs, canvas-based rendering would be faster, but the current SVG approach is simpler and sufficient for the typical user's data.
What's Next
The next post covers the Docker stack — how docker-compose.yml is structured, Nginx configuration, Redis caching strategy, PostgreSQL tuning, Cloudflare proxying, and the deployment pipeline.
Built with D3.js, FastAPI, and PostgreSQL, with no charting library layered on top of D3.
