Creating a Dynamic Knowledge Graph from Reading History

Jun 15, 2026
14 min read

Quick Answer

This post covers how D3.js renders a force-directed graph of reading history, how topics are extracted from completed posts, how edge weights are computed from co-occurrence, and how the interactive exploration UI works.

Quick Summary

Learn to build an interactive knowledge graph using D3.js and FastAPI, visualizing your reading history effectively.

Building a Knowledge Graph From Reading History

The Brain page includes a knowledge graph — a force-directed visualization of the topics a user has studied, connected by how often they appear together. It's rendered entirely on the frontend with D3.js.


Data Model on the Backend

The backend provides the graph data as a simple JSON structure:

python
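from fastapi import Depends
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import selectinload

@router.get("/gamification/knowledge-graph")
async def get_knowledge_graph(
    user: User = Depends(get_current_user),
    # Assumption: `get_db` is the app's session dependency; the original
    # snippet used `db` without showing where it comes from.
    db: AsyncSession = Depends(get_db),
):
    # Load completed posts with their tags eagerly; an async session
    # cannot lazy-load the `tags` relationship on access.
    result = await db.execute(
        select(Post)
        .join(UserProgress, UserProgress.post_id == Post.id)
        .where(
            UserProgress.user_id == user.id,
            UserProgress.completed_at.isnot(None),
        )
        .options(selectinload(Post.tags))
    )
    completed_posts = result.scalars().all()

    # Keys are alphabetically sorted tag pairs, so (a, b) and (b, a) share a count.
    tag_cooccurrence: dict[tuple[str, str], int] = {}
    tag_counts: dict[str, int] = {}

    for post in completed_posts:
        tag_names = [t.name for t in post.tags]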
        for tag in tag_names:
            tag_counts[tag] = tag_counts.get(tag, 0) + 1
        for i in range(len(tag_names)):
            for j in range(i + 1, len(tag_names)):
                pair = tuple(sorted([tag_names[i], tag_names[j]]))
                tag_cooccurrence[pair] = tag_cooccurrence.get(pair, 0) + 1

    nodes = [{"id": tag, "size": count} for tag, count in tag_counts.items()]
    edges = [
        {"source": a, "target": b, "weight": weight}
        for (a, b), weight in tag_cooccurrence.items()
    ]

    return {"nodes": nodes, "edges": edges}

The response contains:

  • Nodes — each tag the user has encountered, sized by how many posts they've read with that tag
  • Edges — connections between tags that appear together, weighted by co-occurrence frequency
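
To make the counting concrete, here is a minimal TypeScript sketch of the same node and edge computation, which can be handy for building fixture data when testing the frontend without the API. `buildGraph` and its input shape (one array of tag names per completed post) are assumptions for illustration, not part of the backend.

```typescript
// Payload shape, matching the nodes/edges structure returned by the endpoint.
interface GraphNode { id: string; size: number }
interface GraphEdge { source: string; target: string; weight: number }
interface GraphPayload { nodes: GraphNode[]; edges: GraphEdge[] }

// Hypothetical helper: counts tags and tag pairs across completed posts.
function buildGraph(postTags: string[][]): GraphPayload {
    const tagCounts: Record<string, number> = {};
    const pairCounts: Record<string, number> = {};

    for (const tags of postTags) {
        for (const tag of tags) {
            tagCounts[tag] = (tagCounts[tag] ?? 0) + 1;
        }
        for (let i = 0; i < tags.length; i++) {
            for (let j = i + 1; j < tags.length; j++) {
                // Sort so (a, b) and (b, a) land on the same key.
                const pair = [tags[i], tags[j]].sort();
                const key = JSON.stringify(pair);
                pairCounts[key] = (pairCounts[key] ?? 0) + 1;
            }
        }
    }

    const nodes = Object.keys(tagCounts).map((id) => ({ id, size: tagCounts[id] }));
    const edges = Object.keys(pairCounts).map((key) => {
        const [source, target] = JSON.parse(key) as [string, string];
        return { source, target, weight: pairCounts[key] };
    });
    return { nodes, edges };
}

// Three completed posts with overlapping tags.
const sample = buildGraph([
    ["python", "fastapi"],
    ["python", "fastapi", "docker"],
    ["python"],
]);
```

For this input, `sample.nodes` gives python a size of 3, fastapi 2, and docker 1, and the fastapi/python edge has a weight of 2.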

D3.js Force-Directed Layout

The frontend renders the graph using D3.js:

tsx
import { useEffect, useRef } from "react";
import * as d3 from "d3";

interface GraphData {
    nodes: Array<{ id: string; size: number }>;
    edges: Array<{ source: string; target: string; weight: number }>;
}

function KnowledgeGraph({ data }: { data: GraphData }) {
    const svgRef = useRef<SVGSVGElement>(null);

    useEffect(() => {
        if (!svgRef.current || !data.nodes.length) return;

        const width = 800;
        const height = 600;
        const svg = d3.select(svgRef.current);

        svg.selectAll("*").remove();

        const simulation = d3.forceSimulation(data.nodes)
            .force("link", d3.forceLink(data.edges).id((d: any) => d.id).distance(100))
            .force("charge", d3.forceManyBody().strength(-200))
            .force("center", d3.forceCenter(width / 2, height / 2))
            .force("collision", d3.forceCollide().radius((d: any) => d.size * 3 + 10));

        const link = svg.append("g")
            .selectAll("line")
            .data(data.edges)
            .join("line")
            .attr("stroke", "rgba(255,255,255,0.06)")
            .attr("stroke-width", (d) => Math.sqrt(d.weight));

        const node = svg.append("g")
            .selectAll("circle")
            .data(data.nodes)
            .join("circle")
            .attr("r", (d) => d.size * 3 + 8)
            .attr("fill", "rgba(251, 191, 36, 0.15)")
            .attr("stroke", "rgba(251, 191, 36, 0.4)")
            .attr("stroke-width", 1.5)
            .call(drag(simulation));

        const label = svg.append("g")
            .selectAll("text")
            .data(data.nodes)
            .join("text")
            .text((d) => d.id)
            .attr("font-size", 10)
            .attr("font-weight", "bold")
            .attr("fill", "rgba(255,255,255,0.7)")
            .attr("text-anchor", "middle")
            .attr("dy", 4);

        simulation.on("tick", () => {
            link.attr("x1", (d: any) => d.source.x)
                .attr("y1", (d: any) => d.source.y)
                .attr("x2", (d: any) => d.target.x)
                .attr("y2", (d: any) => d.target.y);

            node.attr("cx", (d: any) => d.x).attr("cy", (d: any) => d.y);
            label.attr("x", (d: any) => d.x).attr("y", (d: any) => d.y);
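        });

        // Stop the simulation when the data changes or the component unmounts,
        // so an old simulation does not keep ticking in the background.
        return () => {
            simulation.stop();
        };
    }, [data]);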

    return (
        <div className="card p-4">
            <svg ref={svgRef} viewBox="0 0 800 600" className="w-full h-auto" />
        </div>
    );
}

The simulation parameters:

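  • forceLink: pulls connected nodes toward a fixed target distance of 100; in this version the edge weight only affects stroke width
  • forceManyBody: makes every node repel every other node (strength -200), which spreads the graph out
  • forceCenter: keeps the graph centered in the SVG
  • forceCollide: prevents node circles from overlapping, using a radius slightly larger than the drawn circle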
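
The component passes a constant `.distance(100)` to the link force. If strongly co-occurring tags should sit closer together, one option is to derive the distance from the edge weight. The sketch below is an assumption, not the article's implementation: `linkDistance` and its default bounds of 40 and 140 are hypothetical.

```typescript
// Hypothetical helper: maps an edge weight to a target link length,
// so heavier (more frequent) co-occurrences get shorter links.
function linkDistance(weight: number, maxWeight: number, min = 40, max = 140): number {
    if (maxWeight <= 1) return max; // every edge has the minimum weight
    const clamped = Math.max(1, Math.min(weight, maxWeight));
    const t = (clamped - 1) / (maxWeight - 1); // 0 for weight 1, 1 for maxWeight
    return max - t * (max - min);
}

// Possible wiring (forceLink's distance accessor accepts a per-link function):
// const maxWeight = d3.max(data.edges, (e) => e.weight) ?? 1;
// d3.forceLink(data.edges)
//     .id((d: any) => d.id)
//     .distance((e: any) => linkDistance(e.weight, maxWeight));
```

A linear mapping keeps the behavior easy to reason about; a square-root mapping would be a reasonable alternative if a few very heavy edges dominate.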

Interactive Features

The graph supports drag, zoom, and click:

typescript
function drag(simulation: d3.Simulation<any, any>) {
    return d3.drag()
        .on("start", (event, d: any) => {
            if (!event.active) simulation.alphaTarget(0.3).restart();
            d.fx = d.x;
            d.fy = d.y;
        })
        .on("drag", (event, d: any) => {
            d.fx = event.x;
            d.fy = event.y;
        })
        .on("end", (event, d: any) => {
            if (!event.active) simulation.alphaTarget(0);
            d.fx = null;
            d.fy = null;
        });
}

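// Zoom handler. Runs inside the effect, where `svg` is in scope.
// `container` is assumed to be a <g> that wraps the link, node, and label
// groups; the component above appends those groups directly to `svg`.
svg.call(d3.zoom<SVGSVGElement, unknown>()
    .scaleExtent([0.5, 3])
    .on("zoom", (event) => {
        container.attr("transform", event.transform);
    })
);

// Click handler: opens the tag page for the clicked node.
// `router` is assumed to come from the app's router hook.
node.on("click", (event, d) => {
    router.push(`/tags/${d.id.toLowerCase()}`);
});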

Node Sizing Logic

Node size is determined by the number of completed posts with that tag:

typescript
const sizeScale = d3.scaleSqrt()
    // d3.max returns undefined for an empty array, so fall back to 1.
    .domain([1, d3.max(data.nodes, (d) => d.size) ?? 1])
    .range([8, 28]);

// Usage:
.attr("r", (d) => sizeScale(d.size))

The scale is square-root rather than linear so that a single dominant tag does not dwarf the others. For a user who read 10 Python posts and 2 FastAPI posts, with Python as the most-read tag, the radii come out at 28 and roughly 11.8. The Python node is therefore about 2.4x wider than the FastAPI node, not 5x.
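
The arithmetic behind a square-root scale can be checked by hand. The sketch below reimplements the mapping in plain TypeScript for the domain [1, 10] and range [8, 28]; `sqrtScale` is a hypothetical stand-in written for illustration, and it assumes 10 is the largest count in the data.

```typescript
// Hand-written equivalent of a square-root scale with domain [d0, d1]
// and range [r0, r1]: interpolate linearly in sqrt-space.
function sqrtScale(value: number, d0: number, d1: number, r0: number, r1: number): number {
    const t = (Math.sqrt(value) - Math.sqrt(d0)) / (Math.sqrt(d1) - Math.sqrt(d0));
    return r0 + t * (r1 - r0);
}

const pythonRadius = sqrtScale(10, 1, 10, 8, 28); // top of the domain maps to 28
const fastapiRadius = sqrtScale(2, 1, 10, 8, 28); // roughly 11.83
const radiusRatio = pythonRadius / fastapiRadius; // roughly 2.37
```

The ratio depends on the chosen range as well as the square root: a pure ratio of square roots would be about 2.24, and the minimum radius of 8 shifts it slightly.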


Color Coding by Category

Tags are grouped into categories by exact match against a fixed list of names per category; any tag not listed falls back to general:

typescript
function getCategory(tagName: string): string {
    if (["python", "fastapi", "nextjs", "react"].includes(tagName)) return "language";
    if (["ai", "rag", "machine-learning", "data-science"].includes(tagName)) return "ai";
    if (["architecture", "devops", "production", "docker"].includes(tagName)) return "infra";
    return "general";
}

const categoryColors: Record<string, string> = {
    language: "rgba(52, 211, 153, 0.4)",   // green
    ai:       "rgba(251, 191, 36, 0.4)",    // amber
    infra:    "rgba(96, 165, 250, 0.4)",    // blue
    general:  "rgba(255, 255, 255, 0.15)",  // white
};
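
The article does not show how the colors are applied to nodes. One possible approach, sketched below under assumptions, expresses the same exact-match grouping as a lookup table and lowercases the tag name first, since the click handler lowercases ids and tag names may therefore be mixed-case. `tagToCategory` and `nodeStroke` are hypothetical names.

```typescript
// Same grouping as the exact-match lists, expressed as a lookup table.
const tagToCategory: Record<string, string> = {
    python: "language", fastapi: "language", nextjs: "language", react: "language",
    ai: "ai", rag: "ai", "machine-learning": "ai", "data-science": "ai",
    architecture: "infra", devops: "infra", production: "infra", docker: "infra",
};

const strokeByCategory: Record<string, string> = {
    language: "rgba(52, 211, 153, 0.4)",  // green
    ai: "rgba(251, 191, 36, 0.4)",        // amber
    infra: "rgba(96, 165, 250, 0.4)",     // blue
    general: "rgba(255, 255, 255, 0.15)", // white
};

// Hypothetical helper: resolves a tag name to its stroke color.
function nodeStroke(tagName: string): string {
    const key = tagName.toLowerCase();
    // hasOwnProperty guards against inherited keys such as "constructor".
    const known = Object.prototype.hasOwnProperty.call(tagToCategory, key);
    const category = known ? tagToCategory[key] : "general";
    return strokeByCategory[category];
}

// Possible wiring inside the effect:
// node.attr("stroke", (d) => nodeStroke(d.id));
```

A table keeps the category assignment in one place, so adding a tag is a one-line change.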

Empty State

When the user hasn't read any posts, the graph shows a prompt:

tsx
if (!data.nodes.length) {
    return (
        <div className="card p-12 text-center">
            <Brain size={48} className="mx-auto mb-4 text-white/20" />
            <h3 className="font-bold text-lg text-white/60 mb-2">
                Your Knowledge Graph is Empty
            </h3>
            <p className="text-sm text-white/40 max-w-md mx-auto">
                Start reading posts to build your personalized knowledge graph.
                Topics you study will appear here as connected nodes.
            </p>
            <Button className="mt-6" onClick={() => router.push("/posts")}>
                Browse Posts
            </Button>
        </div>
    );
}

Performance

With 50+ nodes and 200+ edges, D3.js runs at 60fps on desktop and 30fps on mobile. The bottleneck is SVG rendering, not the simulation. For larger graphs, canvas-based rendering would be faster, but the current SVG approach is simpler and sufficient for the typical user's data.
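
For graphs that outgrow SVG, a canvas renderer redraws everything on each tick instead of updating one DOM element per node and edge. The following is a rough sketch of that idea, not code from the project. `Ctx2D` is a hypothetical subset of the 2D canvas API that a browser's `CanvasRenderingContext2D` satisfies, `drawGraph` is a hypothetical function, and colors and labels are omitted for brevity.

```typescript
// Minimal subset of the 2D canvas API used below.
interface Ctx2D {
    lineWidth: number;
    clearRect(x: number, y: number, w: number, h: number): void;
    beginPath(): void;
    moveTo(x: number, y: number): void;
    lineTo(x: number, y: number): void;
    arc(x: number, y: number, r: number, start: number, end: number): void;
    stroke(): void;
    fill(): void;
}

// Nodes and links after the simulation has assigned positions.
interface PositionedNode { id: string; size: number; x: number; y: number }
interface PositionedLink { source: PositionedNode; target: PositionedNode; weight: number }

// Redraws the whole graph; intended to be called from simulation.on("tick", ...).
function drawGraph(
    ctx: Ctx2D,
    nodes: PositionedNode[],
    links: PositionedLink[],
    width: number,
    height: number,
): void {
    ctx.clearRect(0, 0, width, height);

    // Edges first, so nodes are painted on top of them.
    for (const link of links) {
        ctx.lineWidth = Math.sqrt(link.weight);
        ctx.beginPath();
        ctx.moveTo(link.source.x, link.source.y);
        ctx.lineTo(link.target.x, link.target.y);
        ctx.stroke();
    }

    ctx.lineWidth = 1.5;
    for (const node of nodes) {
        ctx.beginPath();
        ctx.arc(node.x, node.y, node.size * 3 + 8, 0, 2 * Math.PI);
        ctx.fill();
        ctx.stroke();
    }
}
```

The trade-off is that drag and click handling no longer come for free from DOM events; hit-testing has to be done against node positions.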


What's Next

The next post covers the Docker stack — how docker-compose.yml is structured, Nginx configuration, Redis caching strategy, PostgreSQL tuning, Cloudflare proxying, and the deployment pipeline.


Built with D3.js, FastAPI, and PostgreSQL, with no visualization libraries other than D3.js.

Frequently Asked Questions

What is the purpose of the knowledge graph in the Brain page?
The knowledge graph is a force-directed visualization of the topics a user has studied, connected by how often they appear together.

How is the graph data structured on the backend?
The backend provides the graph data as a simple JSON structure containing nodes and edges, where nodes represent tags and edges represent connections between tags weighted by co-occurrence frequency.

What library is used to render the knowledge graph on the frontend?
The knowledge graph is rendered entirely on the frontend using D3.js.

What information do nodes and edges contain in the knowledge graph?
Nodes represent each tag the user has encountered, sized by how many posts they've read with that tag, while edges represent connections between tags that appear together, weighted by co-occurrence frequency.

How does the D3.js force-directed layout work in rendering the graph?
The D3.js force-directed layout uses a simulation with forces like link, charge, center, and collision to position nodes and edges dynamically based on their properties and relationships.
