# The Monorepo That Runs 29 Services on a Single $12/month VPS
URL: https://madhudadi.in/blog/posts/the-monorepo-that-runs-29-services-full-architecture-breakdown
Published: 2026-05-11
Tags: Architecture, FastAPI, Next.js, Production
Read time: 18 min
Difficulty: intermediate
> One codebase, two runtimes, 29 API routers, 21 database models, and a background scheduler. Here's how the monorepo is structured, why every directory exists, and how the pieces fit together without a third-party CMS.

> "Controlling complexity is the essence of computer programming." — Brian Kernighan, co-author of *The C Programming Language*

In the last post, I explained why I built this platform from scratch. Now let's open the hood.

This monorepo contains two applications — a FastAPI backend and a Next.js frontend — plus shared infrastructure configuration. Everything lives in one repository, deploys via one `docker-compose.yml`, and runs on one $12/month VPS.

Here's the directory structure, explained layer by layer.

---

## Top-Level Layout

```
blog_platform/
├── fastapi_backend/     # Python API server (29 routers, 21 models)
├── blog_frontend/       # Next.js 16 app (React 19, Turbopack)
├── nginx/               # Reverse proxy config
├── docs/                # Architecture decisions, content plans
├── conductor/           # Feature specs and bug fixes
├── docker-compose.yml   # Single-file deployment
├── docker-compose.prod.yml
└── deploy.sh            # Zero-downtime deploy script
```

The two applications communicate exclusively through HTTP. The frontend never imports Python code, and the backend never references React components. The contract is the API schema, which FastAPI generates automatically as OpenAPI and serves as interactive documentation at `/docs`.

---

## FastAPI Backend — `fastapi_backend/`

```
fastapi_backend/
├── app/
│   ├── main.py              # App entry, middleware, router registration
│   ├── config.py            # Settings from environment variables
│   ├── database.py          # Async SQLAlchemy engine + session factory
│   ├── dependencies.py      # Shared dependency injection (auth, db)
│   ├── core/                # Cross-cutting concerns
│   │   ├── limiter.py       # Rate limiting (slowapi + Redis)
│   │   ├── redis.py         # Redis connection pool
│   │   ├── scheduler.py     # APScheduler background jobs
│   │   ├── uploads.py       # File upload handling
│   │   └── exceptions.py    # Custom exception classes
│   ├── models/              # SQLAlchemy ORM models (21 files)
│   ├── schemas/             # Pydantic request/response schemas
│   ├── routers/             # API endpoint handlers (29 files)
│   └── services/            # Business logic layer
├── alembic/                 # Database migrations
├── tests/                   # Pytest test suite
├── uploads/                 # User-uploaded images, PDFs, data files
├── scripts/                 # Maintenance scripts
├── requirements.txt
├── Pipfile
└── Dockerfile
```

### Entry Point — `main.py`

The application is assembled in `main.py`. Here's the skeleton:

```python
app = FastAPI(title="Madhu Dadi — AI, Python & Analytics Hub API", lifespan=lifespan)

app.add_middleware(TrustedHostMiddleware, allowed_hosts=settings.ALLOWED_HOSTS)
app.add_middleware(CORSMiddleware, ...)
app.add_middleware(SessionMiddleware, ...)

V1 = "/api/v1"
app.include_router(auth.router,        prefix=V1)
app.include_router(posts.router,       prefix=V1)
app.include_router(series.router,      prefix=V1)
app.include_router(comments.router,    prefix=V1)
# ... 25 more routers
```
**Explanation**

- Initializes a FastAPI application with a specified title and lifespan management.
- Adds middleware for security and session management, including TrustedHostMiddleware and CORS support.
- Defines a versioned API prefix (`/api/v1`) for organizing routes.
- Includes multiple routers for handling different functionalities such as authentication, posts, series, and comments, enhancing modularity.
- Prepares the application to scale by allowing the addition of more routers as needed.


The `lifespan` context manager initializes Redis and starts the background scheduler on startup, then tears them down on shutdown.
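
To make that startup/shutdown ordering concrete, here is a minimal sketch using only the standard library. `FakeResource` is a hypothetical stand-in for the Redis pool and the scheduler, not the platform's real code. In FastAPI, an async context manager with this shape is what gets passed as `FastAPI(lifespan=lifespan)`.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []  # records the order of operations for illustration


class FakeResource:
    """Hypothetical stand-in for a Redis pool or a scheduler instance."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def start(self) -> None:
        events.append(f"start {self.name}")

    async def stop(self) -> None:
        events.append(f"stop {self.name}")


@asynccontextmanager
async def lifespan(app: object):
    # Startup: everything before `yield` runs once, before requests are served.
    redis = FakeResource("redis")
    scheduler = FakeResource("scheduler")
    await redis.start()
    await scheduler.start()
    try:
        yield
    finally:
        # Shutdown: tear down in reverse order of startup.
        await scheduler.stop()
        await redis.stop()


async def main() -> None:
    # FastAPI enters this context manager itself; here it is driven by hand.
    async with lifespan(app=None):
        events.append("serving requests")


asyncio.run(main())
```

The `try`/`finally` around `yield` means the teardown still runs if the application exits with an error.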

### Settings — `config.py`

All configuration comes from environment variables via Pydantic's `BaseSettings`:

```python
class Settings(BaseSettings):
    APP_NAME: str = "Madhu Dadi API"
    DEBUG: bool = False
    DATABASE_URL: str
    REDIS_URL: str
    SECRET_KEY: str
    CORS_ORIGINS: list[str]
    ALLOWED_HOSTS: list[str]
    # ... 30+ more settings
```
**Explanation**

- This code defines a `Settings` class that inherits from `BaseSettings`, which is part of the Pydantic library for data validation and settings management.
- The class includes several attributes such as `APP_NAME`, `DEBUG`, `DATABASE_URL`, `REDIS_URL`, `SECRET_KEY`, `CORS_ORIGINS`, and `ALLOWED_HOSTS`, which represent various configuration options for an application.
- Each attribute is type-annotated, ensuring that the values assigned to them conform to the specified types (e.g., `str`, `bool`, `list[str]`).
- The `DEBUG` attribute is set to `False` by default, indicating that the application is in production mode unless explicitly changed.
- The class is designed to easily manage and validate application settings, potentially loading values from environment variables or configuration files.


No hardcoded secrets. No `.env` files committed to git. Every deployment environment (dev, staging, production) supplies its own values through environment variables or Docker secrets.
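
The "environment supplies everything" rule can be sketched with the standard library alone. This is an illustration of the idea, not how Pydantic works internally: `SketchSettings`, `load_settings`, and the example values are all hypothetical, and the JSON encoding of the list field is an assumption made for this sketch.

```python
import json
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical, trimmed-down mirror of the Settings class above."""

    database_url: str
    secret_key: str
    cors_origins: list[str]
    debug: bool = False


def load_settings(env: Mapping[str, str]) -> SketchSettings:
    required = ("DATABASE_URL", "SECRET_KEY", "CORS_ORIGINS")
    missing = [name for name in required if name not in env]
    if missing:
        # Fail at startup rather than at first use of the missing value.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return SketchSettings(
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        cors_origins=json.loads(env["CORS_ORIGINS"]),  # list encoded as JSON
        debug=env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    )


# Made-up values; in real use the mapping would be os.environ.
example_env = {
    "DATABASE_URL": "postgresql+asyncpg://user:pass@db/blog",
    "SECRET_KEY": "not-a-real-secret",
    "CORS_ORIGINS": '["https://madhudadi.in"]',
}
settings = load_settings(example_env)
```

Fields without a default, like `DATABASE_URL`, are the ones that stop the app from booting when a deployment forgets to set them.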

### Database — `database.py`

Async SQLAlchemy 2.0 with session-per-request pattern:

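```python
from collections.abc import AsyncGenerator

from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine

from app.config import settings  # assumed import path for the Settings instance

engine = create_async_engine(settings.DATABASE_URL, pool_size=20, max_overflow=10)
# async_sessionmaker is the SQLAlchemy 2.0 factory for AsyncSession objects
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)

async def get_db() -> AsyncGenerator[AsyncSession, None]:
    # One session per request; the context manager closes it afterwards
    async with AsyncSessionLocal() as db:
        yield db
```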
**Explanation**

- Creates an asynchronous engine for database connections using a specified URL and connection pool settings.  
- Defines a session factory `AsyncSessionLocal` that produces instances of `AsyncSession`, allowing for asynchronous database operations.  
- Implements an asynchronous generator function `get_db` that yields a database session for use in asynchronous contexts.  
- Ensures that the database session is properly managed and closed after use by utilizing an asynchronous context manager.


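Setting `expire_on_commit=False` is intentional. By default, SQLAlchemy expires every loaded attribute on commit, so touching an object afterwards triggers an implicit reload. Under asyncio that implicit I/O is not allowed and typically fails with a `MissingGreenlet` error, which is a common pitfall in async SQLAlchemy.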

---

## The 29 Routers

Every API endpoint lives in `app/routers/`. Here's what each one does:

| Router | Endpoints | Purpose |
|--------|-----------|---------|
| `auth.py` | 6 | Register, login, Google OAuth, token refresh, logout, email verification |
| `posts.py` | 8 | CRUD posts, list with filters, get by slug, toggle publish |
| `series.py` | 5 | CRUD series, list, get with progress, next/prev navigation |
| `comments.py` | 4 | Create, list (tree), delete (own), admin delete |
| `tags.py` | 5 | CRUD tags, list, merge duplicates |
| `bookmarks.py` | 4 | Add, remove, list user bookmarks, check status |
| `progress.py` | 4 | Mark read, get user progress, series progress, completion stats |
| `search.py` | 2 | Full-text search, hybrid vector search |
| `admin.py` | 15 | Post management, user management, analytics, tasks |
| `gamification.py` | 6 | XP leaderboard, badges, milestones, streak, level-up |
| `rag.py` | 2 | Ask AI (RAG query), get related chunks |
| `code.py` | 1 | Execute Python code (Pyodide sandbox) |
| `payments.py` | 4 | Stripe checkout, webhook, subscription status, plans |
| `referral.py` | 3 | Create referral code, track, leaderboard |
| `srs.py` | 4 | Spaced repetition review queue, submit review, stats |
| `quiz.py` | 4 | Generate quiz, submit answers, get history, leaderboard |
| `challenge.py` | 3 | Daily challenge, submit, leaderboard |
| `interview.py` | 3 | Start interview, answer question, get feedback |
| `notifications.py` | 3 | List, mark read, dismiss |
| `digest.py` | 2 | Email digest subscribe, unsubscribe |
| `newsletter.py` | 3 | Subscribe, confirm, unsubscribe |
| `uploads.py` | 3 | Upload image, upload PDF, upload data file |
| `redirects.py` | 3 | Create, list, resolve |
| `feed.py` | 1 | RSS/Atom feed generation |
| `recommendations.py` | 1 | Personalized post recommendations |
| `certificate.py` | 2 | Generate series completion certificate, verify |
| `study_notes.py` | 3 | Create, list, delete personal study notes |
| `settings.py` | 2 | Get/update user settings |

That's 29 routers serving approximately 120 individual endpoints. Most routers are between 50 and 300 lines. The exception is `admin.py`, the largest at ~500 lines, because it handles post CRUD with all the tag/series/difficulty associations.

---

## The 21 Database Models

All models inherit from a shared `Base` class, which subclasses SQLAlchemy's `DeclarativeBase`, and live in `app/models/`:

```python
class Base(DeclarativeBase):
    pass
```
**Explanation**

- This code snippet creates a base class named `Base` using SQLAlchemy's `DeclarativeBase`, which is essential for defining ORM models.  
- The `Base` class serves as a foundation for all ORM-mapped classes, allowing them to inherit common functionality.  
- By using this base class, developers can easily create database tables and map them to Python classes.  
- This approach simplifies the process of managing database schemas and relationships in a Python application.


Key models and their relationships:

| Model | Key Fields | Relationships |
|-------|-----------|---------------|
| `User` | email, password_hash, xp, level, streak | has_many: posts, comments, progress, bookmarks |
| `Post` | title, slug, content, status, difficulty | belongs_to: series; has_many: tags (M2M), comments, bookmarks |
| `Series` | title, slug, description | has_many: posts |
| `Tag` | name, slug | has_many: posts (M2M) |
| `Comment` | content, is_approved | belongs_to: user, post; self-referential: parent |
| `Bookmark` | — | belongs_to: user, post (unique constraint) |
| `UserProgress` | completed_at, read_time_spent | belongs_to: user, post |
| `Badge` | name, description, icon, criteria | has_many: users (M2M via UserBadge) |
| `Challenge` | day, question, answer, difficulty | has_many: submissions |
| `Payment` | stripe_session_id, status, amount | belongs_to: user |
| `Subscription` | stripe_subscription_id, status, plan | belongs_to: user |
| `RagChunk` | content, embedding (vector), metadata | belongs_to: post |
| `SrsCard` | ease_factor, interval, review_count, next_review | belongs_to: user, post |
| `Notification` | type, title, message, is_read | belongs_to: user |
| `Redirect` | old_slug, new_slug | — |
| `Referral` | code, reward_xp | belongs_to: user; has_many: referred users |
| `PostView` | viewed_at, ip_address | belongs_to: post (analytics) |
| `PostReaction` | reaction_type | belongs_to: user, post |
| `QuizAttempt` | score, total_questions, answers | belongs_to: user |
| `InterviewSession` | questions, answers, overall_score | belongs_to: user |

The most interesting table is `RagChunk`. It stores post content split into chunks, each with a 1536-dimensional vector embedding. The search query is: find chunks whose embedding is closest to the query embedding, filtered by the user's premium tier. This is the core of the "Ask AI" feature.
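
The retrieval step can be sketched in plain Python. This illustrates nearest-neighbour search with a tier filter; it is not the platform's actual query. The `Chunk` fields, the `is_premium` flag, cosine similarity as the metric, and the 3-dimensional toy vectors (instead of 1536) are all assumptions. In production, this ranking would normally be pushed down into the database rather than done in application code.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    post_slug: str
    content: str
    embedding: list[float]
    is_premium: bool  # hypothetical flag; the real tier model is not shown


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query: list[float], chunks: list[Chunk], user_is_premium: bool, k: int = 2
) -> list[Chunk]:
    # Filter first so free users never see premium chunks, then rank.
    visible = [c for c in chunks if user_is_premium or not c.is_premium]
    ranked = sorted(
        visible,
        key=lambda c: cosine_similarity(query, c.embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy data with made-up slugs and vectors.
chunks = [
    Chunk("async-sqlalchemy", "Sessions and engines", [0.9, 0.1, 0.0], False),
    Chunk("redis-patterns", "Sorted sets for rankings", [0.1, 0.9, 0.0], False),
    Chunk("rag-internals", "Chunking strategy", [0.8, 0.2, 0.1], True),
]
query = [1.0, 0.0, 0.0]
free_results = search(query, chunks, user_is_premium=False)
premium_results = search(query, chunks, user_is_premium=True)
```

With this data, a free user gets the two public chunks, while a premium user sees the premium chunk ranked second because its vector points almost the same way as the query.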

---

## Redis as the Glue Layer

Redis isn't just a cache in this architecture; it doubles as a service bus. It handles five distinct concerns:

**1. Rate limiting —** `slowapi` uses Redis as its backing store. Each endpoint family has its own limit (100/hr for anonymous, 500/hr for authenticated). The key is `rate_limit:{ip}:{route_group}` with a sliding window counter.

**2. OAuth state —** Google OAuth uses a redirect-based flow. The state parameter (a random token) is stored in Redis with a 10-minute TTL. After the callback, the token is verified and deleted. This prevents CSRF on the OAuth handshake.

**3. Embedding cache —** When a user asks the RAG system a question, the query is first checked against a Redis set of recent queries. If found within 5 minutes, the cached embedding is reused. This saves ~200ms per query on repeated questions.

**4. Task queue —** Redis pub/sub dispatches background tasks: email digests, content revalidation (purging Cloudflare cache), and maintenance jobs. The publisher pushes to a channel, and the APScheduler subscriber picks it up.

**5. Leaderboard —** XP rankings use Redis sorted sets (`ZADD`, `ZREVRANK`, `ZRANGE`). The leaderboard is recomputed every 5 seconds from a materialized PostgreSQL view, then stored in Redis for fast reads. This avoids sorting 10,000+ users on every page load.
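
As one concrete illustration of these concerns, here is a sliding-window counter like the one described for rate limiting, with an in-memory dict standing in for Redis. The class name, the injected clock, and the `posts` route group are assumptions for this sketch, and slowapi's real storage layout may differ.

```python
import time
from collections import defaultdict, deque
from typing import Callable

WINDOW_SECONDS = 3600
LIMITS = {"anonymous": 100, "authenticated": 500}  # per hour, from the text above


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis-backed sliding window counter."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, ip: str, route_group: str, user_kind: str) -> bool:
        key = f"rate_limit:{ip}:{route_group}"
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= LIMITS[user_kind]:
            return False
        hits.append(now)
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(clock=lambda: fake_now[0])
first_100 = [limiter.allow("203.0.113.7", "posts", "anonymous") for _ in range(100)]
blocked = limiter.allow("203.0.113.7", "posts", "anonymous")
fake_now[0] = 3601.0  # just over an hour later, the old hits have expired
allowed_again = limiter.allow("203.0.113.7", "posts", "anonymous")
```

Unlike a fixed hourly bucket, the window here moves with each request, so a burst straddling the top of the hour cannot double the effective limit.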

---

## Background Scheduler

`app/core/scheduler.py` runs four recurring jobs:

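```python
from apscheduler.schedulers.asyncio import AsyncIOScheduler

# The four job functions are defined elsewhere in the app.

def start_scheduler() -> AsyncIOScheduler:
    scheduler = AsyncIOScheduler()
    scheduler.add_job(send_daily_digests, "cron", hour=8, minute=0)
    scheduler.add_job(regenerate_sitemap, "cron", hour=2, minute=0)
    scheduler.add_job(clean_expired_tokens, "cron", hour=3, minute=0)
    scheduler.add_job(check_stripe_subscriptions, "interval", hours=1)
    scheduler.start()
    return scheduler  # keep a reference so shutdown can stop it
```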
**Explanation**

- Defines a function `start_scheduler` that sets up an asynchronous job scheduler using `AsyncIOScheduler`.  
- Adds a job to send daily digests at 8:00 AM using a cron schedule.  
- Schedules a job to regenerate the sitemap at 2:00 AM, also using a cron schedule.  
- Schedules a job to clean expired tokens at 3:00 AM using a cron schedule.
- Sets up a job to check Stripe subscriptions every hour using an interval schedule.
- Starts the scheduler to begin executing the scheduled jobs.


- **Daily digests** — queries the `Post` table for posts published in the last 24 hours, assembles an HTML email, and sends via SMTP to subscribed users.
- **Sitemap regeneration** — queries all published posts and series, generates a fresh `sitemap.xml`, and pings Google/Bing.
- **Token cleanup** — deletes expired refresh tokens from the database.
- **Stripe sync** — checks for subscriptions that should have expired and marks them accordingly.

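The scheduler runs inside the same Python process as the FastAPI app, so no separate Celery worker is needed. The trade-off is that this assumes a single server process: with several worker processes, each one would start its own scheduler and every job would fire once per process.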

---

## Frontend — `blog_frontend/`

```
blog_frontend/
├── src/
│   ├── app/                  # Next.js App Router pages
│   │   ├── blog/             # Blog posts (dynamic routes)
│   │   ├── admin/            # Admin dashboard (protected)
│   │   ├── login/            # Auth pages
│   │   ├── register/
│   │   ├── profile/          # User profiles, settings
│   │   ├── series/           # Series index + detail
│   │   ├── tags/             # Tag index + filtered posts
│   │   ├── search/           # Full-text + vector search UI
│   │   ├── ask/              # RAG chat interface
│   │   ├── challenge/        # Daily coding challenge
│   │   ├── leaderboard/      # XP rankings
│   │   ├── milestones/       # Badges, progress, knowledge graph
│   │   ├── bookmarks/        # Saved posts
│   │   ├── layout.tsx        # Root layout with metadata defaults
│   │   ├── robots.ts         # Dynamic robots.txt
│   │   └── sitemap.ts        # Dynamic sitemap.xml
│   ├── components/           # Reusable React components
│   │   ├── admin/            # PostEditor, AnalyticsChart, etc.
│   │   ├── blog/             # MarkdownRenderer, PostCard, etc.
│   │   ├── layout/           # Navbar, Footer, CommandPalette
│   │   ├── ui/               # Button, Input, Spinner, GlassCard
│   │   ├── user/             # BadgeGrid, UserStats, KnowledgeGraph
│   │   ├── premium/          # PremiumGate, PlanSelectionModal
│   │   └── rag/              # RagChat overlay
│   ├── contexts/             # AuthContext, ThemeContext, LearningContext
│   ├── lib/                  # API client, utilities, types
│   └── workers/              # Web Workers (Python runner)
├── public/                   # Static assets
├── e2e/                      # Playwright tests
├── next.config.ts
├── tailwind.config.ts
├── vitest.config.ts
└── postcss.config.js
```

### The API Client — `src/lib/api.ts`

The frontend communicates with the backend through a typed API client. Every endpoint is a function:

```typescript
export const postsApi = {
  get: (slug: string, token?: string) => 
    apiFetch<PostResponse>(`/posts/${slug}`, { auth: true, token }),
  list: (params?: PostListParams) => 
    apiFetch<PaginatedResponse<PostListItem>>(`/posts?${toQuery(params)}`),
  create: (payload: CreatePostPayload) =>
    apiFetch<PostResponse>("/posts", { method: "POST", body: payload, auth: true }),
  // ... 5 more methods
};
```

The `apiFetch` wrapper handles:
- Automatic JWT token injection from cookies or localStorage
- 401 → token refresh → retry (with debounce to avoid race conditions)
- Error normalization (FastAPI validation errors → user-friendly messages)
- Request deduplication for concurrent identical calls

### Server Components vs. Client Components

Every page is a server component by default. Client components are only used where interactivity is required:

- **PostEditor** — markdown editing, tag selection, image upload
- **RagChat** — streaming chat UI with source citations
- **KnowledgeGraph** — D3.js force-directed graph
- **Navbar/CommandPalette** — user menu, keyboard shortcuts
- **ThemeToggle** — dark/light mode

Server components handle everything else: data fetching, metadata generation, static params, and most of the rendering. This means the average page ships ~40KB of HTML instead of ~200KB of JavaScript.

---

## Shared Infrastructure — `nginx/`

```nginx
server {
    listen 80;
    server_name madhudadi.in;

    location /blog {
        proxy_pass http://frontend:3000;
        proxy_set_header Host $host;
    }

    location /api/ {
        proxy_pass http://backend:8000;
        proxy_set_header Host $host;
    }
}
```

Nginx handles:
- **Routing** — `/blog` → Next.js frontend, `/api` → FastAPI backend
- **Caching** — static assets (CSS, JS, images) cached for 1 year with hashed filenames
- **Compression** — brotli for modern browsers, gzip fallback
- **Security headers** — HSTS, X-Content-Type-Options, CSP, Referrer-Policy
- **Cloudflare integration** — real IP headers, cache purging via API

The key insight: Nginx is the only entry point. There's no Kubernetes ingress, no cloud load balancer, no API gateway. One Nginx config handles everything.

---

## What I'd Change

Looking back, there are three things I'd do differently:

**1. Use a task queue from day one.** The Redis pub/sub approach works, but it's not durable. If the app crashes mid-job, the task is lost. A proper queue (ARQ, Celery with Redis broker) would give retries, dead-letter queues, and job persistence.

**2. Split the admin router.** `app/routers/admin.py` at 500+ lines handles too many concerns. Post CRUD, user management, analytics queries, and task management should be separate routers. The file grew organically and never got refactored.

**3. Add OpenAPI types to the frontend.** The API client in `src/lib/api.ts` is manually typed. There's no code generation from the FastAPI OpenAPI schema. This means when the backend adds a field, the frontend type needs a manual update. Using `openapi-typescript` or `orval` would eliminate this class of bugs.

---

## What's Next

In the next post, I'll dive into the RAG chat system — how embeddings are generated, how hybrid search works, and how the streaming response pipeline is built.

---

*Built with FastAPI, Next.js 16, PostgreSQL, Redis, and zero third-party CMS. Deployed on a $12/month VPS.*