
Building a RAG Chat System: From Zero to Production
A walkthrough of building a production RAG chat system from scratch with OpenAI embeddings, pgvector HNSW indexing, and hybrid search. It explains how the Ask AI page works. OpenAI text-embedding-3-small generates 1536-dimensional embeddings, which are stored in PostgreSQL with pgvector and indexed with HNSW for fast approximate nearest-neighbor search. Vector search is combined with PostgreSQL tsvector full-text search using weighted BM25 scoring. Relevance thresholds guard against hallucination, and source-grounded citations show which post each answer comes from. The post also covers the complete API pipeline, from query to streaming SSE response, including session management and rate limiting.
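
To make the hybrid search and relevance threshold idea concrete, here is a minimal sketch in plain Python. It assumes both scores have already been fetched from the database and normalized to the range 0 to 1. The `Candidate` type, the function names, the 0.7/0.3 weights, and the 0.5 threshold are all illustrative assumptions, not the values the Ask AI page actually uses.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_title: str      # source post, used for the citation
    chunk: str           # text chunk retrieved from that post
    vector_score: float  # similarity from the vector search, assumed in [0, 1]
    text_score: float    # full-text rank, assumed normalized to [0, 1]


def hybrid_rank(candidates, vector_weight=0.7, text_weight=0.3, threshold=0.5):
    """Blend both scores, drop matches below the threshold, sort best-first.

    The weights and threshold are hypothetical defaults for illustration.
    """
    scored = []
    for c in candidates:
        score = vector_weight * c.vector_score + text_weight * c.text_score
        if score >= threshold:
            scored.append((score, c))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def build_context(scored):
    """Return (context, citations) for the prompt.

    An empty context signals that nothing was relevant enough, so the caller
    can decline to answer instead of letting the model guess.
    """
    if not scored:
        return "", []
    context = "\n\n".join(c.chunk for _, c in scored)
    citations = []
    for _, c in scored:
        if c.post_title not in citations:  # one citation per source post
            citations.append(c.post_title)
    return context, citations
```

Calling `build_context(hybrid_rank(candidates))` yields the text to place in the prompt and the list of post titles to show as sources. When every candidate falls below the threshold, both come back empty, which is the cue to reply that no relevant post was found.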
