RAG
Every post tagged "RAG" · articles, case studies, guides.
- 01→
RAG's three failure modes (and the diagnostic table)
Three failure modes, one table. 30 minutes of diagnosis, then you know what to fix. Stop guessing.
AI solutions
- 02→
Build an LLM Eval Harness in 200 Lines of TS
Frameworks are great until they get in the way. Here is a 200-line TS eval harness that runs in CI, blocks regressions and prints a diff.
AI solutions
- 03→
pgvector at 10M+ rows: index, queries, real numbers
pgvector at 10M rows is not scary · if you pick the right index. HNSW vs IVFFlat, filter patterns, real numbers.
AI solutions · Websites, web apps & online shops
- 04→
LLM prompt caching in production · a 60-80% cost cut
Prompt caching is the single biggest LLM cost lever in 2026. 4 patterns, real savings numbers, 2 gotchas worth knowing.
AI solutions
- 05→
LLM evals-as-code · the CI gate we run on every RAG deploy
An eval that's not in CI is not an eval. Here's the evals-as-code workflow we run on every RAG project.
AI solutions
- 06→
How to ship a production AI chatbot in 14 days
Fourteen days from zero to a live AI chatbot your company can actually use. The schedule we follow on every client project, down to what happens on each day.
AI solutions
- 07→
Shipping AI agents that actually work in production
From demo to live system: the retrieval, eval, guardrails and cost control we run on every AI project we ship.
AI solutions
- 08→
Picking a vector DB in 2026: pgvector, Pinecone, Weaviate
Three serious vector DBs, each with very different DNA. Here's the decision framework that held up across our 2026 projects.
AI solutions · Websites, web apps & online shops