solanodz

Blog

Notes on AI engineering, product systems, and the work behind taking prototypes into production.

Observability for AI Systems: Tracing, Evals, and Feedback Loops

Jun 13, 2026

AI-powered systems need a different kind of observability than traditional software. This guide covers tracing LLM calls, tracking evaluation scores over time, and building feedback loops that let you improve the system continuously.

AI engineering, Observability, LLM, Tracing, Production

How AI Tools Actually Work

Jun 13, 2026

Models do not read files or run commands on their own. They ask for tools, and a harness runs them.

AI, Tools, Agents, Tooling

LLM Evaluation in Practice: Beyond MMLU

Jun 12, 2026

MMLU tells you nothing about whether your LLM-powered application works. This is a practical guide to evaluating LLM pipelines for production — harness-based benchmarks, LLM-as-judge, and the signals that actually matter.

LLM, Evaluation, Benchmarks, AI engineering, Quality

RAG Is Not Enough: The Pipeline Problems Nobody Talks About

Jun 11, 2026

Embedding and retrieving is the easy part. The real failure modes in production RAG systems are chunking strategy, retrieval quality, re-ranking, and evaluation — and most teams discover them too late.

RAG, AI, Embeddings, Vector search, NLP

Building Reliable MCP Servers in Production

Jun 10, 2026

A practical guide to designing MCP servers that behave predictably under load — schema validation, error handling, observability, and the tool shapes that actually hold up in production.

MCP, AI, Tool calling, Production, Schema

Agentic AI: What Actually Breaks in Production

Jun 9, 2026

Agent demos look autonomous. Production agents fail in predictable ways: tool loops, bad recovery, memory drift, weak permissions, missing observability, and workflows with no clear stop condition.

AI, Agents, Production, Automation

LLM Evaluation in Practice: Beyond MMLU

Jun 9, 2026

Academic benchmarks tell you whether a model is generally capable. Product evals tell you whether your AI application works for your users, data, tools, and failure cases.

AI, Evals, LLMs, Quality

Building Reliable MCP Servers in Production

Jun 9, 2026

MCP servers are easy to demo and surprisingly easy to break in production. Reliability comes from narrow tools, schema validation, explicit permissions, timeouts, idempotency, and observability.

AI, MCP, Production, Tooling

MCP Servers: A Practical Introduction

Jun 6, 2026

A practical guide to Model Context Protocol servers: what they expose, how clients call them, and how to design tools that stay predictable.

AI, MCP, Agents, Tooling