LLM-driven systems that pursue a goal by interleaving reasoning, tool calls, and observations inside a loop — and that decide for themselves which step to take next.
A versioned contract between two pieces of software — endpoints, verbs, payload shapes, errors, and auth — that decouples a caller from an implementation.
Grounding LLM responses in chunks retrieved from an external corpus so the model reasons over real, citable sources instead of parametric memory alone.
Package-level reference for the Vercel AI SDK — streamText, generateObject, tool calling, structured output, and the multi-provider model interface.
Package-level reference for openai on npm — Chat Completions, the Responses API, streaming, tool calls, structured outputs, embeddings, and the v4→v5 migration.
Package-level reference for the autogen-agentchat / autogen-core / autogen-ext family on PyPI plus the legacy pyautogen — install, rename history, versioning, and alternatives.
Package-level reference for the crewai library on PyPI plus the crewai-tools companion — install, versioning, and multi-agent alternatives.
Package-level reference for DSPy on PyPI — the dspy / dspy-ai rename, install variants, version policy, and alternatives.
Package-level reference for google-genai (the current Gemini SDK) and its predecessor google-generativeai — install, auth, versioning, and alternatives.
Package-level reference for the guidance library on PyPI — install, LLM-provider extras, versioning, and alternatives like instructor and outlines.
Package-level reference for the langchain family on PyPI — install variants, partner packages, version churn, and alternatives.
Package-level reference for the langsmith SDK on PyPI — install, versioning, env-var setup, and observability alternatives.
Package-level reference for the sentence-transformers library on PyPI — install, transformers/torch deps, model registry, and embedding alternatives.
Package-level reference for the Hugging Face transformers library on PyPI — install extras, backend choice, versioning, and alternatives.
Side-by-side comparison of LangChain, LlamaIndex, AutoGen, CrewAI, Haystack, and Semantic Kernel for building LLM-powered applications and agent systems. Covers strengths, weaknesses, and when to pick each.
Build LLM programs in DSPy with declarative signatures, modules, and optimisers. Covers Predict, ChainOfThought, ReAct, BootstrapFewShot, COPRO, MIPRO, MIPROv2, and inference compilation.
Build production-grade LLM pipelines with Haystack 2.x. Covers components, the pipeline graph, indexing and querying, retrievers, generators, RAG patterns, and evaluation.
Build LLM-powered applications with Microsoft Semantic Kernel. Covers the kernel, plugins, prompt templates, planners, function calling, Kernel Memory, and the Python and .NET SDKs.
Prompt engineering patterns, RAG, evaluations, few-shot prompting, chain-of-thought, and structured output — foundational techniques for extracting reliable, structured behavior from LLMs.
CoT prompting techniques — zero-shot CoT, few-shot CoT, self-consistency, tree of thoughts, and how reasoning models compare with prompted reasoning.
In-context learning techniques — example selection, format design, count tuning, dynamic retrieval of demonstrations, and pitfalls of few-shot prompting.
Reliable prompt structures for reasoning, extraction, classification, generation, extended thinking, and vision tasks with Claude.
End-to-end checklist and code for building reliable Retrieval-Augmented Generation pipelines — chunking, embedding, vector DBs, retrieval, and evaluation.
Hugging Face Transformers, LangChain, Google Gemini SDK, and LangSmith — practical reference for AI/ML frameworks and observability tools.
Claude Code, Codex CLI, the Claude API, and prompt engineering — practical reference for building with and using large language models.
Build multi-agent AI systems with Microsoft AutoGen. Covers agents, group chats, code execution, tool registration, async runtimes, and LLM configuration.
Orchestrate teams of role-playing AI agents with CrewAI. Covers agents, tasks, crews, tools, LLM selection, memory, YAML config, and the kickoff lifecycle.
Call Google's Gemini models from Python for text, multimodal, streaming, chat, function calling, and embeddings. Covers the genai SDK, safety settings, file API, and async usage.
Interleave Python control flow with LLM generation and enforce structured output using guidance. Covers gen(), select(), chat blocks, regex constraints, JSON schemas, and token healing.
Build LLM-powered pipelines with LangChain. Covers LCEL chains, chat models, prompts, output parsers, tools, agents, retrievers, memory, and streaming.
Trace, debug, evaluate, and monitor LLM applications with LangSmith. Covers tracing setup, datasets, evaluators, prompt hub, comparing runs, and CI integration.
Build RAG pipelines and LLM-powered data applications with LlamaIndex. Covers document loading, indexing, query engines, custom LLMs and embeddings, persistent storage, and agents.
Measure and improve RAG pipeline quality with ragas. Covers faithfulness, answer relevancy, context precision, context recall, dataset format, LLM judges, and CI integration.
Load and run pre-trained models for NLP, vision, and audio with the Hugging Face Transformers library. Covers pipelines, AutoModel, tokenisation, generation, fine-tuning, and device placement.
Evaluate and monitor LLM applications with TruLens. Covers the RAG triad, feedback functions, TruChain, TruLlama, custom evaluators, the dashboard, and CI integration.