
Learning LLMs & AI Agents

A Curated Field Guide — Edition v1.0.0 — March 3, 2026 — Living Document


Preface

This document is a version-controlled reference manual and curriculum architecture for understanding, building, and deploying Large Language Models (LLMs) and Agentic architectures. It is designed to be maintainable, verifiable, and structurally durable.

◆ Living Document

This is Edition 1.0.0. As the field accelerates, individual resources will become outdated quickly. This guide therefore relies on a stable ID system (e.g., VID-001, PAP-003) so sources can be hot-swapped in the Master Source Catalog without breaking the underlying curriculum.

Who this is for: Engineers, technical product managers, and researchers transitioning into AI engineering. It assumes fundamental programming knowledge but no prior deep learning expertise.


How to Use This Guide

Start with a goal. Find your row in the matrix below, then follow the path. Resource IDs (e.g., VID-001) correspond to entries in Appendix A.

Goal | Path (Start → Then → Advanced) | Band
Understand LLMs | VID-001 → BOK-001 → VID-002 → COU-012 → BOK-002 → BOK-003 | Beginner
Master Prompting | REP-003 → PAP-004 → PAP-005 → PAP-006 | Beginner
Build first agent | GUI-001 → GUI-005 → VID-005 → REP-002 → COU-001 → VID-007 | Intermediate
Master RAG | PAP-007 → COU-003 → COU-004 → COU-006 | Intermediate
Multi-agent systems | PAP-002 → VID-003 → COU-011 → COU-014 → REP-001 → GUI-002 | Advanced
Evaluate & Deploy | COU-009 → VID-004 → COU-008 → REP-010 → GUI-003 → REP-008 | Advanced

Learning Map

  1. Mental Models — What LLMs are
  2. Prompting as Interface Design
  3. Tool Use & Structured Outputs
  4. RAG Systems
  5. Agents & Agentic Architectures
  6. Multi-Agent Systems & Memory
  7. Evaluation & Reliability
  8. LLMOps & Deployment
  9. Security & Prompt Injection
  10. Capstone Builds

Starter Glossary

Agent

An AI system that uses an LLM as its reasoning engine to determine which actions to take and in what order, often interacting with external tools.

LLM

A deep learning model trained on vast text to predict the next token, enabling it to generate human-like text and perform reasoning tasks.

RAG

Retrieval-Augmented Generation — grounds an LLM's responses in external knowledge retrieved at runtime.

MCP

Model Context Protocol — an open standard enabling secure two-way connections between data sources and AI models.

Embeddings

Numerical representations of text in a high-dimensional vector space, capturing semantic meaning.

→ Full glossary in Appendix B


Foundations

🟢 Beginner

1.1 — The Mental Model of Large Language Models

Learning Objectives

  • Understand the fundamental mechanism of next-token prediction
  • Differentiate between base models and instruction-tuned models
  • Recognize hardware and data requirements for training
◆ Definition

A Large Language Model is a neural network, typically based on the Transformer architecture, optimized to predict the most probable subsequent token given a sequence of preceding tokens.

Core Ideas: LLMs are not knowledge databases — they are statistical reasoning engines. Their primary capability is pattern recognition and generation. The shift from base models (which simply continue text) to chat models (which follow instructions) is achieved through Fine-Tuning and Reinforcement Learning from Human Feedback (RLHF).
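
Next-token prediction can be made concrete with a toy model. The sketch below is not how a real LLM works internally (real models use a Transformer over subword tokens), but it shows the training objective in miniature: count which token tends to follow which, then emit the most probable continuation. The corpus string is invented for illustration.

```python
from collections import Counter, defaultdict

# Toy next-token predictor: for each token, count which token most often
# follows it. Real LLMs replace these counts with a learned Transformer,
# but the objective has the same shape: maximize P(next token | context).
corpus = "the capital of france is paris . the capital of italy is rome ."
tokens = corpus.split()

follows = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    follows[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the token most frequently observed after `token`."""
    return follows[token].most_common(1)[0][0]

print(predict_next("capital"))  # "of" (seen twice after "capital")
print(predict_next("france"))   # "is"
```

Note how the model "knows" nothing: it only reflects co-occurrence statistics in its training text, which is exactly why fluent fabrication is possible.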

■ Do This Now

Access a base model and an instruction-tuned model via API or open-source weights. Feed both the same prompt: "The capital of France is". Observe how the base model continues the sentence, while the tuned model answers the implied question.

▲ Common Pitfall

Treating the LLM as a factual lookup engine. Detection cue: You are surprised when the model fabricates a plausible-sounding but incorrect URL or historical date.

✓ Self-Check
  • I can explain next-token prediction to a non-technical peer
  • I understand the difference between pre-training and fine-tuning
  • I know why hallucinations are a feature of the architecture, not a bug

Sources


Prompting & Interaction Design

🟢 Beginner

2.1 — Structuring Context and Reasoning

Learning Objectives

  • Master zero-shot and few-shot prompting techniques
  • Implement Chain-of-Thought reasoning to improve output reliability
  • Design prompts as deterministic interfaces
◆ Definition

Prompt Engineering is the systematic process of designing, structuring, and optimizing inputs to an LLM to elicit accurate, structured, and predictable outputs.

Core Ideas: Models perform better when given space to "think." Techniques like Chain-of-Thought (CoT) force the model to output intermediate reasoning steps, markedly reducing logic errors.

■ Do This Now

Write a prompt asking an LLM to solve a logic puzzle. First, ask for just the answer. Second, append "Think step-by-step before answering." Compare reliability.

▲ Common Pitfall

Writing polite, conversational requests instead of structured commands. Detection cue: Your prompts include "Could you please..." instead of structured <instruction> tags.

✓ Self-Check
  • I separate instructions from data using delimiters (XML tags)
  • I provide few-shot examples for complex formatting tasks
  • I utilize Chain-of-Thought for tasks requiring logic or math
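
The habits in this checklist can be sketched as a small prompt builder. This is a minimal illustration, not a library API: the tag names (<instruction>, <example>, <data>) are arbitrary conventions of this sketch, and the CoT cue is simply an appended sentence.

```python
def build_prompt(instruction: str, data: str, examples=None, think: bool = True) -> str:
    """Assemble a structured prompt: instructions and data are separated by
    XML-style delimiters, optional few-shot examples are included verbatim,
    and a Chain-of-Thought cue is appended when `think` is True."""
    parts = [f"<instruction>\n{instruction}\n</instruction>"]
    for ex in examples or []:
        parts.append(f"<example>\n{ex}\n</example>")
    parts.append(f"<data>\n{data}\n</data>")
    if think:
        parts.append("Think step-by-step before giving the final answer.")
    return "\n\n".join(parts)

prompt = build_prompt(
    instruction="Classify the sentiment of the text as positive or negative.",
    data="The battery life is dreadful.",
    examples=["Text: I love it. -> positive"],
)
print(prompt)
```

The payoff of this structure is that the model can no longer confuse your instructions with the data it is asked to process, and few-shot examples slot in without rewriting the prompt.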

Sources


Retrieval-Augmented Generation

🟡 Intermediate

3.1 — Grounding Models in External Data

Learning Objectives

  • Understand the architecture of a standard RAG pipeline
  • Generate embeddings and store them in a vector database
  • Execute semantic search to retrieve context for an LLM
◆ Definition

RAG is a framework that retrieves relevant facts from an external knowledge base to ground large language models on the most accurate, up-to-date information before generating an answer.

Core Ideas: Models are frozen in time. RAG solves this by passing relevant documents into the context window at runtime. The process: Chunk → Embed → Index → Retrieve → Generate.

■ Do This Now

Take a long PDF. Chunk it into 500-token segments, generate embeddings for each, then write a script to find the most similar chunk to a user query using cosine similarity.
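
A minimal sketch of that retrieve step, under one loud assumption: the embedding here is a stand-in bag-of-words count vector, not a real embedding model. Real pipelines would call an embedding API or a local model at that point, but the chunking and cosine-similarity mechanics are the same.

```python
import math
from collections import Counter

def chunk(tokens, size=500):
    """Split a token list into fixed-size segments (the last may be shorter)."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def embed(tokens):
    """Stand-in embedding: a sparse bag-of-words count vector. In a real
    pipeline this would be an embedding-model call."""
    return Counter(tokens)

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

doc_tokens = "paris is the capital of france . rome is the capital of italy".split()
chunks = chunk(doc_tokens, size=7)            # tiny chunks for the demo
index = [(c, embed(c)) for c in chunks]       # "Embed" and "Index"

query = embed("what is the capital of france".split())
best = max(index, key=lambda item: cosine(query, item[1]))  # "Retrieve"
print(" ".join(best[0]))  # → "paris is the capital of france ."
```

Swapping the stand-in `embed` for a semantic embedding model is what upgrades this from lexical overlap to true semantic search, which is exactly the distinction the self-check below asks about.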

▲ Common Pitfall

Using overly large chunk sizes. Detection cue: The retrieved context contains the answer but the LLM ignores it — the "Lost in the Middle" phenomenon.

✓ Self-Check
  • I can explain the difference between lexical and semantic search
  • I understand how chunking strategies impact retrieval quality
  • I know how to calculate cosine similarity between two vectors

Sources


Agents & Tool Orchestration

🟡 Intermediate

4.1 — The Agentic Paradigm

Learning Objectives

  • Define what makes a system "agentic"
  • Implement the ReAct (Reason + Act) pattern
  • Provide an LLM with external tools — APIs, calculators, code interpreters
◆ Definition

An Agent is a system where an LLM is given an objective, a set of tools, and a loop allowing it to independently reason, execute actions, observe results, and iterate until the objective is met.

Core Ideas: The ReAct pattern is foundational — the model Reasons about what to do, chooses an Action (tool call), Observes the output, and loops. MCP standardizes how agents connect to external data.

■ Do This Now

Build a basic ReAct loop in Python without an agent framework. Define one tool (e.g., current weather function). Write a while loop that prompts the LLM, parses its output for a tool call, executes the function, and feeds the result back.
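
One possible shape for that loop, with the LLM stubbed out so the control flow runs without an API key. The Thought/Action/Observation format and the JSON tool-call convention below are illustrative choices of this sketch, not a standard; a real loop would send the growing transcript to a chat-completion endpoint instead of `make_scripted_llm`.

```python
import json

def get_weather(city: str) -> str:
    """The single tool. A real implementation would call a weather API."""
    return f"18C and clear in {city}"

TOOLS = {"get_weather": get_weather}

def make_scripted_llm():
    """Stand-in for a chat-completion API: replays canned ReAct-style turns."""
    replies = iter([
        'Thought: I need the weather.\n'
        'Action: {"tool": "get_weather", "args": {"city": "Paris"}}',
        "Thought: I have what I need.\nFinal Answer: It is 18C and clear in Paris.",
    ])
    return lambda transcript: next(replies)

def react_loop(question: str, call_llm, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = call_llm(transcript)
        transcript += "\n" + reply
        if "Final Answer:" in reply:                      # termination criterion
            return reply.split("Final Answer:", 1)[1].strip()
        if "Action:" in reply:                            # parse the tool call
            call = json.loads(reply.split("Action:", 1)[1].strip())
            result = TOOLS[call["tool"]](**call["args"])
            transcript += f"\nObservation: {result}"      # feed the result back
    return "Stopped: step limit reached."

print(react_loop("What's the weather in Paris?", make_scripted_llm()))
```

The `max_steps` guard matters: without it, a model that keeps emitting malformed actions would loop forever, which is the pitfall described below.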

▲ Common Pitfall

Overloading the agent with too many tools. Detection cue: The model enters an infinite loop, continuously calling the wrong tool or hallucinating tool arguments.

✓ Self-Check
  • I understand the ReAct framework
  • I can write a system prompt defining available tools and their schemas
  • I understand the principles behind the Model Context Protocol (MCP)

Sources


Multi-Agent Systems & Memory

🔴 Advanced

5.1 — Collaborative AI Architecture

Learning Objectives

  • Design workflows requiring multiple specialized agents
  • Implement memory systems — short-term context vs. long-term vector storage
  • Structure agent-to-agent communication
◆ Definition

Multi-Agent Systems (MAS) orchestrate multiple distinct AI agents, each with specific roles, system prompts, and tools, collaborating to solve tasks too complex for a single agent.

■ Do This Now

Design an architecture diagram for an automated software development team. Map out roles (PM, Coder, Reviewer), the specific tools each role needs, and the flow of information between them.
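
Once the diagram exists, the orchestrator/router pattern behind it can be sketched in a few lines. The three role functions below are placeholders for LLM calls with role-specific system prompts and tools; the transferable parts are the routing and the explicit termination check, not the string handling.

```python
# Sketch of an orchestrator pipeline: each "agent" is reduced to a plain
# function standing in for an LLM call with a role-specific system prompt.
def pm_agent(task: str) -> str:
    return f"Spec: break '{task}' into one function with tests"

def coder_agent(spec: str) -> str:
    return f"Code written for [{spec}]"

def reviewer_agent(code: str) -> str:
    # A real reviewer agent might route back to the coder on failure.
    return f"APPROVED: {code}"

PIPELINE = [pm_agent, coder_agent, reviewer_agent]

def orchestrate(task: str) -> str:
    """Route the artifact through each role in order; stop on rejection."""
    artifact = task
    for agent in PIPELINE:
        artifact = agent(artifact)
        if artifact.startswith("REJECTED"):   # explicit termination criterion
            break
    return artifact

print(orchestrate("add a retry helper"))
```

A linear pipeline is the simplest topology; an orchestrator that chooses the next agent dynamically based on the artifact is the same loop with a routing decision instead of a fixed list.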

▲ Common Pitfall

Lack of termination criteria. Detection cue: Agents get stuck in a conversational loop, endlessly agreeing or passing the same data back without progressing the task.

✓ Self-Check
  • I can identify when a task requires multi-agent vs. single-agent
  • I understand how to implement an orchestrator/router pattern
  • I can manage conversational state without exceeding context window limits

Sources


Evaluation & LLMOps

🔴 Advanced

6.1 — Reliability and Productionization

Learning Objectives

  • Implement "LLM-as-a-Judge" evaluation metrics
  • Design logging and tracing systems for agent workflows
  • Manage prompt versioning and regression testing
◆ Definition

LLMOps comprises the operational capabilities, infrastructure, and practices required to manage the lifecycle of LLM applications, from prompt engineering to production deployment and monitoring.

■ Do This Now

Create an evaluation dataset of 10 queries and 10 ideal responses. Write a script that passes your system's outputs to an "evaluator model," prompting it to score 1–5 based on accuracy and conciseness.
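
The harness around that evaluator model might look like this. The judge is stubbed with a crude word-overlap heuristic so the script runs offline; in practice you would replace `call_judge` with a call to a strong model. The "Score: N" reply format and the defensive regex parse are the transferable parts, since judge models do not always follow format instructions.

```python
import re
import statistics

JUDGE_PROMPT = (
    "Score the candidate answer against the reference for accuracy and conciseness.\n"
    "Reply with 'Score: N' where N is an integer from 1 to 5.\n"
    "REFERENCE: {reference}\nCANDIDATE: {candidate}"
)

def call_judge(prompt: str) -> str:
    # Stub evaluator: crudely rewards word overlap so the pipeline runs
    # offline. A real harness would send `prompt` to an evaluator model.
    ref = re.search(r"REFERENCE: (.*)", prompt).group(1)
    cand = re.search(r"CANDIDATE: (.*)", prompt).group(1)
    overlap = len(set(ref.lower().split()) & set(cand.lower().split()))
    return f"Score: {min(5, max(1, overlap))}"

def evaluate(pairs):
    """pairs: list of (reference, candidate). Returns the mean judge score."""
    scores = []
    for reference, candidate in pairs:
        reply = call_judge(JUDGE_PROMPT.format(reference=reference, candidate=candidate))
        match = re.search(r"Score:\s*([1-5])", reply)   # parse defensively
        scores.append(int(match.group(1)) if match else 1)
    return statistics.mean(scores)

dataset = [
    ("Paris is the capital of France.", "The capital of France is Paris."),
    ("Water boils at 100 C at sea level.", "It boils at 90 C."),
]
print(evaluate(dataset))
```

Running this before every prompt change turns evaluation into a regression test: a drop in the mean score flags the change for review before it reaches production.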

▲ Common Pitfall

Deploying straight from a playground to production. Detection cue: You have no visibility into the actual prompts your system generates on behalf of users, or the latency of external tool calls.

✓ Self-Check
  • I have baseline metrics for my task (retrieval precision, generation accuracy)
  • I have implemented tracing to monitor token usage and cost
  • I have automated evaluation pipelines before pushing prompt changes

Sources


Appendix A — Master Source Catalog

Single source of truth. All additions, deprecations, and updates happen here first.

◆ Editor's Note

Short links (lnkd.in) require manual verification. Mark any unresolved link "⚠️ Verify" in the Status column after each patch update. See Appendix D for protocol.

ID | Title | Type | Author | Difficulty | Time | Status | URL
VID-001 | LLM Introduction | Video | Unknown | Beginner | 1h | Core | ↗
VID-002 | LLMs from Scratch | Video | Unknown | Advanced | 2h | Core | ↗
VID-003 | Agentic AI Overview (Stanford) | Video | Stanford | Intermediate | 1h | Core | ↗
VID-004 | Building and Evaluating Agents | Video | Unknown | Advanced | 1h | Core | ↗
VID-005 | Building Effective Agents | Video | Unknown | Intermediate | 1h | Core | ↗
VID-006 | Building Agents with MCP | Video | Unknown | Intermediate | 1h | Core | ↗
VID-007 | Building an Agent from Scratch | Video | Unknown | Intermediate | 1h | Core | ↗
VID-008 | Philo Agents (Playlist) | Video | Unknown | Intermediate | 2h | Optional | ↗
REP-001 | GenAI Agents | Repo | Nirdiamant | Intermediate | — | Core | ↗
REP-002 | AI Agents for Beginners | Repo | Microsoft | Beginner | — | Core | ↗
REP-003 | Prompt Engineering Guide | Repo | Unknown | Beginner | 4h | Core | ↗ ⚠️
GUI-001 | Google's Agent Whitepaper | Guide | Google | Intermediate | 1h | Core | ↗ ⚠️
GUI-002 | Google's Agent Companion | Guide | Google | Intermediate | 1h | Core | ↗ ⚠️
GUI-003 | Building Effective Agents | Guide | Anthropic | Intermediate | 1h | Core | ↗ ⚠️
GUI-004 | Claude Code Best Agentic Practices | Guide | Anthropic | Intermediate | 1h | Core | ↗ ⚠️
GUI-005 | OpenAI's Practical Guide to Building Agents | Guide | OpenAI | Intermediate | 1h | Core | ↗ ⚠️
BOK-001 | Understanding Deep Learning | Book | Unknown | Intermediate | 20h+ | Core | ↗
BOK-002 | Building an LLM from Scratch | Book | Unknown | Advanced | 15h+ | Core | ↗ ⚠️
BOK-003 | The LLM Engineering Handbook | Book | Unknown | Advanced | 15h+ | Core | ↗ ⚠️
BOK-004 | AI Agents: The Definitive Guide | Book | Nicole Koenigstein | Intermediate | 10h+ | Core | ↗ ⚠️
BOK-005 | Building Applications with AI Agents | Book | Michael Albada | Intermediate | 10h+ | Optional | ↗ ⚠️
BOK-006 | AI Agents with MCP | Book | Kyle Stratis | Intermediate | 10h+ | Optional | ↗ ⚠️
BOK-007 | AI Engineering | Book | O'Reilly | Advanced | 15h+ | Core | ↗
PAP-001 | ReAct: Synergizing Reasoning and Acting | Paper | Unknown | Advanced | 2h | Core | ↗ ⚠️
PAP-002 | Generative Agents | Paper | Unknown | Advanced | 2h | Core | ↗ ⚠️
PAP-003 | Toolformer | Paper | Unknown | Advanced | 2h | Core | ↗ ⚠️
PAP-004 | Chain-of-Thought Prompting | Paper | Unknown | Intermediate | 1h | Core | ↗ ⚠️
PAP-005 | Tree of Thoughts | Paper | Unknown | Advanced | 2h | Core | ↗ ⚠️
PAP-006 | Reflexion | Paper | Unknown | Advanced | 2h | Core | ↗ ⚠️
PAP-007 | RAG Survey | Paper | Unknown | Intermediate | 2h | Core | ↗ ⚠️
COU-001 | HuggingFace Agent Course | Course | HuggingFace | Intermediate | 8h | Core | ↗ ⚠️
COU-002 | MCP with Anthropic | Course | Anthropic | Intermediate | 4h | Core | ↗ ⚠️
COU-003 | Building Vector DBs with Pinecone | Course | Pinecone | Intermediate | 5h | Core | ↗ ⚠️
COU-004 | Vector DBs from Embeddings to Apps | Course | Unknown | Intermediate | 5h | Core | ↗ ⚠️
COU-005 | Agent Memory | Course | Unknown | Intermediate | 3h | Core | ↗ ⚠️
COU-006 | Building and Evaluating RAG Apps | Course | Unknown | Advanced | 5h | Core | ↗ ⚠️
COU-007 | Building Browser Agents | Course | Unknown | Advanced | 4h | Optional | ↗ ⚠️
COU-008 | LLMOps | Course | Unknown | Advanced | 6h | Core | ↗ ⚠️
COU-009 | Evaluating AI Agents | Course | Unknown | Advanced | 4h | Core | ↗ ⚠️
COU-010 | Computer Use with Anthropic | Course | Anthropic | Advanced | 4h | Optional | ↗ ⚠️
COU-011 | Multi-Agent Use | Course | Unknown | Advanced | 4h | Core | ↗ ⚠️
COU-012 | Improving LLM Accuracy | Course | Unknown | Intermediate | 4h | Core | ↗ ⚠️
COU-013 | Agent Design Patterns | Course | Unknown | Advanced | 5h | Core | ↗ ⚠️
COU-014 | Multi Agent Systems | Course | Unknown | Advanced | 4h | Core | ↗ ⚠️
NEW-001 | Gradient Ascent | Newsletter | Unknown | — | — | Watchlist | ↗ ⚠️
NEW-002 | DecodingML by Paul | Newsletter | Paul | — | — | Watchlist | ↗ ⚠️
NEW-003 | Deep (Learning) Focus by Cameron | Newsletter | Cameron | — | — | Watchlist | ↗ ⚠️
NEW-004 | NeoSage by Shivani | Newsletter | Shivani | — | — | Watchlist | ↗
NEW-005 | Jam with AI | Newsletter | Shirin & Shantanu | — | — | Watchlist | ↗ ⚠️
NEW-006 | Data Hustle by Sai | Newsletter | Sai | — | — | Watchlist | ↗ ⚠️

⚠️ = Short link pending full URL verification. Resolve in next patch update.


Appendix B — Full Glossary

Agent

An AI system that uses an LLM to dynamically determine a sequence of actions, often utilizing external tools to achieve a goal.

Chain-of-Thought (CoT)

A prompting technique that instructs the model to generate intermediate reasoning steps before arriving at a final answer, significantly improving logic performance.

Context Window

The maximum number of tokens an LLM can process in a single prompt-response interaction.

Embeddings

High-dimensional vector representations of text. Text with similar semantic meaning will have vectors located close together in space.

Fine-tuning

Taking a pre-trained base model and training it further on a smaller, specific dataset to specialize its behavior or improve instruction adherence.

Hallucination

When an LLM generates a response that sounds plausible but is factually incorrect or unsupported by its training data or context.

LLM

A massive neural network trained on vast text corpora to predict next-token probabilities.

LLMOps

The practices and tools used to deploy, manage, evaluate, and scale LLM applications in production reliably.

MCP (Model Context Protocol)

An open standard protocol designed to securely connect AI models with external data sources and tools.

Prompt Engineering

The iterative process of structuring text input to effectively communicate with and guide the outputs of generative models.

Prompt Injection

A security vulnerability where malicious input overrides system prompt instructions, causing the model to execute unintended behaviors.

RAG

Retrieval-Augmented Generation — grounding an LLM on external knowledge retrieved dynamically from a database to reduce hallucinations and access real-time data.

ReAct

A foundational agent architecture that iteratively combines "Reasoning" (deciding what to do) and "Acting" (using a tool).

Reflexion

A framework where an agent evaluates its own past actions and outcomes, generating verbal reinforcement to improve future attempts.

Tool Use / Function Calling

The ability of an LLM to output structured data specifying a function name and arguments, allowing the system to trigger external software.

Tree of Thoughts (ToT)

An advanced reasoning technique extending CoT by allowing the model to explore multiple reasoning paths concurrently, evaluating and pruning them.

Vector Database

A specialized database optimized for storing, managing, and performing similarity searches on embedding vectors.


Appendix C — Changelog

v1.0.0

Initial release. Defined 6 core learning chapters. Established Tag Taxonomy. Imported 50+ resources into catalog. Short links pending full metadata resolution.

◆ Template for future entries
vX.X.X — [Date]
- [Added/Removed/Updated] [ID] — [Reason]
- [Structural changes, if any]

Appendix D — Editor's Protocol

Adding a Resource

  1. Ensure the resource fits exactly one primary Tag from the taxonomy
  2. Generate a sequential stable ID (e.g., if last repo is REP-012, new is REP-013)
  3. Validate the URL resolves and note paywall status
  4. Create the Per-Resource Entry Block and assign to the correct Chapter
  5. Add a row to Appendix A (Master Source Catalog)
  6. Log the addition in Appendix C with a patch version bump (v1.0.x)

Deprecating a Resource

  1. Do not delete from Appendix A — change Status to Deprecated
  2. Add a note pointing to the replacement ID (e.g., "Replaced by PAP-008")
  3. Remove the Entry Block from the active Chapter body
  4. Log in Appendix C

Versioning Principles

Version Bump | Trigger
v2.0.0 | Major structural curriculum shift
v1.1.0 | Adding or removing Core resources
v1.0.1 | Fixing links, typos, metadata

Appendix E — Style Guide & Tag Taxonomy

Tag Taxonomy

Use only these exact strings:

foundations · prompting · agents · RAG · evaluation · MCP · embeddings · vector-database · LLMOps · safety · multi-agent · fine-tuning · theory · beginner-friendly · architecture · tools

Callout Markers

Symbol | Meaning | Color
◆ | Definition | Blue
■ | Exercise | Green
▲ | Pitfall | Amber
✓ | Checklist | Purple