
Course Outline

Introduction and Diagnostic Foundations

  • Overview of failure modes in LLM systems and common Ollama-specific issues.
  • Establishing reproducible experiments and controlled environments.
  • Debugging tools: local logs, request/response captures, and sandboxing.

Reproducing and Isolating Failures

  • Techniques for creating minimal failing examples and seeds.
  • Stateful vs. stateless interactions: isolating context-related bugs.
  • Managing determinism and randomness to control nondeterministic behavior.

Behavioral Evaluation and Metrics

  • Quantitative metrics: accuracy, ROUGE/BLEU variants, calibration, and perplexity proxies.
  • Qualitative evaluations: human-in-the-loop scoring and rubric design.
  • Task-specific fidelity checks and acceptance criteria.

Automated Testing and Regression

  • Unit tests for prompts and components; scenario and end-to-end tests.
  • Creating regression suites and golden example baselines.
  • CI/CD integration for Ollama model updates and automated validation gates.

Observability and Monitoring

  • Structured logging, distributed traces, and correlation IDs.
  • Key operational metrics: latency, token usage, error rates, and quality signals.
  • Alerting, dashboards, and SLIs/SLOs for model-backed services.

Advanced Root Cause Analysis

  • Tracing through prompt graphs, tool calls, and multi-turn flows.
  • Comparative A/B diagnosis and ablation studies.
  • Data provenance, dataset debugging, and addressing dataset-induced failures.

Safety, Robustness, and Remediation Strategies

  • Mitigations: filtering, grounding, retrieval augmentation, and prompt scaffolding.
  • Rollback, canary, and phased rollout patterns for model updates.
  • Post-mortems, lessons learned, and continuous improvement loops.

Summary and Next Steps

Requirements

  • Extensive experience in building and deploying LLM applications.
  • Familiarity with Ollama workflows and model hosting.
  • Proficiency with Python, Docker, and basic observability tools.

Audience

  • AI engineers.
  • MLOps professionals.
  • QA teams responsible for production LLM systems.

Duration: 35 hours
