Course Outline

Introduction and Diagnostic Foundations

  • Overview of failure modes in LLM systems and common Ollama-specific issues
  • Establishing reproducible experiments and controlled environments
  • Debugging toolset: local logs, request/response captures, and sandboxing (see the capture sketch after this list)
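
As a concrete starting point for the capture item above, the sketch below logs each request/response pair from Ollama's local HTTP API to a JSONL file for later replay. The model tag, log path, and helper names are illustrative assumptions.

```python
# Sketch: capture Ollama request/response pairs to a JSONL file for replay.
# Assumes a local Ollama server on the default port and an already-pulled model tag.
import json
import time
import uuid

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
CAPTURE_LOG = "captures.jsonl"  # illustrative path


def generate_and_capture(model: str, prompt: str) -> str:
    """Send a non-streaming generate request and log the full exchange."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    started = time.time()
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    body = resp.json()

    record = {
        "id": str(uuid.uuid4()),                      # correlation id for later debugging
        "latency_s": round(time.time() - started, 3),
        "request": payload,
        "response": body.get("response", ""),
    }
    with open(CAPTURE_LOG, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record["response"]


if __name__ == "__main__":
    print(generate_and_capture("llama3", "Summarise what a golden test is."))
```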

Reproducing and Isolating Failures

  • Techniques for creating minimal failing examples and seeds
  • Stateful vs stateless interactions: isolating context-related bugs
  • Determinism, randomness, and controlling nondeterministic behavior (see the reproducibility sketch below)
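
The reproducibility sketch below pins the sampling options Ollama exposes (temperature and seed) and reruns the same prompt several times to check whether the output is stable; the model tag, prompt, and run count are placeholders.

```python
# Sketch: check whether pinned sampling parameters make reruns reproducible.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"


def run_once(model: str, prompt: str) -> str:
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0, "seed": 42},  # pin sampling
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json().get("response", "")


if __name__ == "__main__":
    outputs = {run_once("llama3", "Name three causes of LLM hallucination.") for _ in range(3)}
    if len(outputs) == 1:
        print("Deterministic: all runs identical.")
    else:
        print(f"Nondeterministic: {len(outputs)} distinct outputs.")
```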

Behavioral Evaluation and Metrics

  • Quantitative metrics: accuracy, ROUGE/BLEU variants, calibration, and perplexity proxies (a simple overlap metric is sketched after this list)
  • Qualitative evaluations: human-in-the-loop scoring and rubric design
  • Task-specific fidelity checks and acceptance criteria
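
To make the quantitative-metrics item concrete, the sketch below computes a token-level F1 overlap between a model answer and a reference, a rough stand-in for ROUGE-style scoring; the acceptance threshold mentioned in the final comment is an arbitrary example.

```python
# Sketch: token-level F1 overlap as a rough stand-in for ROUGE-style metrics.
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"\w+", text.lower())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("Paris is the capital of France.",
                   "The capital of France is Paris."))  # 1.0 here; flag answers below, say, 0.6
```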

Automated Testing and Regression

  • Unit tests for prompts and components; scenario and end-to-end tests
  • Creating regression suites and golden example baselines (see the test sketch below)
  • CI/CD integration for Ollama model updates and automated validation gates
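
A minimal golden-baseline regression test might look like the pytest sketch below. It assumes a goldens.jsonl file of prompt/expected pairs, a local Ollama server, and the token_f1 helper from the metrics sketch saved as metrics.py; the 0.7 threshold is illustrative.

```python
# Sketch: pytest regression suite against a golden baseline file.
import json
import pathlib

import pytest
import requests

from metrics import token_f1  # assumed module holding the earlier metric sketch

OLLAMA_URL = "http://localhost:11434/api/generate"
GOLDEN_FILE = pathlib.Path("goldens.jsonl")  # {"prompt": ..., "expected": ...} per line


def generate(prompt: str) -> str:
    payload = {"model": "llama3", "prompt": prompt, "stream": False,
               "options": {"temperature": 0, "seed": 42}}
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json().get("response", "")


def load_goldens():
    return [json.loads(line) for line in GOLDEN_FILE.read_text().splitlines() if line.strip()]


@pytest.mark.parametrize("case", load_goldens())
def test_matches_golden(case):
    answer = generate(case["prompt"])
    assert token_f1(answer, case["expected"]) >= 0.7, answer  # illustrative gate
```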

Observability and Monitoring

  • Structured logging, distributed traces, and correlation IDs (see the logging sketch after this list)
  • Key operational metrics: latency, token usage, error rates, and quality signals
  • Alerting, dashboards, and SLIs/SLOs for model-backed services
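
The logging sketch below wraps a model call in structured JSON logging with a correlation ID and basic operational metrics, using the token-count fields Ollama reports in its generate response; the logger configuration is illustrative.

```python
# Sketch: one structured JSON log line per model call, with a correlation id
# and the token counts Ollama reports (prompt_eval_count, eval_count).
import json
import logging
import time
import uuid

import requests

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("llm")

OLLAMA_URL = "http://localhost:11434/api/generate"


def generate_logged(model: str, prompt: str) -> str:
    correlation_id = str(uuid.uuid4())
    started = time.time()
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    body = resp.json()
    log.info(json.dumps({
        "event": "llm_generate",
        "correlation_id": correlation_id,
        "model": model,
        "latency_s": round(time.time() - started, 3),
        "prompt_tokens": body.get("prompt_eval_count"),
        "completion_tokens": body.get("eval_count"),
    }))
    return body.get("response", "")
```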

Advanced Root Cause Analysis

  • Tracing through graphed prompts, tool calls, and multi-turn flows
  • Comparative A/B diagnosis and ablation studies (see the ablation sketch below)
  • Data provenance, dataset debugging, and addressing dataset-induced failures
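
The ablation sketch below drops one prompt component at a time and re-scores a small eval set to localise which component a failure depends on; the component names, eval set, and the injected generate/score helpers are assumptions carried over from the earlier sketches.

```python
# Sketch: drop one prompt component at a time and re-score a small eval set.
# generate(prompt) -> str and score(pred, ref) -> float are injected, e.g. the
# Ollama call and token_f1 from the earlier sketches; everything else is a placeholder.
COMPONENTS = {
    "system": "You are a concise technical assistant.",
    "few_shot": "Q: What is RAG?\nA: Retrieval-augmented generation.",
    "guardrail": "If unsure, answer 'I don't know'.",
}

EVAL_SET = [
    {"prompt": "What does CI/CD stand for?",
     "expected": "continuous integration and continuous delivery"},
]


def build_prompt(question: str, drop: str | None = None) -> str:
    parts = [text for name, text in COMPONENTS.items() if name != drop]
    return "\n\n".join(parts + [f"Q: {question}\nA:"])


def ablate(generate, score) -> dict[str, float]:
    results = {}
    for drop in [None] + list(COMPONENTS):
        total = sum(score(generate(build_prompt(c["prompt"], drop)), c["expected"])
                    for c in EVAL_SET)
        results[drop or "full"] = total / len(EVAL_SET)
    # A large drop relative to "full" points at the component the failure depends on.
    return results
```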

Safety, Robustness, and Remediation Strategies

  • Mitigations: filtering, grounding, retrieval augmentation, and prompt scaffolding
  • Rollback, canary, and phased rollout patterns for model updates (see the canary sketch below)
  • Post-mortems, lessons learned, and continuous improvement loops
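
The canary sketch below routes a small, deterministic fraction of users to a candidate model tag while the rest stay on the stable tag, so both can be compared on live metrics before a full cutover; the tags and the 5% split are illustrative.

```python
# Sketch: deterministic canary routing between a stable and a candidate model tag.
import hashlib

STABLE_MODEL = "llama3:prod"        # illustrative stable tag
CANARY_MODEL = "llama3:candidate"   # illustrative candidate tag
CANARY_PERCENT = 5                  # share of users sent to the canary


def pick_model(user_id: str) -> str:
    """Hash the user id into a 0-99 bucket so each user sticks to one variant."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return CANARY_MODEL if bucket < CANARY_PERCENT else STABLE_MODEL


if __name__ == "__main__":
    for uid in ("alice", "bob", "carol"):
        print(uid, "->", pick_model(uid))
```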

Summary and Next Steps

Requirements

  • Strong experience building and deploying LLM applications
  • Familiarity with Ollama workflows and model hosting
  • Comfort with Python, Docker, and basic observability tooling

Audience

  • AI engineers
  • ML Ops professionals
  • QA teams responsible for production LLM systems
Duration: 35 hours
